Compare commits

8 Commits

| Author | SHA1 | Date |
|---|---|---|
|  | 3a6f79c9a8 |  |
|  | 60f18f8e71 |  |
|  | 47f11b3d95 |  |
|  | f24be8ff53 |  |
|  | dfff432321 |  |
|  | d59bf02d08 |  |
|  | 345a5888d9 |  |
|  | d6a5771e01 |  |
```diff
@@ -102,3 +102,10 @@ TEST_HASS_HOST=http://localhost:8123
 TEST_HASS_TOKEN=test_token
 TEST_HASS_SOCKET_URL=ws://localhost:8123/api/websocket
 TEST_PORT=3001
+
+# Speech Features Configuration
+ENABLE_SPEECH_FEATURES=false
+ENABLE_WAKE_WORD=true
+ENABLE_SPEECH_TO_TEXT=true
+WHISPER_MODEL_PATH=/models
+WHISPER_MODEL_TYPE=base
```
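The flags added in the hunk above are plain environment variables. A minimal sketch of how a service might parse them (only the variable names come from the diff; the function and key names are illustrative, not the project's actual code):

```python
import os

def load_speech_config(env=None):
    """Parse the speech-feature variables added to the env file above."""
    env = os.environ if env is None else env

    def flag(name, default="false"):
        # Accept the usual truthy spellings for boolean env vars.
        return env.get(name, default).strip().lower() in ("1", "true", "yes")

    return {
        "speech_features": flag("ENABLE_SPEECH_FEATURES"),
        "wake_word": flag("ENABLE_WAKE_WORD"),
        "speech_to_text": flag("ENABLE_SPEECH_TO_TEXT"),
        "whisper_model_path": env.get("WHISPER_MODEL_PATH", "/models"),
        "whisper_model_type": env.get("WHISPER_MODEL_TYPE", "base"),
    }

cfg = load_speech_config({"ENABLE_SPEECH_FEATURES": "false", "ENABLE_WAKE_WORD": "true"})
```

Note the default `ENABLE_SPEECH_FEATURES=false`: speech support ships disabled, so the flag must be flipped explicitly before the wake-word container does anything.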
381 README.md

@@ -1,231 +1,288 @@
-# Model Context Protocol (MCP) Server for Home Assistant
-
-The Model Context Protocol (MCP) Server is a robust, secure, and high-performance bridge that integrates Home Assistant with Language Learning Models (LLMs), enabling natural language control and real-time monitoring of your smart home devices. Unlock advanced automation, control, and analytics for your Home Assistant ecosystem.
-
-## Table of Contents
-
-- [Overview](#overview)
-- [Key Features](#key-features)
-- [Architecture & Design](#architecture--design)
-- [Installation](#installation)
-  - [Basic Setup](#basic-setup)
-  - [Docker Setup (Recommended)](#docker-setup-recommended)
-- [Usage](#usage)
-- [API & Documentation](#api--documentation)
-- [Development](#development)
-- [Roadmap & Future Plans](#roadmap--future-plans)
-- [Community & Support](#community--support)
-- [Contributing](#contributing)
-- [Troubleshooting & FAQ](#troubleshooting--faq)
-- [License](#license)
-
-## Overview
-
-The MCP Server bridges Home Assistant with advanced LLM integrations to deliver intuitive control, automation, and state monitoring. Leveraging a high-performance runtime and real-time communication protocols, MCP offers a seamless experience for managing your smart home.
-
-## Key Features
-
-### Device Control & Monitoring
-
-- **Smart Device Control:** Manage lights, climate, covers, switches, sensors, media players, fans, locks, vacuums, and cameras using natural language commands.
-- **Real-time Updates:** Receive instant notifications and updates via Server-Sent Events (SSE).
-
-### System & Automation Management
-
-- **Automation Engine:** Create, modify, and trigger custom automation rules with ease.
-- **Add-on & Package Management:** Integrates with HACS for deploying custom integrations, themes, scripts, and applications.
-- **Robust System Management:** Features advanced state monitoring, error handling, and security safeguards.
-
-## Architecture & Design
-
-The MCP Server is built with scalability, resilience, and security in mind:
-
-- **High-Performance Runtime:** Powered by Bun for fast startup, efficient memory utilization, and native TypeScript support.
-- **Real-time Communication:** Employs Server-Sent Events (SSE) for continuous, real-time data updates.
-- **Modular & Extensible:** Designed to support plugins, add-ons, and custom automation scripts, allowing for easy expansion.
-- **Secure API Integration:** Implements token-based authentication, rate limiting, and adherence to best security practices.
-
-For a deeper dive into the system architecture, please refer to our [Architecture Documentation](docs/architecture.md).
-
-## Installation
-
-### Basic Setup
+# 🚀 MCP Server for Home Assistant - Bringing AI-Powered Smart Homes to Life!
+
+[](LICENSE)
+[](https://bun.sh)
+[](https://www.typescriptlang.org)
+[](#)
+[](https://jango-blockchained.github.io/homeassistant-mcp/)
+[](https://www.docker.com)
+
+---
+
+## Overview 🌐
+
+Welcome to the **Model Context Protocol (MCP) Server for Home Assistant**! This robust platform bridges Home Assistant with cutting-edge Language Learning Models (LLMs), enabling natural language interactions and real-time automation of your smart devices. Imagine entering your home, saying:
+
+> "Hey MCP, dim the lights and start my evening playlist,"
+
+and watching your home transform instantly—that's the magic that MCP Server delivers!
+
+---
+
+## Key Benefits ✨
+
+### 🎮 Device Control & Monitoring
+
+- **Voice-Controlled Automation:**
+  Use simple commands like "Turn on the kitchen lights" or "Set the thermostat to 22°C" without touching a switch.
+  **Real-World Example:**
+  In the morning, say "Good morning! Open the blinds and start the coffee machine" to kickstart your day automatically.
+
+- **Real-Time Communication:**
+  Experience sub-100ms latency updates via Server-Sent Events (SSE) or WebSocket connections, ensuring your dashboard is always current.
+  **Real-World Example:**
+  Monitor energy usage instantly during peak hours and adjust remotely for efficient consumption.
+
+- **Seamless Automation:**
+  Create scene-based rules to synchronize multiple devices effortlessly.
+  **Real-World Example:**
+  For movie nights, have MCP dim the lights, adjust the sound system, and launch your favorite streaming app with just one command.
+
+### 🤖 AI-Powered Enhancements
+
+- **Natural Language Processing (NLP):**
+  Convert everyday speech into actionable commands—just say, "Prepare the house for dinner," and MCP will adjust lighting, temperature, and even play soft background music.
+
+- **Predictive Automation & Suggestions:**
+  Receive proactive recommendations based on usage habits and environmental trends.
+  **Real-World Example:**
+  When home temperature fluctuates unexpectedly, MCP suggests an optimal setting and notifies you immediately.
+
+- **Anomaly Detection:**
+  Continuously monitor device activity and alert you to unusual behavior, helping prevent malfunctions or potential security breaches.
+
+---
+
+## Architectural Overview 🏗
+
+Our architecture is engineered for performance, scalability, and security. The following Mermaid diagram illustrates the data flow and component interactions:
+
+```mermaid
+graph TD
+    subgraph Client
+        A[Client Application<br/>(Web / Mobile / Voice)]
+    end
+    subgraph CDN
+        B[CDN / Cache]
+    end
+    subgraph Server
+        C[Bun Native Server]
+        E[NLP Engine<br/>& Language Processing Module]
+    end
+    subgraph Integration
+        D[Home Assistant<br/>(Devices, Lights, Thermostats)]
+    end
+
+    A -->|HTTP Request| B
+    B -- Cache Miss --> C
+    C -->|Interpret Command| E
+    E -->|Determine Action| D
+    D -->|Return State/Action| C
+    C -->|Response| B
+    B -->|Cached/Processed Response| A
+```
+
+Learn more about our architecture in the [Architecture Documentation](docs/architecture.md).
+
+---
+## Technical Stack 🔧
+
+Our solution is built on a modern, high-performance stack that powers every feature:
+
+- **Bun:**
+  A next-generation JavaScript runtime offering rapid startup times, native TypeScript support, and high performance.
+  👉 [Learn about Bun](https://bun.sh)
+
+- **Bun Native Server:**
+  Utilizes Bun's built-in HTTP server to efficiently process API requests with sub-100ms response times.
+  👉 See the [Installation Guide](docs/getting-started/installation.md) for details.
+
+- **Natural Language Processing (NLP) & LLM Integration:**
+  Processes and interprets natural language commands using state-of-the-art LLMs and custom NLP modules.
+  👉 Find API usage details in the [API Documentation](docs/api.md).
+
+- **Home Assistant Integration:**
+  Provides seamless connectivity with Home Assistant, ensuring reliable communication with your smart devices.
+  👉 Refer to the [Usage Guide](docs/usage.md) for more information.
+
+- **Redis Cache:**
+  Enables rapid data retrieval and session persistence essential for real-time updates.
+
+- **TypeScript:**
+  Enhances type safety and developer productivity across the entire codebase.
+
+- **JWT & Security Middleware:**
+  Protects your ecosystem with JWT-based authentication, request sanitization, rate limiting, and encryption.
+
+- **Containerization with Docker:**
+  Enables scalable, isolated deployments for production environments.
+
+For further technical details, check out our [Documentation Index](docs/index.md).
+
+---
+
+## Installation 🛠
+
+### 🐳 Docker Setup (Recommended)
+
+For a hassle-free, containerized deployment:
+
-1. **Install Bun:** If Bun is not installed:
+```bash
+# 1. Clone the repository (using a shallow copy for efficiency)
+git clone --depth 1 https://github.com/jango-blockchained/homeassistant-mcp.git
+
+# 2. Configure your environment: copy the example file and edit it with your Home Assistant credentials
+cp .env.example .env  # Modify .env with your Home Assistant host, tokens, etc.
+
+# 3. Build and run the Docker containers
+docker compose up -d --build
+
+# 4. View real-time logs (last 50 log entries)
+docker compose logs -f --tail=50
+```
+
+👉 Refer to our [Installation Guide](docs/getting-started/installation.md) for full details.
+### 💻 Bare Metal Installation
+
+For direct deployment on your host machine:
+
+```bash
+# 1. Install Bun (if not already installed)
+curl -fsSL https://bun.sh/install | bash
+
+# 2. Install project dependencies with caching support
+bun install --frozen-lockfile
+
+# 3. Launch the server in development mode with hot-reload enabled
+bun run dev --watch
+```
+
-2. **Clone the Repository:**
-   ```bash
-   git clone https://github.com/jango-blockchained/homeassistant-mcp.git
-   cd homeassistant-mcp
-   ```
-3. **Install Dependencies:**
-   ```bash
-   bun install
-   ```
-4. **Build the Project:**
-   ```bash
-   bun run build
-   ```
-
-### Docker Setup (Recommended)
-
-1. **Clone the Repository:**
-   ```bash
-   git clone https://github.com/jango-blockchained/homeassistant-mcp.git
-   cd homeassistant-mcp
-   ```
-2. **Configure Environment:**
-   ```bash
-   cp .env.example .env
-   ```
-   Customize the `.env` file with your Home Assistant configuration.
-3. **Deploy with Docker Compose:**
-   ```bash
-   docker compose up -d
-   ```
-   - View logs: `docker compose logs -f`
-   - Stop the server: `docker compose down`
-4. **Update the Application:**
-   ```bash
-   git pull && docker compose up -d --build
-   ```
+
+---
+
+## Real-World Usage Examples 🔍
+
+### 📱 Smart Home Dashboard Integration
+
+Integrate MCP's real-time updates into your custom dashboard for a dynamic smart home experience:
-## Usage
-
-Once the server is running, open your browser at [http://localhost:3000](http://localhost:3000). For real-time device updates, integrate the SSE endpoint in your application:
-
 ```javascript
 const eventSource = new EventSource('http://localhost:3000/subscribe_events?token=YOUR_TOKEN&domain=light');
 
 eventSource.onmessage = (event) => {
   const data = JSON.parse(event.data);
-  console.log('Update received:', data);
+  console.log('Real-time update:', data);
+  // Update your UI dashboard, e.g., refresh a light intensity indicator.
 };
 ```
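The `EventSource` snippet above hides the wire format: an SSE stream is just text lines where each event is one or more `data:` lines terminated by a blank line. A stdlib sketch of that parsing step (the parser and sample payload are illustrative; only the endpoint shape comes from the README):

```python
import json

def parse_sse(stream_lines):
    """Minimal Server-Sent Events parser: yields the JSON payload of each
    `data:` event. This is roughly what EventSource does under the hood."""
    buffer = []
    for line in stream_lines:
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].strip())
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield json.loads("\n".join(buffer))
            buffer = []

events = list(parse_sse([
    'data: {"entity_id": "light.living_room", "state": "on"}',
    '',
]))
```

Feeding the parser the raw lines of `/subscribe_events` would yield one decoded state-change dict per event, ready to drive a dashboard widget.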
-## API & Documentation
-
-Access comprehensive API details and guides in the docs directory:
-
-- **API Reference:** [API Documentation](docs/api.md)
-- **SSE Documentation:** [SSE API](docs/sse-api.md)
-- **Troubleshooting Guide:** [Troubleshooting](docs/troubleshooting.md)
-- **Architecture Details:** [Architecture Documentation](docs/architecture.md)
-
-## Development
-
-### Running in Development Mode
-
-```bash
-bun run dev
-```
-
-### Running Tests
-
-- Execute all tests:
-  ```bash
-  bun test
-  ```
-- Run tests with coverage:
-  ```bash
-  bun test --coverage
-  ```
-
-### Production Build & Start
-
-```bash
-bun run build
-bun start
-```
+### 🏠 Voice-Activated Control
+
+Utilize voice commands to trigger actions with minimal effort:
+
+```javascript
+// Establish a WebSocket connection for real-time command processing
+const ws = new WebSocket('wss://mcp.yourha.com/ws');
+
+ws.onmessage = ({ data }) => {
+  const update = JSON.parse(data);
+  if (update.entity_id === 'light.living_room') {
+    console.log('Adjusting living room lighting based on voice command...');
+    // Additional logic to update your UI or trigger further actions can go here.
+  }
+};
+
+// Simulate processing a voice command
+function simulateVoiceCommand(command) {
+  console.log("Processing voice command:", command);
+  // Integrate with your actual voice-to-text system as needed.
+}
+
+simulateVoiceCommand("Turn off all the lights for bedtime");
+```
+
+👉 Learn more in our [Usage Guide](docs/usage.md).
+
+---
+
+## Update Strategy 🔄
+
+Maintain seamless operation with zero-downtime updates:
+
+```bash
+# 1. Pull the latest Docker images
+docker compose pull
+
+# 2. Rebuild and restart containers smoothly
+docker compose up -d --build
+
+# 3. Clean up unused Docker images to free up space
+docker system prune -f
+```
-## Roadmap & Future Plans
-
-The MCP Server is under active development and improvement. Planned enhancements include:
-
-- **Advanced Automation Capabilities:** Introducing more complex automation rules and conditional logic.
-- **Enhanced Security Features:** Additional authentication layers, encryption enhancements, and security monitoring tools.
-- **User Interface Improvements:** Development of a more intuitive web dashboard for easier device management.
-- **Expanded Integrations:** Support for a wider array of smart home devices and third-party services.
-- **Performance Optimizations:** Continued efforts to reduce latency and improve resource efficiency.
-
-For additional details, check out our [Roadmap](docs/roadmap.md).
+For more details, review our [Troubleshooting & Updates](docs/troubleshooting.md).
+
+---
+
+## Security Features 🔐
+
+We prioritize the security of your smart home with multiple layers of defense:
+
+- **JWT Authentication 🔑:** Secure, token-based API access to prevent unauthorized usage.
+- **Request Sanitization 🧼:** Automatic filtering and validation of API requests to combat injection attacks.
+- **Rate Limiting & Fail2Ban 🚫:** Monitors requests to prevent brute-force and DDoS attacks.
+- **End-to-End Encryption 🔒:** Ensures that your commands and data remain private during transmission.
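The JWT bullet above boils down to HMAC-signed `header.payload.signature` tokens. A stdlib sketch of the sign/verify round trip (illustrative only, not the server's actual middleware; the secret and claim names are made up):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(claims: dict, secret: bytes) -> str:
    """HS256-style token: base64url(header).base64url(payload).base64url(sig)."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"

def verify_token(token: str, secret: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        return False
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)

token = sign_token({"sub": "dashboard"}, b"demo-secret")
```

A tampered payload or a wrong secret changes the recomputed signature, so `verify_token` rejects the token without needing to decode it first.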
-## Community & Support
-
-Join our community to stay updated, share ideas, and get help:
-
-- **GitHub Issues:** Report bugs or suggest features on our [GitHub Issues Page](https://github.com/jango-blockchained/homeassistant-mcp/issues).
-- **Discussion Forums:** Connect with other users and contributors in our community forums.
-- **Chat Platforms:** Join our real-time discussions on [Discord](#) or [Slack](#).
-
-## Contributing
-
-We welcome your contributions! To get started:
-
-1. Fork the repository.
-2. Create your feature branch:
+---
+
+## Contributing 🤝
+
+We value community contributions! Here's how you can help improve MCP Server:
+
+1. **Fork the Repository 🍴**
+   Create your own copy of the project.
+2. **Create a Feature Branch 🌿**
 ```bash
 git checkout -b feature/your-feature-name
 ```
-3. Install dependencies:
+3. **Install Dependencies & Run Tests 🧪**
 ```bash
 bun install
+bun test --coverage
 ```
-4. Make your changes and run tests:
-```bash
-bun test
-```
-5. Commit and push your changes, then open a Pull Request.
+4. **Make Your Changes & Commit 📝**
+   Follow the [Conventional Commits](https://www.conventionalcommits.org) guidelines.
+5. **Open a Pull Request 🔀**
+   Submit your changes for review.
 
-For detailed guidelines, see [Contributing Guide](docs/contributing.md).
+Read more in our [Contribution Guidelines](docs/contributing.md).
 
-## Troubleshooting & FAQ
-
-### Common Issues
-
-- **Connection Problems:** Ensure that your `HASS_HOST`, authentication token, and WebSocket URL are correctly configured.
-- **Docker Deployment:** Confirm that Docker is running and that your `.env` file contains the correct settings.
-- **Automation Errors:** Verify entity availability and review your automation configurations for potential issues.
-
-For more troubleshooting details, refer to [Troubleshooting Guide](docs/troubleshooting.md).
-
-### Frequently Asked Questions
-
-**Q: What platforms does MCP Server support?**
-
-A: MCP Server runs on Linux, macOS, and Windows (Docker is recommended for Windows environments).
-
-**Q: How do I report a bug or request a feature?**
-
-A: Please use our [GitHub Issues Page](https://github.com/jango-blockchained/homeassistant-mcp/issues) to report bugs or request new features.
-
-**Q: Can I contribute to the project?**
-
-A: Absolutely! We welcome contributions from the community. See the [Contributing](#contributing) section for more details.
-
-## License
-
-This project is licensed under the MIT License. See [LICENSE](LICENSE) for the full license text.
-
-## Documentation
-
-Full documentation is available at: [https://jango-blockchained.github.io/homeassistant-mcp/](https://jango-blockchained.github.io/homeassistant-mcp/)
-
-## Quick Start
-
-## Installation
-
-## Usage
+---
+
+## Roadmap & Future Enhancements 🔮
+
+We're continuously evolving MCP Server. Upcoming features include:
+
+- **AI Assistant Integration (Q4 2024):**
+  Smarter, context-aware voice commands and personalized automation.
+- **Predictive Automation (Q1 2025):**
+  Enhanced scheduling capabilities powered by advanced AI.
+- **Enhanced Security (Q2 2024):**
+  Introduction of multi-factor authentication, advanced monitoring, and rigorous encryption methods.
+- **Performance Optimizations (Q3 2024):**
+  Reducing latency further, optimizing caching, and improving load balancing.
+
+For more details, see our [Roadmap](docs/roadmap.md).
+
+---
+
+## Community & Support 🌍
+
+Your feedback and collaboration are vital! Join our community:
+
+- **GitHub Issues:** Report bugs or request features via our [Issues Page](https://github.com/jango-blockchained/homeassistant-mcp/issues).
+- **Discord & Slack:** Connect with fellow users and developers in real-time.
+- **Documentation:** Find comprehensive guides on the [MCP Documentation Website](https://jango-blockchained.github.io/homeassistant-mcp/).
+
+---
+
+## License 📜
+
+This project is licensed under the MIT License. See [LICENSE](LICENSE) for full details.
+
+---
+
+🔋 Batteries included.
41 docker/speech/Dockerfile Normal file

@@ -0,0 +1,41 @@
+FROM python:3.10-slim
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    git \
+    build-essential \
+    portaudio19-dev \
+    python3-pyaudio \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install faster-whisper and its dependencies
+RUN pip install --no-cache-dir torch torchaudio --index-url https://download.pytorch.org/whl/cpu
+RUN pip install --no-cache-dir faster-whisper
+
+# Install wake word detection
+RUN pip install --no-cache-dir openwakeword pyaudio sounddevice
+
+# Create directories
+RUN mkdir -p /models /audio
+
+# Download the base model by default
+# The model will be downloaded automatically when first used
+ENV ASR_MODEL=base.en
+ENV ASR_MODEL_PATH=/models
+
+# Create wake word model directory
+# Models will be downloaded automatically when first used
+RUN mkdir -p /models/wake_word
+
+WORKDIR /app
+
+# Copy the wake word detection script
+COPY wake_word_detector.py .
+
+# Set environment variables
+ENV WHISPER_MODEL_PATH=/models
+ENV WAKEWORD_MODEL_PATH=/models/wake_word
+ENV PYTHONUNBUFFERED=1
+
+# Run the wake word detection service
+CMD ["python", "wake_word_detector.py"]
173 docker/speech/wake_word_detector.py Normal file

@@ -0,0 +1,173 @@
+import os
+import json
+import queue
+import threading
+import numpy as np
+import sounddevice as sd
+from openwakeword import Model
+from datetime import datetime
+import wave
+from faster_whisper import WhisperModel
+
+# Configuration
+SAMPLE_RATE = 16000
+CHANNELS = 1
+CHUNK_SIZE = 1024
+BUFFER_DURATION = 30  # seconds to keep in buffer
+DETECTION_THRESHOLD = 0.5
+
+# Wake word models to use
+WAKE_WORDS = ["hey_jarvis", "ok_google", "alexa"]
+
+# Initialize the ASR model
+asr_model = WhisperModel(
+    model_size_or_path=os.environ.get('ASR_MODEL', 'base.en'),
+    device="cpu",
+    compute_type="int8",
+    download_root=os.environ.get('ASR_MODEL_PATH', '/models')
+)
+
+class AudioProcessor:
+    def __init__(self):
+        # Initialize wake word detection model
+        self.wake_word_model = Model(
+            custom_model_paths=None,  # Use default models
+            inference_framework="onnx"  # Use ONNX for better performance
+        )
+
+        # Pre-load the wake word models
+        for wake_word in WAKE_WORDS:
+            self.wake_word_model.add_model(wake_word)
+
+        self.audio_buffer = queue.Queue()
+        self.recording = False
+        self.buffer = np.zeros(SAMPLE_RATE * BUFFER_DURATION)
+        self.buffer_lock = threading.Lock()
+
+    def audio_callback(self, indata, frames, time, status):
+        """Callback for audio input"""
+        if status:
+            print(f"Audio callback status: {status}")
+
+        # Convert to mono if necessary
+        if CHANNELS > 1:
+            audio_data = np.mean(indata, axis=1)
+        else:
+            audio_data = indata.flatten()
+
+        # Update circular buffer
+        with self.buffer_lock:
+            self.buffer = np.roll(self.buffer, -len(audio_data))
+            self.buffer[-len(audio_data):] = audio_data
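The `np.roll` pattern above keeps a rolling window of the last `BUFFER_DURATION` seconds of audio: new samples push the oldest ones out. The same idea in pure stdlib, as a sketch rather than the service's code, is a `collections.deque` with a fixed `maxlen`:

```python
from collections import deque

class RingBuffer:
    """Fixed-length rolling buffer: appending past capacity silently drops
    the oldest samples, like the np.roll window in the detector above."""

    def __init__(self, capacity):
        self._samples = deque(maxlen=capacity)

    def extend(self, chunk):
        self._samples.extend(chunk)

    def snapshot(self):
        return list(self._samples)

buf = RingBuffer(capacity=4)
buf.extend([1, 2, 3])
buf.extend([4, 5])  # capacity is 4, so the oldest sample (1) is dropped
```

`deque` handles the eviction itself, whereas the numpy version must shift the array and overwrite the tail; numpy wins for bulk slicing into the ASR model, which is presumably why the script uses it.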
+        # Process for wake word detection
+        prediction = self.wake_word_model.predict(audio_data)
+
+        # Check if wake word detected
+        for wake_word in WAKE_WORDS:
+            if prediction[wake_word] > DETECTION_THRESHOLD:
+                print(f"Wake word detected: {wake_word} (confidence: {prediction[wake_word]:.2f})")
+                # Pass the confidence along; prediction is not in scope in save_audio_segment
+                self.save_audio_segment(wake_word, prediction[wake_word])
+                break
+
+    def save_audio_segment(self, wake_word, confidence):
+        """Save the audio buffer when wake word is detected"""
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        filename = f"/audio/wake_word_{wake_word}_{timestamp}.wav"
+
+        # Save the audio buffer to a WAV file
+        with wave.open(filename, 'wb') as wf:
+            wf.setnchannels(CHANNELS)
+            wf.setsampwidth(2)  # 16-bit audio
+            wf.setframerate(SAMPLE_RATE)
+
+            # Convert float32 to int16
+            audio_data = (self.buffer * 32767).astype(np.int16)
+            wf.writeframes(audio_data.tobytes())
+
+        print(f"Saved audio segment to {filename}")
+
+        # Transcribe the audio
+        try:
+            segments, info = asr_model.transcribe(
+                filename,
+                language="en",
+                beam_size=5,
+                temperature=0
+            )
+            # transcribe returns a generator; materialize it so it can be iterated twice below
+            segments = list(segments)
+
+            # Format the transcription result
+            result = {
+                "text": " ".join(segment.text for segment in segments),
+                "segments": [
+                    {
+                        "text": segment.text,
+                        "start": segment.start,
+                        "end": segment.end,
+                        "confidence": segment.confidence
+                    }
+                    for segment in segments
+                ]
+            }
+
+            # Save metadata and transcription
+            metadata = {
+                "timestamp": timestamp,
+                "wake_word": wake_word,
+                "wake_word_confidence": float(confidence),
+                "sample_rate": SAMPLE_RATE,
+                "channels": CHANNELS,
+                "duration": BUFFER_DURATION,
+                "transcription": result
+            }
+
+            with open(f"{filename}.json", 'w') as f:
+                json.dump(metadata, f, indent=2)
+
+            print("\nTranscription result:")
+            print(f"Text: {result['text']}")
+            print("\nSegments:")
+            for segment in result["segments"]:
+                print(f"[{segment['start']:.2f}s - {segment['end']:.2f}s] ({segment['confidence']:.2%})")
+                print(f'"{segment["text"]}"')
+
+        except Exception as e:
+            print(f"Error during transcription: {e}")
+            metadata = {
+                "timestamp": timestamp,
+                "wake_word": wake_word,
+                "wake_word_confidence": float(confidence),
+                "sample_rate": SAMPLE_RATE,
+                "channels": CHANNELS,
+                "duration": BUFFER_DURATION,
+                "error": str(e)
+            }
+            with open(f"{filename}.json", 'w') as f:
+                json.dump(metadata, f, indent=2)
+
+    def start(self):
+        """Start audio processing"""
+        try:
+            print("Initializing wake word detection...")
+            print(f"Loaded wake words: {', '.join(WAKE_WORDS)}")
+
+            with sd.InputStream(
+                channels=CHANNELS,
+                samplerate=SAMPLE_RATE,
+                blocksize=CHUNK_SIZE,
+                callback=self.audio_callback
+            ):
+                print("\nWake word detection started. Listening...")
+                print("Press Ctrl+C to stop")
+
+                while True:
+                    sd.sleep(1000)  # Sleep for 1 second
+
+        except KeyboardInterrupt:
+            print("\nStopping wake word detection...")
+        except Exception as e:
+            print(f"Error in audio processing: {e}")
+
+if __name__ == "__main__":
+    processor = AudioProcessor()
+    processor.start()
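The float-to-int16 conversion and the `wave` calls in `save_audio_segment` can be exercised in isolation with the stdlib alone. A sketch (numpy replaced with plain lists and `struct`, writing to an in-memory buffer instead of `/audio`; everything here is illustrative):

```python
import io
import struct
import wave

def write_wav(samples, sample_rate=16000, channels=1):
    """Write float samples in [-1, 1] to an in-memory 16-bit WAV, mirroring
    the setnchannels/setsampwidth/setframerate calls in save_audio_segment."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 16-bit audio
        wf.setframerate(sample_rate)
        # Clamp then scale, as (buffer * 32767).astype(np.int16) does implicitly
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
        wf.writeframes(struct.pack(f"<{len(ints)}h", *ints))
    buf.seek(0)
    return buf

wav = wave.open(write_wav([0.0, 0.5, -0.5, 1.0]), 'rb')
```

Reading the result back confirms the header fields match what the detector sets, which is a cheap sanity check before feeding real captures to the ASR model.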
@@ -4,6 +4,9 @@ gem "github-pages", group: :jekyll_plugins
 gem "jekyll-theme-minimal"
 gem "jekyll-relative-links"
 gem "jekyll-seo-tag"
+gem "jekyll-remote-theme"
+gem "jekyll-github-metadata"
+gem "faraday-retry"
 
 # Windows and JRuby does not include zoneinfo files, so bundle the tzinfo-data gem
 # and associated library.
@@ -15,3 +18,6 @@ end
 # Lock `http_parser.rb` gem to `v0.6.x` on JRuby builds since newer versions of the gem
 # do not have a Java counterpart.
 gem "http_parser.rb", "~> 0.6.0", :platforms => [:jruby]
+
+# Add webrick for Ruby 3.0+
+gem "webrick", "~> 1.7"
@@ -1,20 +0,0 @@
-# Home Assistant MCP Documentation
-
-Welcome to the documentation for the Home Assistant MCP (Model Context Protocol) Server. Here you'll find comprehensive guides on setup, configuration, usage, and contribution.
-
-## Quick Navigation
-
-- [Getting Started Guide](getting-started.md)
-- [API Documentation](api.md)
-- [Troubleshooting](troubleshooting.md)
-- [Contributing Guide](contributing.md)
-
-## Repository Links
-
-- [GitHub Repository](https://github.com/jango-blockchained/homeassistant-mcp)
-- [Issue Tracker](https://github.com/jango-blockchained/homeassistant-mcp/issues)
-- [GitHub Discussions](https://github.com/jango-blockchained/homeassistant-mcp/discussions)
-
-## License
-
-This project is licensed under the MIT License. See [LICENSE](../LICENSE) for details.
@@ -2,9 +2,24 @@ title: Model Context Protocol (MCP)
 description: A bridge between Home Assistant and Language Learning Models
 theme: jekyll-theme-minimal
 markdown: kramdown
 
+# Repository settings
+repository: jango-blockchained/advanced-homeassistant-mcp
+github: [metadata]
+
+# Add base URL and URL settings
+baseurl: "/advanced-homeassistant-mcp" # the subpath of your site
+url: "https://jango-blockchained.github.io" # the base hostname & protocol
+
+# Theme settings
+logo: /assets/img/logo.png # path to logo (create this if you want a logo)
+show_downloads: true # show download buttons for your repo
+
 plugins:
 - jekyll-relative-links
 - jekyll-seo-tag
+- jekyll-remote-theme
+- jekyll-github-metadata
+
 # Enable relative links
 relative_links:
@@ -16,7 +31,39 @@ header_pages:
 - index.md
 - getting-started.md
 - api.md
+- usage.md
+- tools/tools.md
+- development/development.md
+- troubleshooting.md
 - contributing.md
+- roadmap.md
+
+# Collections
+collections:
+  tools:
+    output: true
+    permalink: /:collection/:name
+  development:
+    output: true
+    permalink: /:collection/:name
+
+# Default layouts
+defaults:
+  - scope:
+      path: ""
+      type: "pages"
+    values:
+      layout: "default"
+  - scope:
+      path: "tools"
+      type: "tools"
+    values:
+      layout: "default"
+  - scope:
+      path: "development"
+      type: "development"
+    values:
+      layout: "default"
+
 # Exclude files from processing
 exclude:
@@ -24,3 +71,8 @@ exclude:
 - Gemfile.lock
 - node_modules
 - vendor
+
+# Sass settings
+sass:
+  style: compressed
+  sass_dir: _sass
docs/api.md (191 changed lines)
@@ -1,6 +1,191 @@
-# API Documentation
+# 🚀 Home Assistant MCP API Documentation
 
-This section details the available API endpoints for the Home Assistant MCP Server.
+
 
+## 🌟 Quick Start
+
+```bash
+# Get API schema with caching
+curl -X GET http://localhost:3000/mcp \
+  -H "Cache-Control: max-age=3600" # Cache for 1 hour
+```
+
+## 🔌 Core Functions ⚙️
+
+### State Management (`/api/state`)
+```http
+GET /api/state?cache=true # Enable client-side caching
+POST /api/state
+```
+
+**Example Request:**
+```json
+{
+  "context": "living_room",
+  "state": {
+    "lights": "on",
+    "temperature": 22
+  },
+  "_cache": { // Optional caching config
+    "ttl": 300, // 5 minutes
+    "tags": ["lights", "climate"]
+  }
+}
+```
+
+## ⚡ Action Endpoints
+
+### Execute Action with Cache Validation
+```http
+POST /api/action
+If-None-Match: "etag_value" // Prevent duplicate actions
+```
+
+**Batch Processing:**
+```json
+{
+  "actions": [
+    { "action": "🌞 Morning Routine", "params": { "brightness": 80 } },
+    { "action": "❄️ AC Control", "params": { "temp": 21 } }
+  ],
+  "_parallel": true // Execute actions concurrently
+}
+```
+
+## 🔍 Query Functions
+
+### Available Actions with ETag
+```http
+GET /api/actions
+ETag: "a1b2c3d4" // Client-side cache validation
+```
+
+**Response Headers:**
+```
+Cache-Control: public, max-age=86400 // 24-hour cache
+ETag: "a1b2c3d4"
+```
+
+## 🌐 WebSocket Events
+
+```javascript
+const ws = new WebSocket('wss://ha-mcp/ws');
+ws.onmessage = ({ data }) => {
+  const event = JSON.parse(data);
+  if (event.type === 'STATE_UPDATE') {
+    updateUI(event.payload); // 🎨 Real-time UI sync
+  }
+};
+```
+
+## 🗃️ Caching Strategies
+
+### Client-Side Caching
+```http
+GET /api/devices
+Cache-Control: max-age=300, stale-while-revalidate=60
+```
+
+### Server-Side Cache-Control
+```typescript
+// Example middleware configuration
+app.use(
+  cacheMiddleware({
+    ttl: 60 * 5, // 5 minutes
+    paths: ['/api/devices', '/mcp'],
+    vary: ['Authorization'] // User-specific caching
+  })
+);
+```
+
+## ❌ Error Handling
+
+**429 Too Many Requests:**
+```json
+{
+  "error": {
+    "code": "RATE_LIMITED",
+    "message": "Slow down! 🐢",
+    "retry_after": 30,
+    "docs": "https://ha-mcp/docs/rate-limits"
+  }
+}
+```
+
+## 🚦 Rate Limiting Tiers
+
+| Tier        | Requests/min | Features        |
+|-------------|--------------|-----------------|
+| Guest       | 10           | Basic read-only |
+| User        | 100          | Full access     |
+| Power User  | 500          | Priority queue  |
+| Integration | 1000         | Bulk operations |
+
+## 🛠️ Example Usage
+
+### Smart Cache Refresh
+```javascript
+async function getDevices() {
+  const response = await fetch('/api/devices', {
+    headers: {
+      'If-None-Match': localStorage.getItem('devicesETag')
+    }
+  });
+
+  if (response.status === 304) { // Not Modified
+    return JSON.parse(localStorage.devicesCache);
+  }
+
+  const data = await response.json();
+  localStorage.setItem('devicesETag', response.headers.get('ETag'));
+  localStorage.setItem('devicesCache', JSON.stringify(data));
+  return data;
+}
+```
+
+## 🔒 Security Middleware (Enhanced)
+
+### Cache-Aware Rate Limiting
+```typescript
+app.use(
+  rateLimit({
+    windowMs: 15 * 60 * 1000, // 15 minutes
+    max: 100, // Limit each IP to 100 requests per window
+    cache: new RedisStore(), // Distributed cache
+    keyGenerator: (req) => {
+      return `${req.ip}-${req.headers.authorization}`;
+    }
+  })
+);
+```
+
+### Security Headers
+```http
+Content-Security-Policy: default-src 'self';
+Strict-Transport-Security: max-age=31536000;
+X-Content-Type-Options: nosniff;
+Cache-Control: public, max-age=600;
+ETag: "abc123"
+```
+
+## 📘 Best Practices
+
+1. **Cache Wisely:** Use `ETag` and `Cache-Control` headers for state data
+2. **Batch Operations:** Combine requests using `/api/actions/batch`
+3. **WebSocket First:** Prefer real-time updates over polling
+4. **Error Recovery:** Implement exponential backoff with jitter
+5. **Cache Invalidation:** Use tags for bulk invalidation
+
+```mermaid
+graph LR
+  A[Client] -->|Cached Request| B{CDN}
+  B -->|Cache Hit| C[Return 304]
+  B -->|Cache Miss| D[Origin Server]
+  D -->|Response| B
+  B -->|Response| A
+```
+
+> Pro Tip: Use `curl -I` to inspect cache headers! 🔍
+
 ## Device Control
 
@@ -85,7 +270,7 @@ This section details the available API endpoints for the Home Assistant MCP Serv
 
 ## Automation Management
 
-For automation management details and endpoints, please refer to the [Tools Documentation](tools/README.md).
+For automation management details and endpoints, please refer to the [Tools Documentation](tools/tools.md).
 
 ## Security Considerations
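The API guide's best-practice list above recommends exponential backoff with jitter for error recovery. A minimal sketch of that retry strategy follows; the `fetchWithRetry` helper and its default delays are illustrative, not part of the documented API:

```typescript
// Full-jitter exponential backoff: delay grows as baseMs * 2^attempt,
// capped at maxMs, then randomized over [0, cap) to spread out retries.
function backoffDelay(attempt: number, baseMs = 250, maxMs = 30_000): number {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.random() * cap;
}

// Retry only on 429 (rate limited) and 5xx (transient server errors).
async function fetchWithRetry(url: string, retries = 5) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429 && res.status < 500) return res;
    if (attempt >= retries) return res;
    await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
  }
}
```

The jitter matters when many clients hit the same rate limit at once: without it, they all retry in lockstep and collide again.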
@@ -4,7 +4,7 @@ Begin your journey with the Home Assistant MCP Server by following these steps:
 - **API Documentation:** Read the [API Documentation](api.md) for available endpoints.
 - **Real-Time Updates:** Learn about [Server-Sent Events](sse-api.md) for live communication.
-- **Tools:** Explore available [Tools](tools/README.md) for device control and automation.
+- **Tools:** Explore available [Tools](tools/tools.md) for device control and automation.
 - **Configuration:** Refer to the [Configuration Guide](configuration.md) for setup and advanced settings.
 
 ## Troubleshooting
@@ -19,7 +19,7 @@ If you encounter any issues:
 For contributors:
 1. Fork the repository.
 2. Create a feature branch.
-3. Follow the [Development Guide](development/README.md) for contribution guidelines.
+3. Follow the [Development Guide](development/development.md) for contribution guidelines.
 4. Submit a pull request with your enhancements.
 
 ## Support
@@ -4,6 +4,29 @@ title: Home
 nav_order: 1
 ---
 
+# 📚 Home Assistant MCP Documentation
+
+Welcome to the documentation for the Home Assistant MCP (Model Context Protocol) Server.
+
+## 📑 Documentation Index
+
+- [Getting Started Guide](getting-started.md)
+- [API Documentation](api.md)
+- [Troubleshooting](troubleshooting.md)
+- [Contributing Guide](contributing.md)
+
+For project overview, installation, and general information, please see our [main README](../README.md).
+
+## 🔗 Quick Links
+
+- [GitHub Repository](https://github.com/jango-blockchained/homeassistant-mcp)
+- [Issue Tracker](https://github.com/jango-blockchained/homeassistant-mcp/issues)
+- [GitHub Discussions](https://github.com/jango-blockchained/homeassistant-mcp/discussions)
+
+## 📝 License
+
+This project is licensed under the MIT License. See [LICENSE](../LICENSE) for details.
+
 # Model Context Protocol (MCP) Server
 
 ## Overview
@@ -56,7 +79,7 @@ The Model Context Protocol (MCP) Server is a cutting-edge bridge between Home As
 - Security Settings
 - Performance Tuning
 
-6. [Development Guide](development/README.md)
+6. [Development Guide](development/development.md)
 - Project Structure
 - Contributing Guidelines
 - Testing
@@ -158,7 +158,7 @@ A: Adjust SSE_MAX_CLIENTS in configuration or clean up stale connections.
 1. Documentation
 - [API Reference](./API.md)
 - [Configuration Guide](./configuration/README.md)
-- [Development Guide](./development/README.md)
+- [Development Guide](./development/development.md)
 
 2. Community
 - GitHub Issues
@@ -21,13 +21,13 @@ This guide explains how to use the Home Assistant MCP Server for smart home devi
 - See [API Documentation](api.md) for details.
 
 2. **Tool Integrations:**
-- Multiple tools are available (see [Tools Documentation](tools/README.md)), for tasks like automation management and notifications.
+- Multiple tools are available (see [Tools Documentation](tools/tools.md)), for tasks like automation management and notifications.
 
 3. **Security Settings:**
 - Configure token-based authentication and environment variables as per the [Configuration Guide](getting-started/configuration.md).
 
 4. **Customization and Extensions:**
-- Extend server functionality by developing new tools as outlined in the [Development Guide](development/README.md).
+- Extend server functionality by developing new tools as outlined in the [Development Guide](development/development.md).
 
 ## Troubleshooting
 
examples/README.md (new file, 91 lines)
@@ -0,0 +1,91 @@
+# Speech-to-Text Examples
+
+This directory contains examples demonstrating how to use the speech-to-text integration with wake word detection.
+
+## Prerequisites
+
+1. Make sure you have Docker installed and running
+2. Build and start the services:
+   ```bash
+   docker-compose up -d
+   ```
+
+## Running the Example
+
+1. Install dependencies:
+   ```bash
+   npm install
+   ```
+
+2. Run the example:
+   ```bash
+   npm run example:speech
+   ```
+
+Or using `ts-node` directly:
+```bash
+npx ts-node examples/speech-to-text-example.ts
+```
+
+## Features Demonstrated
+
+1. **Wake Word Detection**
+   - Listens for wake words: "hey jarvis", "ok google", "alexa"
+   - Automatically saves audio when wake word is detected
+   - Transcribes the detected speech
+
+2. **Manual Transcription**
+   - Example of how to transcribe audio files manually
+   - Supports different models and configurations
+
+3. **Event Handling**
+   - Wake word detection events
+   - Transcription results
+   - Progress updates
+   - Error handling
+
+## Example Output
+
+When a wake word is detected, you'll see output like this:
+
+```
+🎤 Wake word detected!
+  Timestamp: 20240203_123456
+  Audio file: /path/to/audio/wake_word_20240203_123456.wav
+  Metadata file: /path/to/audio/wake_word_20240203_123456.wav.json
+
+📝 Transcription result:
+  Full text: This is what was said after the wake word.
+
+  Segments:
+  1. [0.00s - 1.52s] (95.5% confidence)
+     "This is what was said"
+  2. [1.52s - 2.34s] (98.2% confidence)
+     "after the wake word."
+```
+
+## Customization
+
+You can customize the behavior by:
+
+1. Changing the wake word models in `docker/speech/Dockerfile`
+2. Modifying transcription options in the example file
+3. Adding your own event handlers
+4. Implementing different audio processing logic
+
+## Troubleshooting
+
+1. **Docker Issues**
+   - Make sure Docker is running
+   - Check container logs: `docker-compose logs fast-whisper`
+   - Verify container is up: `docker ps`
+
+2. **Audio Issues**
+   - Check audio device permissions
+   - Verify audio file format (WAV files recommended)
+   - Check audio file permissions
+
+3. **Performance Issues**
+   - Try using a smaller model (tiny.en or base.en)
+   - Adjust beam size and patience parameters
+   - Consider using GPU acceleration if available
examples/speech-to-text-example.ts (new file, 91 lines)
@@ -0,0 +1,91 @@
+import { SpeechToText, TranscriptionResult, WakeWordEvent } from '../src/speech/speechToText';
+import path from 'path';
+
+async function main() {
+  // Initialize the speech-to-text service
+  const speech = new SpeechToText('fast-whisper');
+
+  // Check if the service is available
+  const isHealthy = await speech.checkHealth();
+  if (!isHealthy) {
+    console.error('Speech service is not available. Make sure Docker is running and the fast-whisper container is up.');
+    console.error('Run: docker-compose up -d');
+    process.exit(1);
+  }
+
+  console.log('Speech service is ready!');
+  console.log('Listening for wake words: "hey jarvis", "ok google", "alexa"');
+  console.log('Press Ctrl+C to exit');
+
+  // Set up event handlers
+  speech.on('wake_word', (event: WakeWordEvent) => {
+    console.log('\n🎤 Wake word detected!');
+    console.log('  Timestamp:', event.timestamp);
+    console.log('  Audio file:', event.audioFile);
+    console.log('  Metadata file:', event.metadataFile);
+  });
+
+  speech.on('transcription', (event: { audioFile: string; result: TranscriptionResult }) => {
+    console.log('\n📝 Transcription result:');
+    console.log('  Full text:', event.result.text);
+    console.log('\n  Segments:');
+    event.result.segments.forEach((segment, index) => {
+      console.log(`  ${index + 1}. [${segment.start.toFixed(2)}s - ${segment.end.toFixed(2)}s] (${(segment.confidence * 100).toFixed(1)}% confidence)`);
+      console.log(`     "${segment.text}"`);
+    });
+  });
+
+  speech.on('progress', (event: { type: string; data: string }) => {
+    if (event.type === 'stderr' && !event.data.includes('Loading model')) {
+      console.error('❌ Error:', event.data);
+    }
+  });
+
+  speech.on('error', (error: Error) => {
+    console.error('❌ Error:', error.message);
+  });
+
+  // Example of manual transcription
+  async function transcribeFile(filepath: string) {
+    try {
+      console.log(`\n🎯 Manually transcribing: ${filepath}`);
+      const result = await speech.transcribeAudio(filepath, {
+        model: 'base.en', // You can change this to tiny.en, small.en, medium.en, or large-v2
+        language: 'en',
+        temperature: 0,
+        beamSize: 5
+      });
+
+      console.log('\n📝 Transcription result:');
+      console.log('  Text:', result.text);
+    } catch (error) {
+      console.error('❌ Transcription failed:', error instanceof Error ? error.message : error);
+    }
+  }
+
+  // Create audio directory if it doesn't exist
+  const audioDir = path.join(__dirname, '..', 'audio');
+  if (!require('fs').existsSync(audioDir)) {
+    require('fs').mkdirSync(audioDir, { recursive: true });
+  }
+
+  // Start wake word detection
+  speech.startWakeWordDetection(audioDir);
+
+  // Example: You can also manually transcribe files
+  // Uncomment the following line and replace with your audio file:
+  // await transcribeFile('/path/to/your/audio.wav');
+
+  // Keep the process running
+  process.on('SIGINT', () => {
+    console.log('\nStopping speech service...');
+    speech.stopWakeWordDetection();
+    process.exit(0);
+  });
+}
+
+// Run the example
+main().catch(error => {
+  console.error('Fatal error:', error);
+  process.exit(1);
+});
@@ -21,7 +21,8 @@
 "profile": "bun --inspect src/index.ts",
 "clean": "rm -rf dist .bun coverage",
 "typecheck": "bun x tsc --noEmit",
-"preinstall": "bun install --frozen-lockfile"
+"preinstall": "bun install --frozen-lockfile",
+"example:speech": "bun run examples/speech-to-text-example.ts"
 },
 "dependencies": {
 "@elysiajs/cors": "^1.2.0",
@@ -33,6 +33,21 @@ export const AppConfigSchema = z.object({
 HASS_HOST: z.string().default("http://192.168.178.63:8123"),
 HASS_TOKEN: z.string().optional(),
 
+/** Speech Features Configuration */
+SPEECH: z.object({
+  ENABLED: z.boolean().default(false),
+  WAKE_WORD_ENABLED: z.boolean().default(false),
+  SPEECH_TO_TEXT_ENABLED: z.boolean().default(false),
+  WHISPER_MODEL_PATH: z.string().default("/models"),
+  WHISPER_MODEL_TYPE: z.string().default("base"),
+}).default({
+  ENABLED: false,
+  WAKE_WORD_ENABLED: false,
+  SPEECH_TO_TEXT_ENABLED: false,
+  WHISPER_MODEL_PATH: "/models",
+  WHISPER_MODEL_TYPE: "base",
+}),
+
 /** Security Configuration */
 JWT_SECRET: z.string().default("your-secret-key"),
 RATE_LIMIT: z.object({
@@ -113,4 +128,11 @@ export const APP_CONFIG = AppConfigSchema.parse({
 LOG_REQUESTS: process.env.LOG_REQUESTS === "true",
 },
 VERSION: "0.1.0",
+SPEECH: {
+  ENABLED: process.env.ENABLE_SPEECH_FEATURES === "true",
+  WAKE_WORD_ENABLED: process.env.ENABLE_WAKE_WORD === "true",
+  SPEECH_TO_TEXT_ENABLED: process.env.ENABLE_SPEECH_TO_TEXT === "true",
+  WHISPER_MODEL_PATH: process.env.WHISPER_MODEL_PATH || "/models",
+  WHISPER_MODEL_TYPE: process.env.WHISPER_MODEL_TYPE || "base",
+},
 });
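The `SPEECH` config above coerces environment strings with a strict `=== "true"` comparison, so any other value (`"1"`, `"TRUE"`, or unset) leaves the flag off. A minimal sketch of that behavior; the `parseFlag` helper is hypothetical, but the comparison mirrors the diff:

```typescript
// Only the exact lowercase string "true" enables a flag,
// matching the `process.env.X === "true"` pattern used in the config.
function parseFlag(value: string | undefined): boolean {
  return value === "true";
}

const speechEnabled = parseFlag(process.env.ENABLE_SPEECH_FEATURES);
console.log(`speech enabled: ${speechEnabled}`);
```

This is why the `.env` example sets `ENABLE_SPEECH_FEATURES=false` explicitly: anything other than the literal `true` behaves the same as the flag being absent.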
src/index.ts (20 changed lines)
@@ -25,6 +25,8 @@ import {
 climateCommands,
 type Command,
 } from "./commands.js";
+import { speechService } from "./speech/index.js";
+import { APP_CONFIG } from "./config/app.config.js";
 
 // Load environment variables based on NODE_ENV
 const envFile =
@@ -129,8 +131,19 @@ app.get("/health", () => ({
 status: "ok",
 timestamp: new Date().toISOString(),
 version: "0.1.0",
+speech_enabled: APP_CONFIG.SPEECH.ENABLED,
+wake_word_enabled: APP_CONFIG.SPEECH.WAKE_WORD_ENABLED,
+speech_to_text_enabled: APP_CONFIG.SPEECH.SPEECH_TO_TEXT_ENABLED,
 }));
 
+// Initialize speech service if enabled
+if (APP_CONFIG.SPEECH.ENABLED) {
+  console.log("Initializing speech service...");
+  speechService.initialize().catch((error) => {
+    console.error("Failed to initialize speech service:", error);
+  });
+}
+
 // Create API endpoints for each tool
 tools.forEach((tool) => {
 app.post(`/api/tools/${tool.name}`, async ({ body }: { body: Record<string, unknown> }) => {
@@ -145,7 +158,12 @@ app.listen(PORT, () => {
 });
 
 // Handle server shutdown
-process.on("SIGTERM", () => {
+process.on("SIGTERM", async () => {
 console.log("Received SIGTERM. Shutting down gracefully...");
+if (APP_CONFIG.SPEECH.ENABLED) {
+  await speechService.shutdown().catch((error) => {
+    console.error("Error shutting down speech service:", error);
+  });
+}
 process.exit(0);
 });
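The SIGTERM handler above awaits `speechService.shutdown()` before exiting. A common hardening, shown here as an illustrative sketch rather than project code, is to race cleanup against a timeout so a hung service cannot block process exit:

```typescript
// Race a shutdown promise against a timer so the handler always completes,
// even if cleanup hangs. timeoutMs is an assumed default, not from the diff.
async function shutdownWithTimeout(
  shutdown: () => Promise<void>,
  timeoutMs = 5000,
): Promise<"clean" | "timeout"> {
  const timer = new Promise<"timeout">((resolve) =>
    setTimeout(() => resolve("timeout"), timeoutMs),
  );
  const clean = shutdown().then(() => "clean" as const);
  return Promise.race([clean, timer]);
}
```

In the SIGTERM handler this would replace the bare `await speechService.shutdown()`, logging a warning when the result is `"timeout"` before calling `process.exit(0)`.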
src/speech/__tests__/fixtures/test.wav (new empty file)

src/speech/__tests__/speechToText.test.ts (new file, 116 lines)
@@ -0,0 +1,116 @@
+import { SpeechToText, WakeWordEvent, TranscriptionError } from '../speechToText';
+import fs from 'fs';
+import path from 'path';
+
+describe('SpeechToText', () => {
+  let speechToText: SpeechToText;
+  const testAudioDir = path.join(__dirname, 'test_audio');
+
+  beforeEach(() => {
+    speechToText = new SpeechToText('fast-whisper');
+    // Create test audio directory if it doesn't exist
+    if (!fs.existsSync(testAudioDir)) {
+      fs.mkdirSync(testAudioDir, { recursive: true });
+    }
+  });
+
+  afterEach(() => {
+    speechToText.stopWakeWordDetection();
+    // Clean up test files
+    if (fs.existsSync(testAudioDir)) {
+      fs.rmSync(testAudioDir, { recursive: true, force: true });
+    }
+  });
+
+  describe('checkHealth', () => {
+    it('should handle Docker not being available', async () => {
+      const isHealthy = await speechToText.checkHealth();
+      expect(isHealthy).toBeDefined();
+      expect(isHealthy).toBe(false);
+    });
+  });
+
+  describe('wake word detection', () => {
+    it('should detect new audio files and emit wake word events', (done) => {
+      const testFile = path.join(testAudioDir, 'wake_word_test_123456.wav');
+      const testMetadata = `${testFile}.json`;
+
+      speechToText.startWakeWordDetection(testAudioDir);
+
+      speechToText.on('wake_word', (event: WakeWordEvent) => {
+        expect(event).toBeDefined();
+        expect(event.audioFile).toBe(testFile);
+        expect(event.metadataFile).toBe(testMetadata);
+        expect(event.timestamp).toBe('123456');
+        done();
+      });
+
+      // Create a test audio file to trigger the event
+      fs.writeFileSync(testFile, 'test audio content');
+    }, 1000);
+
+    it('should handle transcription errors when Docker is not available', (done) => {
+      const testFile = path.join(testAudioDir, 'wake_word_test_123456.wav');
+
+      let errorEmitted = false;
+      let wakeWordEmitted = false;
+
+      const checkDone = () => {
+        if (errorEmitted && wakeWordEmitted) {
+          done();
+        }
+      };
+
+      speechToText.on('error', (error) => {
+        expect(error).toBeDefined();
+        expect(error).toBeInstanceOf(TranscriptionError);
+        expect(error.message).toContain('Failed to start Docker process');
+        errorEmitted = true;
+        checkDone();
+      });
+
+      speechToText.on('wake_word', () => {
+        wakeWordEmitted = true;
+        checkDone();
+      });
+
+      speechToText.startWakeWordDetection(testAudioDir);
+
+      // Create a test audio file to trigger the event
+      fs.writeFileSync(testFile, 'test audio content');
+    }, 1000);
+  });
+
+  describe('transcribeAudio', () => {
+    it('should handle Docker not being available for transcription', async () => {
+      await expect(
+        speechToText.transcribeAudio('/audio/test.wav')
+      ).rejects.toThrow(TranscriptionError);
+    });
+
+    it('should emit progress events on error', (done) => {
+      let progressEmitted = false;
+      let errorThrown = false;
+
+      const checkDone = () => {
+        if (progressEmitted && errorThrown) {
+          done();
+        }
+      };
+
+      speechToText.on('progress', (event: { type: string; data: string }) => {
+        expect(event.type).toBe('stderr');
+        expect(event.data).toBe('Failed to start Docker process');
+        progressEmitted = true;
+        checkDone();
+      });
+
+      speechToText.transcribeAudio('/audio/test.wav')
+        .catch((error) => {
+          expect(error).toBeInstanceOf(TranscriptionError);
+          errorThrown = true;
+          checkDone();
+        });
+    }, 1000);
+  });
+});
110
src/speech/index.ts
Normal file
110
src/speech/index.ts
Normal file
@@ -0,0 +1,110 @@
|
|||||||
|
```typescript
import { APP_CONFIG } from "../config/app.config.js";
import { logger } from "../utils/logger.js";
import type { IWakeWordDetector, ISpeechToText } from "./types.js";

class SpeechService {
    private static instance: SpeechService | null = null;
    private isInitialized: boolean = false;
    private wakeWordDetector: IWakeWordDetector | null = null;
    private speechToText: ISpeechToText | null = null;

    private constructor() { }

    public static getInstance(): SpeechService {
        if (!SpeechService.instance) {
            SpeechService.instance = new SpeechService();
        }
        return SpeechService.instance;
    }

    public async initialize(): Promise<void> {
        if (this.isInitialized) {
            return;
        }

        if (!APP_CONFIG.SPEECH.ENABLED) {
            logger.info("Speech features are disabled. Skipping initialization.");
            return;
        }

        try {
            // Initialize components based on configuration
            if (APP_CONFIG.SPEECH.WAKE_WORD_ENABLED) {
                logger.info("Initializing wake word detection...");
                // Dynamic import to avoid loading the module if not needed
                const { WakeWordDetector } = await import("./wakeWordDetector.js");
                this.wakeWordDetector = new WakeWordDetector() as IWakeWordDetector;
                await this.wakeWordDetector.initialize();
            }

            if (APP_CONFIG.SPEECH.SPEECH_TO_TEXT_ENABLED) {
                logger.info("Initializing speech-to-text...");
                // Dynamic import to avoid loading the module if not needed
                const { SpeechToText } = await import("./speechToText.js");
                this.speechToText = new SpeechToText({
                    modelPath: APP_CONFIG.SPEECH.WHISPER_MODEL_PATH,
                    modelType: APP_CONFIG.SPEECH.WHISPER_MODEL_TYPE,
                }) as ISpeechToText;
                await this.speechToText.initialize();
            }

            this.isInitialized = true;
            logger.info("Speech service initialized successfully");
        } catch (error) {
            logger.error("Failed to initialize speech service:", error);
            throw error;
        }
    }

    public async shutdown(): Promise<void> {
        if (!this.isInitialized) {
            return;
        }

        try {
            if (this.wakeWordDetector) {
                await this.wakeWordDetector.shutdown();
                this.wakeWordDetector = null;
            }

            if (this.speechToText) {
                await this.speechToText.shutdown();
                this.speechToText = null;
            }

            this.isInitialized = false;
            logger.info("Speech service shut down successfully");
        } catch (error) {
            logger.error("Error during speech service shutdown:", error);
            throw error;
        }
    }

    public isEnabled(): boolean {
        return APP_CONFIG.SPEECH.ENABLED;
    }

    public isWakeWordEnabled(): boolean {
        return APP_CONFIG.SPEECH.WAKE_WORD_ENABLED;
    }

    public isSpeechToTextEnabled(): boolean {
        return APP_CONFIG.SPEECH.SPEECH_TO_TEXT_ENABLED;
    }

    public getWakeWordDetector(): IWakeWordDetector {
        if (!this.isInitialized || !this.wakeWordDetector) {
            throw new Error("Wake word detector is not initialized");
        }
        return this.wakeWordDetector;
    }

    public getSpeechToText(): ISpeechToText {
        if (!this.isInitialized || !this.speechToText) {
            throw new Error("Speech-to-text is not initialized");
        }
        return this.speechToText;
    }
}

export const speechService = SpeechService.getInstance();
```
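`SpeechService` combines a lazy singleton with an idempotent `initialize()` that becomes a no-op once the service is up. A self-contained sketch of that shape (the class and method names below are illustrative, not the module's actual exports):

```typescript
// Lazy singleton with idempotent initialization: repeated getInstance()
// calls return the same object, repeated initialize() calls are no-ops.
class LazyService {
    private static instance: LazyService | null = null;
    private initialized = false;

    private constructor() { } // construction only via getInstance()

    static getInstance(): LazyService {
        if (!LazyService.instance) {
            LazyService.instance = new LazyService();
        }
        return LazyService.instance;
    }

    initialize(): void {
        if (this.initialized) {
            return; // already up: safe to call again
        }
        this.initialized = true;
    }

    isInitialized(): boolean {
        return this.initialized;
    }
}

const svc = LazyService.getInstance();
svc.initialize();
svc.initialize(); // second call is a no-op
console.log(LazyService.getInstance() === svc); // true: same instance
console.log(svc.isInitialized()); // true
```

The dynamic `await import(...)` calls in the real service extend this idea: heavy modules are only loaded when the corresponding feature flag is enabled.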
247	src/speech/speechToText.ts	Normal file
@@ -0,0 +1,247 @@
```typescript
import { spawn } from 'child_process';
import { EventEmitter } from 'events';
import { watch } from 'fs';
import path from 'path';
import { ISpeechToText, SpeechToTextConfig } from "./types.js";

export interface TranscriptionOptions {
    model?: 'tiny.en' | 'base.en' | 'small.en' | 'medium.en' | 'large-v2';
    language?: string;
    temperature?: number;
    beamSize?: number;
    patience?: number;
    device?: 'cpu' | 'cuda';
}

export interface TranscriptionResult {
    text: string;
    segments: Array<{
        text: string;
        start: number;
        end: number;
        confidence: number;
    }>;
}

export interface WakeWordEvent {
    timestamp: string;
    audioFile: string;
    metadataFile: string;
}

export class TranscriptionError extends Error {
    constructor(message: string) {
        super(message);
        this.name = 'TranscriptionError';
    }
}

export class SpeechToText extends EventEmitter implements ISpeechToText {
    private containerName: string;
    private audioWatcher?: ReturnType<typeof watch>;
    private modelPath: string;
    private modelType: string;
    private isInitialized: boolean = false;

    constructor(config: SpeechToTextConfig) {
        super();
        this.containerName = config.containerName || 'fast-whisper';
        this.modelPath = config.modelPath;
        this.modelType = config.modelType;
    }

    public async initialize(): Promise<void> {
        if (this.isInitialized) {
            return;
        }
        try {
            // Initialization logic will be implemented here
            await this.setupContainer();
            this.isInitialized = true;
            this.emit('ready');
        } catch (error) {
            this.emit('error', error);
            throw error;
        }
    }

    public async shutdown(): Promise<void> {
        if (!this.isInitialized) {
            return;
        }
        try {
            // Cleanup logic will be implemented here
            await this.cleanupContainer();
            this.isInitialized = false;
            this.emit('shutdown');
        } catch (error) {
            this.emit('error', error);
            throw error;
        }
    }

    public async transcribe(audioData: Buffer): Promise<string> {
        if (!this.isInitialized) {
            throw new Error("Speech-to-text service is not initialized");
        }
        try {
            // Transcription logic will be implemented here
            this.emit('transcribing');
            const result = await this.processAudio(audioData);
            this.emit('transcribed', result);
            return result;
        } catch (error) {
            this.emit('error', error);
            throw error;
        }
    }

    private async setupContainer(): Promise<void> {
        // Container setup logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }

    private async cleanupContainer(): Promise<void> {
        // Container cleanup logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }

    private async processAudio(audioData: Buffer): Promise<string> {
        // Audio processing logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
        return "Transcription placeholder";
    }

    startWakeWordDetection(audioDir: string = './audio'): void {
        // Watch for new audio files from wake word detection
        this.audioWatcher = watch(audioDir, (eventType, filename) => {
            if (eventType === 'rename' && filename && filename.startsWith('wake_word_') && filename.endsWith('.wav')) {
                const audioFile = path.join(audioDir, filename);
                const metadataFile = `${audioFile}.json`;
                const parts = filename.split('_');
                const timestamp = parts[parts.length - 1].split('.')[0];

                // Emit wake word event
                this.emit('wake_word', {
                    timestamp,
                    audioFile,
                    metadataFile
                } as WakeWordEvent);

                // Automatically transcribe the wake word audio
                this.transcribeAudio(audioFile)
                    .then(result => {
                        this.emit('transcription', { audioFile, result });
                    })
                    .catch(error => {
                        this.emit('error', error);
                    });
            }
        });
    }

    stopWakeWordDetection(): void {
        if (this.audioWatcher) {
            this.audioWatcher.close();
            this.audioWatcher = undefined;
        }
    }

    async transcribeAudio(
        audioFilePath: string,
        options: TranscriptionOptions = {}
    ): Promise<TranscriptionResult> {
        const {
            model = 'base.en',
            language = 'en',
            temperature = 0,
            beamSize = 5,
            patience = 1,
            device = 'cpu'
        } = options;

        return new Promise((resolve, reject) => {
            const args = [
                'exec',
                this.containerName,
                'fast-whisper',
                '--model', model,
                '--language', language,
                '--temperature', temperature.toString(),
                '--beam-size', beamSize.toString(),
                '--patience', patience.toString(),
                '--device', device,
                '--output-json',
                audioFilePath
            ];

            let process;
            try {
                process = spawn('docker', args);
            } catch (error) {
                this.emit('progress', { type: 'stderr', data: 'Failed to start Docker process' });
                reject(new TranscriptionError('Failed to start Docker process'));
                return;
            }

            let stdout = '';
            let stderr = '';

            process.stdout?.on('data', (data: Buffer) => {
                stdout += data.toString();
                this.emit('progress', { type: 'stdout', data: data.toString() });
            });

            process.stderr?.on('data', (data: Buffer) => {
                stderr += data.toString();
                this.emit('progress', { type: 'stderr', data: data.toString() });
            });

            process.on('error', (error: Error) => {
                this.emit('progress', { type: 'stderr', data: error.message });
                reject(new TranscriptionError(`Failed to execute Docker command: ${error.message}`));
            });

            process.on('close', (code: number) => {
                if (code !== 0) {
                    reject(new TranscriptionError(`Transcription failed: ${stderr}`));
                    return;
                }

                try {
                    const result = JSON.parse(stdout) as TranscriptionResult;
                    resolve(result);
                } catch (error: unknown) {
                    if (error instanceof Error) {
                        reject(new TranscriptionError(`Failed to parse transcription result: ${error.message}`));
                    } else {
                        reject(new TranscriptionError('Failed to parse transcription result: Unknown error'));
                    }
                }
            });
        });
    }

    async checkHealth(): Promise<boolean> {
        try {
            const process = spawn('docker', ['ps', '--filter', `name=${this.containerName}`, '--format', '{{.Status}}']);

            return new Promise((resolve) => {
                let output = '';
                process.stdout?.on('data', (data: Buffer) => {
                    output += data.toString();
                });

                process.on('error', () => {
                    resolve(false);
                });

                process.on('close', (code: number) => {
                    resolve(code === 0 && output.toLowerCase().includes('up'));
                });
            });
        } catch (error) {
            return false;
        }
    }
}
```
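`startWakeWordDetection` relies on a filename convention: wake word captures land in the watched directory as `wake_word_<timestamp>.wav`, and the timestamp is recovered as the last underscore-separated segment with the extension stripped. A self-contained sketch of that parsing step (`parseWakeWordFile` is a hypothetical helper for illustration, not an export of this module):

```typescript
// Parse the wake_word_<timestamp>.wav convention used by the file watcher.
// Returns null for files that are not wake word captures.
function parseWakeWordFile(filename: string): { timestamp: string } | null {
    if (!filename.startsWith('wake_word_') || !filename.endsWith('.wav')) {
        return null; // ignore unrelated files in the watched directory
    }
    const parts = filename.split('_');
    const timestamp = parts[parts.length - 1].split('.')[0];
    return { timestamp };
}

console.log(parseWakeWordFile('wake_word_1700000000.wav')); // { timestamp: '1700000000' }
console.log(parseWakeWordFile('notes.wav')); // null
```

Matching files also get a sibling `<audioFile>.json` metadata path before the `wake_word` event is emitted and transcription is kicked off.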
20	src/speech/types.ts	Normal file
@@ -0,0 +1,20 @@
```typescript
import { EventEmitter } from "events";

export interface IWakeWordDetector {
    initialize(): Promise<void>;
    shutdown(): Promise<void>;
    startListening(): Promise<void>;
    stopListening(): Promise<void>;
}

export interface ISpeechToText extends EventEmitter {
    initialize(): Promise<void>;
    shutdown(): Promise<void>;
    transcribe(audioData: Buffer): Promise<string>;
}

export interface SpeechToTextConfig {
    modelPath: string;
    modelType: string;
    containerName?: string;
}
```
64	src/speech/wakeWordDetector.ts	Normal file
@@ -0,0 +1,64 @@
```typescript
import { IWakeWordDetector } from "./types.js";

export class WakeWordDetector implements IWakeWordDetector {
    private isListening: boolean = false;
    private isInitialized: boolean = false;

    public async initialize(): Promise<void> {
        if (this.isInitialized) {
            return;
        }
        // Initialization logic will be implemented here
        await this.setupDetector();
        this.isInitialized = true;
    }

    public async shutdown(): Promise<void> {
        if (this.isListening) {
            await this.stopListening();
        }
        if (this.isInitialized) {
            await this.cleanupDetector();
            this.isInitialized = false;
        }
    }

    public async startListening(): Promise<void> {
        if (!this.isInitialized) {
            throw new Error("Wake word detector is not initialized");
        }
        if (this.isListening) {
            return;
        }
        await this.startDetection();
        this.isListening = true;
    }

    public async stopListening(): Promise<void> {
        if (!this.isListening) {
            return;
        }
        await this.stopDetection();
        this.isListening = false;
    }

    private async setupDetector(): Promise<void> {
        // Setup logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }

    private async cleanupDetector(): Promise<void> {
        // Cleanup logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }

    private async startDetection(): Promise<void> {
        // Start detection logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }

    private async stopDetection(): Promise<void> {
        // Stop detection logic will be implemented here
        await new Promise(resolve => setTimeout(resolve, 100)); // Placeholder
    }
}
```