docs: Revise README to consolidate core features and enhance speech processing documentation

- Moved the core features section to a more prominent position
- Added detailed setup and configuration instructions for the speech features
- Documented the additional tools in the `extra/` directory for an enhanced Home Assistant experience
- Removed the outdated speech features documentation for clarity
Author: jango-blockchained
Date: 2025-03-15 17:02:55 +01:00
Parent: 90fd0e46f7
Commit: d1cca04e76

README.md (212 lines changed)

@@ -6,6 +6,13 @@
MCP (Model Context Protocol) Server is my lightweight integration tool for Home Assistant, providing a flexible interface for device management and automation. It's designed to be fast, secure, and easy to use. Built with Bun for maximum performance.
## Core Features ✨
- 🔌 Basic device control via REST API
- 📡 WebSocket/Server-Sent Events (SSE) for state updates
- 🤖 Simple automation rule management
- 🔐 JWT-based authentication
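To get a feel for the API surface, here is a minimal sketch of driving the server with `curl`. The port, endpoint paths, and payload shapes below are illustrative assumptions, not the documented API — check the docs for the real routes:
```bash
# Hypothetical endpoints for illustration only - consult the API docs
# for the real routes. Assumes the server listens on localhost:3000
# and that jq is installed.

# Obtain a JWT (assumed /auth/token endpoint)
TOKEN=$(curl -s -X POST http://localhost:3000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"changeme"}' | jq -r '.token')

# Toggle a device via the REST API (assumed /api/devices route)
curl -X POST http://localhost:3000/api/devices/light.living_room/toggle \
  -H "Authorization: Bearer $TOKEN"

# Subscribe to state updates over SSE (assumed /api/events route)
curl -N http://localhost:3000/api/events -H "Authorization: Bearer $TOKEN"
```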
## Why Bun? 🚀
I chose Bun as the runtime for several key benefits:
@@ -38,66 +45,6 @@ I chose Bun as the runtime for several key benefits:
- Compatible with Express/Fastify
- Native Node.js APIs
## Core Features ✨
- 🔌 Basic device control via REST API
- 📡 WebSocket/Server-Sent Events (SSE) for state updates
- 🤖 Simple automation rule management
- 🔐 JWT-based authentication
- 🎤 Optional speech features:
- 🗣️ Wake word detection ("hey jarvis", "ok google", "alexa")
- 🎯 Speech-to-text using fast-whisper
- 🌍 Multiple language support
- 🚀 GPU acceleration support
## System Architecture 📊
```mermaid
flowchart TB
    subgraph Client["Client Applications"]
        direction TB
        Web["Web Interface"]
        Mobile["Mobile Apps"]
        Voice["Voice Control"]
    end
    subgraph MCP["MCP Server"]
        direction TB
        API["REST API"]
        WS["WebSocket/SSE"]
        Auth["Authentication"]
        subgraph Speech["Speech Processing (Optional)"]
            direction TB
            Wake["Wake Word Detection"]
            STT["Speech-to-Text"]
            subgraph STT_Options["STT Options"]
                direction LR
                Whisper["Whisper"]
                FastWhisper["Fast Whisper"]
            end
            Wake --> STT
            STT --> STT_Options
        end
    end
    subgraph HA["Home Assistant"]
        direction TB
        HASS_API["HASS API"]
        HASS_WS["HASS WebSocket"]
        Devices["Smart Devices"]
    end
    Client --> MCP
    MCP --> HA
    HA --> Devices
    style Speech fill:#f9f,stroke:#333,stroke-width:2px
    style STT_Options fill:#bbf,stroke:#333,stroke-width:1px
```
## Prerequisites 📋
- 🚀 [Bun runtime](https://bun.sh) (v1.0.26+)
@@ -135,21 +82,11 @@ NODE_ENV=production ./scripts/setup-env.sh
4. Build and launch with Docker:
```bash
# Build options:
# Standard build
./docker-build.sh
# Build with speech support
./docker-build.sh --speech
# Build with speech and GPU support
./docker-build.sh --speech --gpu
# Launch:
docker compose up -d
# With speech features:
docker compose -f docker-compose.yml -f docker-compose.speech.yml up -d
```
## Docker Build Options 🐳
@@ -213,41 +150,6 @@ Files load in this order:
Later files override earlier ones.
## Speech Features Setup 🎤
### Prerequisites
1. 🐳 Docker installed and running
2. 🎮 NVIDIA GPU with CUDA (optional)
3. 💾 4GB+ RAM (8GB+ recommended)
### Configuration
1. Enable speech in `.env`:
```bash
ENABLE_SPEECH_FEATURES=true
ENABLE_WAKE_WORD=true
ENABLE_SPEECH_TO_TEXT=true
WHISPER_MODEL_PATH=/models
WHISPER_MODEL_TYPE=base
```
2. Choose your STT engine:
```bash
# For standard Whisper
STT_ENGINE=whisper
# For Fast Whisper (GPU recommended)
STT_ENGINE=fast-whisper
CUDA_VISIBLE_DEVICES=0 # Set GPU device
```
### Available Models 🤖
Choose based on your needs:
- `tiny.en`: Fastest, basic accuracy
- `base.en`: Good balance (recommended)
- `small.en`: Better accuracy, slower
- `medium.en`: High accuracy, resource intensive
- `large-v2`: Best accuracy, very resource intensive
## Development 💻
```bash
@@ -291,29 +193,6 @@ bun run start
- [Custom Prompts Guide](docs/prompts.md) - Create and customize AI behavior
- [Extras & Tools](docs/extras.md) - Additional utilities and advanced features
### Extra Tools 🛠️
I've included several powerful tools in the `extra/` directory to enhance your Home Assistant experience:
1. **Home Assistant Analyzer CLI** (`ha-analyzer-cli.ts`)
- Deep automation analysis using AI models
- Security vulnerability scanning
- Performance optimization suggestions
- System health metrics
2. **Speech-to-Text Example** (`speech-to-text-example.ts`)
- Wake word detection
- Speech-to-text transcription
- Multiple language support
- GPU acceleration support
3. **Claude Desktop Setup** (`claude-desktop-macos-setup.sh`)
- Automated Claude Desktop installation for macOS
- Environment configuration
- MCP integration setup
See [Extras Documentation](docs/extras.md) for detailed usage instructions and examples.
## Client Integration 🔗
### Cursor Integration 🖱️
@@ -354,6 +233,83 @@ Windows users can use the provided script:
1. Go to the `scripts` directory
2. Run `start_mcp.cmd`
## Additional Features
### Speech Features 🎤
MCP Server optionally supports speech processing capabilities:
- 🗣️ Wake word detection ("hey jarvis", "ok google", "alexa")
- 🎯 Speech-to-text using fast-whisper
- 🌍 Multiple language support
- 🚀 GPU acceleration support
#### Speech Features Setup
##### Prerequisites
1. 🐳 Docker installed and running
2. 🎮 NVIDIA GPU with CUDA (optional)
3. 💾 4GB+ RAM (8GB+ recommended)
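Before enabling speech, you can sanity-check the first two prerequisites with standard Docker commands (the NVIDIA runtime check only matters if you plan to use GPU acceleration):
```bash
# Verify the Docker daemon is running
docker info --format '{{.ServerVersion}}'

# Optional: check whether Docker can see the NVIDIA runtime
docker info --format '{{json .Runtimes}}'
```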
##### Configuration
1. Enable speech in `.env`:
```bash
ENABLE_SPEECH_FEATURES=true
ENABLE_WAKE_WORD=true
ENABLE_SPEECH_TO_TEXT=true
WHISPER_MODEL_PATH=/models
WHISPER_MODEL_TYPE=base
```
2. Choose your STT engine:
```bash
# For standard Whisper
STT_ENGINE=whisper
# For Fast Whisper (GPU recommended)
STT_ENGINE=fast-whisper
CUDA_VISIBLE_DEVICES=0 # Set GPU device
```
##### Available Models 🤖
Choose based on your needs:
- `tiny.en`: Fastest, basic accuracy
- `base.en`: Good balance (recommended)
- `small.en`: Better accuracy, slower
- `medium.en`: High accuracy, resource intensive
- `large-v2`: Best accuracy, very resource intensive
##### Launch with Speech Features
```bash
# Build with speech support
./docker-build.sh --speech
# Launch with speech features:
docker compose -f docker-compose.yml -f docker-compose.speech.yml up -d
```
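After launching, you can confirm the speech services came up using the standard Compose commands:
```bash
# Check service status (including the speech containers)
docker compose -f docker-compose.yml -f docker-compose.speech.yml ps

# Follow logs while the wake word and STT services initialize
docker compose -f docker-compose.yml -f docker-compose.speech.yml logs -f
```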
### Extra Tools 🛠️
I've included several powerful tools in the `extra/` directory to enhance your Home Assistant experience:
1. **Home Assistant Analyzer CLI** (`ha-analyzer-cli.ts`)
- Deep automation analysis using AI models
- Security vulnerability scanning
- Performance optimization suggestions
- System health metrics
2. **Speech-to-Text Example** (`speech-to-text-example.ts`)
- Wake word detection
- Speech-to-text transcription
- Multiple language support
- GPU acceleration support
3. **Claude Desktop Setup** (`claude-desktop-macos-setup.sh`)
- Automated Claude Desktop installation for macOS
- Environment configuration
- MCP integration setup
See [Extras Documentation](docs/extras.md) for detailed usage instructions and examples.
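Since these tools are TypeScript sources, running them directly under Bun should work; the exact flags and environment variables they expect are covered in the extras documentation:
```bash
# Assumed invocation based on the repo layout - see docs/extras.md
# for supported flags and required environment variables.
bun run extra/ha-analyzer-cli.ts
bun run extra/speech-to-text-example.ts
```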
## License 📄
MIT License. See [LICENSE](LICENSE) for details.