
# 🎤 Bilingual Voice Assistant - Google AIY Voice Kit V1

**AI Now Inc - Del Mar Demo Unit**
**Laboratory Assistant:** Claw 🏭

A bilingual (English/Mandarin) voice-activated assistant for the Google AIY Voice Kit V1, with MP3 music playback.

## Features

- ✅ **Bilingual Support** - English and Mandarin Chinese speech recognition
- ✅ **Text-to-Speech** - Responds in the detected language
- ✅ **Music Playback** - Plays MP3 files by voice command
- ✅ **Remote Communication** - Connects to the OpenClaw assistant via its API
- ✅ **Offline Capability** - Basic commands work without internet
- ✅ **Hotword Detection** - "Hey Assistant" / "你好助手" wake word

## Hardware Requirements

- **Google AIY Voice Kit V1** (with Voice HAT)
- **Raspberry Pi** (3B/3B+/4B recommended)
- **MicroSD Card** (8GB+)
- **Speaker** (3.5mm or HDMI audio)
- **Microphone** (included with AIY Kit)
- **Internet Connection** (WiFi/Ethernet; needed for cloud speech services and OpenClaw)

## Software Architecture

```
┌─────────────────────────────────────────────────────────┐
│  Google AIY Voice Kit V1                                │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ Hotword     │  │ Speech       │  │ Command      │  │
│  │ Detection   │→ │ Recognition  │→ │ Processing   │  │
│  └─────────────┘  └──────────────┘  └──────────────┘  │
│                          ↓                  ↓           │
│  ┌──────────────────────────────────────────────────┐  │
│  │           Language Detection (en/zh)            │  │
│  └──────────────────────────────────────────────────┘  │
│                          ↓                               │
│  ┌──────────────────────────────────────────────────┐  │
│  │         OpenClaw API Communication               │  │
│  └──────────────────────────────────────────────────┘  │
│                          ↓                               │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ TTS         │  │ Music Player │  │ Response     │  │
│  │ (en/zh)     │  │ (MP3)        │  │ Handler      │  │
│  └─────────────┘  └──────────────┘  └──────────────┘  │
└─────────────────────────────────────────────────────────┘
```
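
The flow in the diagram can be sketched as a chain of pluggable stages. This is only an illustration, assuming each stage is supplied as a plain callable; `VoicePipeline` and its field names are hypothetical and do not come from the project code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Chains the stages in the diagram; each stage is an injected callable."""

    heard_hotword: Callable[[bytes], bool]   # hotword detection on raw audio
    recognize: Callable[[bytes], str]        # speech recognition -> transcript
    detect_language: Callable[[str], str]    # returns "en" or "zh"
    handle: Callable[[str, str], str]        # local command or OpenClaw request
    speak: Callable[[str, str], None]        # TTS in the detected language

    def process(self, audio: bytes) -> Optional[str]:
        """Run one utterance through the pipeline; None if no hotword."""
        if not self.heard_hotword(audio):
            return None
        text = self.recognize(audio)
        language = self.detect_language(text)
        reply = self.handle(text, language)
        self.speak(reply, language)
        return reply
```

Keeping the stages as callables means a cloud recognizer can be swapped for an offline one without touching the rest of the chain.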

## Installation

### 1. Set Up the Google AIY Voice Kit

```bash
# Update system
sudo apt-get update
sudo apt-get upgrade

# Install AIY Voice Kit software
cd ~
git clone https://github.com/google/aiyprojects-raspbian.git
cd aiyprojects-raspbian
bash install.sh
sudo reboot
```

### 2. Install Dependencies

```bash
# Python dependencies
pip3 install google-cloud-speech google-cloud-texttospeech
pip3 install pygame mutagen
pip3 install requests websocket-client
pip3 install langdetect
```

### 3. Configure Google Cloud (Optional - for cloud services)

```bash
# Set up Google Cloud credentials
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
```
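
Because the cloud services are optional, the assistant could check for credentials at startup and fall back to offline commands when they are missing. A minimal sketch of such a check follows; `cloud_credentials_available` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path


def cloud_credentials_available(env=None) -> bool:
    """Return True if GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    # Accept a custom mapping so the check is easy to test; default to os.environ.
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and Path(path).is_file()
```

This only confirms that the file exists. It does not validate the key or contact Google Cloud.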

## Configuration

Edit `config.json`:

```json
{
  "openclaw": {
    "enabled": true,
    "ws_url": "ws://192.168.1.100:18790",
    "api_key": "your_api_key"
  },
  "speech": {
    "language": "auto",
    "hotword": "hey assistant|你好助手"
  },
  "music": {
    "library_path": "/home/pi/Music",
    "default_volume": 0.7
  },
  "tts": {
    "english_voice": "en-US-Standard-A",
    "chinese_voice": "zh-CN-Standard-A"
  }
}
```
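
A minimal sketch of how this file might be loaded, assuming missing keys should fall back to defaults. `DEFAULTS`, `merge_config`, and `load_config` are hypothetical names, and the choice to disable OpenClaw by default is an assumption.

```python
import json
from pathlib import Path

# Hypothetical defaults based on the sample config.json;
# OpenClaw stays disabled unless the user configures it.
DEFAULTS = {
    "openclaw": {"enabled": False, "ws_url": "", "api_key": ""},
    "speech": {"language": "auto", "hotword": "hey assistant|你好助手"},
    "music": {"library_path": "/home/pi/Music", "default_volume": 0.7},
    "tts": {"english_voice": "en-US-Standard-A", "chinese_voice": "zh-CN-Standard-A"},
}


def merge_config(user: dict) -> dict:
    """Overlay user settings on DEFAULTS, one section at a time."""
    merged = {section: dict(values) for section, values in DEFAULTS.items()}
    for section, values in user.items():
        merged.setdefault(section, {}).update(values)
    # Clamp the volume to the 0.0-1.0 range used by pygame's mixer.
    volume = float(merged["music"]["default_volume"])
    merged["music"]["default_volume"] = min(1.0, max(0.0, volume))
    return merged


def load_config(path: str = "config.json") -> dict:
    """Read config.json if it exists; otherwise return the defaults."""
    file = Path(path)
    user = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    return merge_config(user)
```

Reading the file as UTF-8 matters here because the hotword setting contains Chinese characters.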

## Usage

### Start the Assistant

```bash
cd /home/pi/voice-assistant
python3 main.py
```

### Voice Commands

#### General Commands
- "Hey Assistant, what time is it?" / "你好助手，现在几点？"
- "Hey Assistant, how are you?" / "你好助手，你好吗？"
- "Hey Assistant, tell me a joke" / "你好助手，讲个笑话"

#### Music Commands
- "Hey Assistant, play [song name]" / "你好助手，播放 [歌曲名]"
- "Hey Assistant, pause" / "你好助手，暂停"
- "Hey Assistant, resume" / "你好助手，继续"
- "Hey Assistant, stop" / "你好助手，停止"
- "Hey Assistant, next track" / "你好助手，下一首"
- "Hey Assistant, volume up" / "你好助手，音量加大"

#### OpenClaw Commands
- "Hey Assistant, ask Claw: [your question]"
- "你好助手，问 Claw：[你的问题]"
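
The commands above could be routed with simple prefix matching on the recognized transcript. The sketch below assumes the hotword appears at the start of the transcript and that anything unmatched is a general question. The intent names, the `INTENTS` table, and `parse_command` are hypothetical. This matches text only; it is not acoustic hotword detection.

```python
# Same "a|b" format as the speech.hotword setting in config.json.
HOTWORDS = tuple("hey assistant|你好助手".split("|"))

# Hypothetical intent table: (intent, English prefix, Mandarin prefix).
INTENTS = [
    ("ask_claw", "ask claw", "问 claw"),
    ("next", "next track", "下一首"),
    ("volume_up", "volume up", "音量加大"),
    ("pause", "pause", "暂停"),
    ("resume", "resume", "继续"),
    ("stop", "stop", "停止"),
    ("play", "play", "播放"),
]

# Separators that may follow a hotword or command prefix (ASCII and full-width).
SEPARATORS = " ,，:："


def parse_command(transcript: str):
    """Return (intent, argument), or None if no hotword starts the transcript."""
    text = transcript.strip().lower()
    for hotword in HOTWORDS:
        if text.startswith(hotword):
            text = text[len(hotword):].lstrip(SEPARATORS)
            break
    else:
        return None  # no hotword, so the utterance is ignored
    for intent, english, mandarin in INTENTS:
        for prefix in (english, mandarin):
            if text.startswith(prefix):
                return intent, text[len(prefix):].strip(SEPARATORS)
    return "chat", text  # anything else is treated as a general question
```

For example, `parse_command("Hey Assistant, play Yesterday")` yields the `play` intent with `yesterday` as its argument. The transcript is lowercased before matching, so arguments lose their original case.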

## Project Structure

```
voice-assistant/
├── main.py                 # Main entry point
├── config.json             # Configuration file
├── assistant.py            # Core assistant logic
├── speech_recognizer.py    # Speech recognition (en/zh)
├── tts_engine.py           # Text-to-speech engine
├── music_player.py         # MP3 playback control
├── openclaw_client.py      # OpenClaw API client
├── hotword_detector.py     # Wake word detection
├── requirements.txt        # Python dependencies
└── samples/                # Sample audio files
```

## Language Detection

The system automatically detects the spoken language:

- **English keywords** → English response
- **Chinese keywords** → Mandarin response
- **Mixed input** → Response in the dominant language
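
One way to implement the "dominant language" rule is a character-script heuristic. This is a sketch under that assumption, not the `langdetect` package installed earlier. It weighs each Chinese character against each English word and defaults to English on a tie. `detect_language` and `pick_voice` are hypothetical names.

```python
import re


def detect_language(text: str) -> str:
    """Return "zh" or "en" depending on which script dominates the text."""
    # The CJK Unified Ideographs block covers common Mandarin characters.
    chinese = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    # Count English words, not letters, so one Chinese character
    # is weighed against one English word.
    english = len(re.findall(r"[A-Za-z]+", text))
    return "zh" if chinese > english else "en"


def pick_voice(text: str, tts_config: dict) -> str:
    """Choose the TTS voice from the "tts" section of config.json."""
    key = "chinese_voice" if detect_language(text) == "zh" else "english_voice"
    return tts_config[key]
```

The heuristic is cheap enough to run on every utterance on a Raspberry Pi. It cannot tell Mandarin from other languages written in Chinese characters.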

## Music Library

Organize your MP3 files:

```
/home/pi/Music/
├── artist1/
│   ├── song1.mp3
│   └── song2.mp3
├── artist2/
│   └── song3.mp3
└── playlist/
    └── favorites.mp3
```
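
With this layout, a spoken song name could be resolved to a file by indexing the library and fuzzy-matching on the file name. This is a minimal sketch using only the standard library. `index_library` and `find_track` are hypothetical helpers, and the 0.6 similarity cutoff is an arbitrary starting point.

```python
import difflib
from pathlib import Path


def index_library(library_path: str) -> dict:
    """Map lowercase track titles (file stems) to their MP3 paths."""
    root = Path(library_path)
    # rglob walks artist and playlist subfolders; the pattern is
    # case-sensitive on Linux, so "SONG.MP3" would be skipped.
    return {path.stem.lower(): path for path in sorted(root.rglob("*.mp3"))}


def find_track(query: str, index: dict):
    """Return the path whose title best matches the query, or None."""
    matches = difflib.get_close_matches(query.lower(), list(index), n=1, cutoff=0.6)
    return index[matches[0]] if matches else None
```

The returned path can then be handed to the player, for example via `pygame.mixer.music.load(str(path))`. Tracks with the same file name in different folders overwrite each other in this index.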

## Advanced Features

### Custom Hotword
Train your own hotword using Porcupine or Snowboy.

### Offline Speech Recognition
Use Vosk or PocketSphinx for offline recognition.

### Multi-room Audio
Stream audio to multiple devices via Snapcast.

### Voice Profiles
Recognize different users and personalize responses.

## Troubleshooting

### Microphone not detected
```bash
arecord -l  # List audio devices
alsamixer  # Check levels
```

### Poor speech recognition
- Speak clearly and closer to the microphone
- Reduce background noise
- Check internet connection for cloud recognition

### Music playback issues
```bash
# Test audio output
speaker-test -t wav

# Check volume
alsamixer
```

## Next Steps

- [ ] Add voice profile recognition
- [ ] Implement offline speech recognition
- [ ] Add Spotify/Apple Music integration
- [ ] Create web UI for music library management
- [ ] Add support for more languages (Spanish, French, etc.)
- [ ] Implement voice commands for industrial control

---

**AI Now Inc** - Del Mar Show Demo Unit
**Contact:** Laboratory Assistant Claw 🏭
**Version:** 1.0.0