# 🎤 Bilingual Voice Assistant - Google AIY Voice Kit V1

**AI Now Inc - Del Mar Demo Unit**

**Laboratory Assistant:** Claw 🏭

A bilingual (English/Mandarin) voice-activated assistant for Google AIY Voice Kit V1 with music playback capability.

## Features

- ✅ **Bilingual Support** - English and Mandarin Chinese speech recognition
- ✅ **Text-to-Speech** - Respond in the detected language
- ✅ **Music Playback** - Play MP3 files by voice command
- ✅ **Remote Communication** - Connect to OpenClaw assistant via API
- ✅ **Offline Capability** - Basic commands work without internet
- ✅ **Hotword Detection** - "Hey Assistant" / "你好助手" wake word

## Hardware Requirements

- **Google AIY Voice Kit V1** (with Voice HAT)
- **Raspberry Pi** (3B/3B+/4B recommended)
- **MicroSD Card** (8GB+)
- **Speaker** (3.5mm or HDMI audio)
- **Microphone** (included with AIY Kit)
- **Internet Connection** (WiFi/Ethernet)

## Software Architecture

```
┌─────────────────────────────────────────────────────────┐
│                 Google AIY Voice Kit V1                 │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │   Hotword   │  │    Speech    │  │   Command    │    │
│  │  Detection  │→ │ Recognition  │→ │  Processing  │    │
│  └─────────────┘  └──────────────┘  └──────────────┘    │
│         ↓                 ↓                             │
│  ┌──────────────────────────────────────────────────┐   │
│  │            Language Detection (en/zh)            │   │
│  └──────────────────────────────────────────────────┘   │
│                           ↓                             │
│  ┌──────────────────────────────────────────────────┐   │
│  │            OpenClaw API Communication            │   │
│  └──────────────────────────────────────────────────┘   │
│                           ↓                             │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │     TTS     │  │ Music Player │  │   Response   │    │
│  │   (en/zh)   │  │    (MP3)     │  │   Handler    │    │
│  └─────────────┘  └──────────────┘  └──────────────┘    │
└─────────────────────────────────────────────────────────┘
```

## Installation

### 1. Set Up Google AIY Voice Kit

```bash
# Update system
sudo apt-get update
sudo apt-get upgrade

# Install AIY Voice Kit software
cd ~
git clone https://github.com/google/aiyprojects-raspbian.git
cd aiyprojects-raspbian
bash install.sh
sudo reboot
```

### 2. Install Dependencies

```bash
# Python dependencies
pip3 install google-cloud-speech google-cloud-texttospeech
pip3 install pygame mutagen
pip3 install requests websocket-client
pip3 install langdetect
```

### 3. Configure Google Cloud (Optional - for cloud services)

```bash
# Set up Google Cloud credentials
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
```

## Configuration

Edit `config.json`:

```json
{
  "openclaw": {
    "enabled": true,
    "ws_url": "ws://192.168.1.100:18790",
    "api_key": "your_api_key"
  },
  "speech": {
    "language": "auto",
    "hotword": "hey assistant|你好助手"
  },
  "music": {
    "library_path": "/home/pi/Music",
    "default_volume": 0.7
  },
  "tts": {
    "english_voice": "en-US-Standard-A",
    "chinese_voice": "zh-CN-Standard-A"
  }
}
```

## Usage

### Start the Assistant

```bash
cd /home/pi/voice-assistant
python3 main.py
```

### Voice Commands

#### General Commands

- "Hey Assistant, what time is it?" / "你好助手,现在几点?"
- "Hey Assistant, how are you?" / "你好助手,你好吗?"
- "Hey Assistant, tell me a joke" / "你好助手,讲个笑话"

#### Music Commands

- "Hey Assistant, play [song name]" / "你好助手,播放 [歌曲名]"
- "Hey Assistant, pause" / "你好助手,暂停"
- "Hey Assistant, resume" / "你好助手,继续"
- "Hey Assistant, stop" / "你好助手,停止"
- "Hey Assistant, next track" / "你好助手,下一首"
- "Hey Assistant, volume up" / "你好助手,音量加大"

#### OpenClaw Commands

- "Hey Assistant, ask Claw: [your question]"
- "你好助手,问 Claw:[你的问题]"

## Project Structure

```
voice-assistant/
├── main.py                 # Main entry point
├── config.json             # Configuration file
├── assistant.py            # Core assistant logic
├── speech_recognizer.py    # Speech recognition (en/zh)
├── tts_engine.py           # Text-to-speech engine
├── music_player.py         # MP3 playback control
├── openclaw_client.py      # OpenClaw API client
├── hotword_detector.py     # Wake word detection
├── requirements.txt        # Python dependencies
└── samples/                # Sample audio files
```

## Language Detection

The system automatically detects the spoken language:

- **English keywords** → English response
- **Chinese keywords** → Mandarin response
- **Mixed input** → Respond in dominant language

## Music Library

Organize your MP3 files:

```
/home/pi/Music/
├── artist1/
│   ├── song1.mp3
│   └── song2.mp3
├── artist2/
│   └── song3.mp3
└── playlist/
    └── favorites.mp3
```

## Advanced Features

### Custom Hotword

Train your own hotword using Porcupine or Snowboy.

### Offline Speech Recognition

Use Vosk or PocketSphinx for offline recognition.

### Multi-room Audio

Stream audio to multiple devices via Snapcast.

### Voice Profiles

Recognize different users and personalize responses.

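
Returning to the "dominant language" rule from the Language Detection section: the sketch below is one possible way to decide between `en` and `zh` for a mixed transcript, and is not necessarily how `speech_recognizer.py` does it (the project may rely on `langdetect` instead). It is a simple script-counting heuristic that treats each Chinese character and each Latin-letter word as one unit. The names `detect_language` and `pick_voice` are hypothetical; only the `tts` keys come from `config.json`.

```python
import re

# Basic CJK Unified Ideographs block; covers common Mandarin characters.
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
LATIN_WORD = re.compile(r"[A-Za-z]+")


def detect_language(text, default="en"):
    """Return "zh" or "en" for whichever script dominates the transcript."""
    zh_units = len(CJK_CHAR.findall(text))    # one unit per Chinese character
    en_units = len(LATIN_WORD.findall(text))  # one unit per Latin-letter word
    if zh_units > en_units:
        return "zh"
    if en_units > zh_units:
        return "en"
    return default  # tie, or no recognizable text at all


def pick_voice(text, tts_config):
    """Choose a voice name from the "tts" block of config.json (hypothetical helper)."""
    key = "chinese_voice" if detect_language(text) == "zh" else "english_voice"
    return tts_config[key]


if __name__ == "__main__":
    # Voice names copied from the sample config.json above.
    tts = {"english_voice": "en-US-Standard-A", "chinese_voice": "zh-CN-Standard-A"}
    for phrase in ("what time is it?", "现在几点?", "播放 Yesterday"):
        print(phrase, "->", detect_language(phrase), pick_voice(phrase, tts))
```

Counting words on the English side and characters on the Chinese side keeps a long English song title from outvoting a short Mandarin command purely by letter count. Ties fall back to `default`, which could be wired to the `speech.language` setting when it is not `"auto"`.
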
## Troubleshooting

### Microphone not detected

```bash
arecord -l    # List audio devices
alsamixer     # Check levels
```

### Poor speech recognition

- Speak clearly and closer to the microphone
- Reduce background noise
- Check internet connection for cloud recognition

### Music playback issues

```bash
# Test audio output
speaker-test -t wav

# Check volume
alsamixer
```

## Next Steps

- [ ] Add voice profile recognition
- [ ] Implement offline speech recognition
- [ ] Add Spotify/Apple Music integration
- [ ] Create web UI for music library management
- [ ] Add multi-language support (Spanish, French, etc.)
- [ ] Implement voice commands for industrial control

---

**AI Now Inc** - Del Mar Show Demo Unit

**Contact:** Laboratory Assistant Claw 🏭

**Version:** 1.0.0