2026-03-01 12:19:10 -08:00

🎤 Bilingual Voice Assistant - Google AIY Voice Kit V1

AI Now Inc - Del Mar Demo Unit
Laboratory Assistant: Claw 🏭

A bilingual (English/Mandarin) voice-activated assistant for Google AIY Voice Kit V1 with music playback capability.

Features

  • Bilingual Support - English and Mandarin Chinese speech recognition
  • Text-to-Speech - Respond in the detected language
  • Music Playback - Play MP3 files by voice command
  • Remote Communication - Connect to OpenClaw assistant via API
  • Offline Capability - Basic commands work without internet
  • Hotword Detection - "Hey Assistant" / "你好助手" wake word

Hardware Requirements

  • Google AIY Voice Kit V1 (with Voice HAT)
  • Raspberry Pi (3B/3B+/4B recommended)
  • MicroSD Card (8GB+)
  • Speaker (3.5mm or HDMI audio)
  • Microphone (included with AIY Kit)
  • Internet Connection (WiFi/Ethernet) - needed for cloud recognition; basic commands work offline

Software Architecture

┌─────────────────────────────────────────────────────────┐
│  Google AIY Voice Kit V1                                │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ Hotword     │  │ Speech       │  │ Command      │  │
│  │ Detection   │→ │ Recognition  │→ │ Processing   │  │
│  └─────────────┘  └──────────────┘  └──────────────┘  │
│                          ↓                  ↓           │
│  ┌──────────────────────────────────────────────────┐  │
│  │           Language Detection (en/zh)            │  │
│  └──────────────────────────────────────────────────┘  │
│                          ↓                               │
│  ┌──────────────────────────────────────────────────┐  │
│  │         OpenClaw API Communication               │  │
│  └──────────────────────────────────────────────────┘  │
│                          ↓                               │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ TTS         │  │ Music Player │  │ Response     │  │
│  │ (en/zh)     │  │ (MP3)        │  │ Handler      │  │
│  └─────────────┘  └──────────────┘  └──────────────┘  │
└─────────────────────────────────────────────────────────┘
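One way to read the diagram is as a chain of stages, each handing its result to the next. The sketch below is only an illustration under that assumption: `Pipeline`, `handle`, and the stage names are hypothetical and do not come from the project's modules. The stages are passed in as plain callables so that stubs can stand in for real audio code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Chains the stages from the diagram. All names here are hypothetical."""
    heard_hotword: Callable[[bytes], bool]      # hotword detection on raw audio
    recognize: Callable[[bytes], str]           # speech recognition -> transcript
    detect_language: Callable[[str], str]       # returns "en" or "zh"
    process: Callable[[str, str], str]          # command handling / OpenClaw call
    speak: Callable[[str, str], None]           # TTS in the detected language

    def handle(self, audio: bytes) -> Optional[str]:
        """Run one utterance through every stage; None means it was ignored."""
        if not self.heard_hotword(audio):
            return None
        text = self.recognize(audio)
        lang = self.detect_language(text)
        reply = self.process(text, lang)
        self.speak(reply, lang)
        return reply


# Stub stages, used only to show the data flow without any audio hardware.
spoken = []
demo = Pipeline(
    heard_hotword=lambda audio: audio.startswith(b"hey"),
    recognize=lambda audio: audio.decode("utf-8"),
    detect_language=lambda text: "en",
    process=lambda text, lang: f"[{lang}] you said: {text}",
    speak=lambda reply, lang: spoken.append(reply),
)
```

Keeping each stage behind a callable makes it possible to swap a cloud recognizer for an offline one without touching the rest of the chain.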

Installation

1. Setup Google AIY Voice Kit

# Update system
sudo apt-get update
sudo apt-get upgrade

# Install AIY Voice Kit software
cd ~
git clone https://github.com/google/aiyprojects-raspbian.git
cd aiyprojects-raspbian
bash install.sh
sudo reboot

2. Install Dependencies

# Python dependencies
pip3 install google-cloud-speech google-cloud-texttospeech
pip3 install pygame mutagen
pip3 install requests websocket-client
pip3 install langdetect

3. Configure Google Cloud (Optional - for cloud services)

# Set up Google Cloud credentials
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
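Because the cloud services are optional, the assistant needs some way to tell whether credentials are present. A minimal sketch, assuming the choice is a simple cloud/offline switch: `choose_backend` and the backend labels are hypothetical names, while `GOOGLE_APPLICATION_CREDENTIALS` is the variable exported above.

```python
import os
from pathlib import Path
from typing import Mapping


def cloud_credentials_available(env: Mapping[str, str] = os.environ) -> bool:
    """True if GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and Path(path).is_file()


def choose_backend(env: Mapping[str, str] = os.environ) -> str:
    """Pick "cloud" when credentials exist, otherwise fall back to "offline"."""
    return "cloud" if cloud_credentials_available(env) else "offline"
```

This only checks that the file exists; it does not verify that the key inside is valid.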

Configuration

Edit config.json:

{
  "openclaw": {
    "enabled": true,
    "ws_url": "ws://192.168.1.100:18790",
    "api_key": "your_api_key"
  },
  "speech": {
    "language": "auto",
    "hotword": "hey assistant|你好助手"
  },
  "music": {
    "library_path": "/home/pi/Music",
    "default_volume": 0.7
  },
  "tts": {
    "english_voice": "en-US-Standard-A",
    "chinese_voice": "zh-CN-Standard-A"
  }
}
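A small loader can catch a malformed config.json before the assistant starts. The sketch below assumes the four top-level sections shown above are all required, that `default_volume` is a fraction between 0.0 and 1.0, and that the `|` in `hotword` separates alternative wake phrases. `load_config` and `hotwords` are hypothetical helper names.

```python
import json
from typing import List

REQUIRED_SECTIONS = ("openclaw", "speech", "music", "tts")


def load_config(text: str) -> dict:
    """Parse config.json text and check the fields shown above."""
    config = json.loads(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError(f"missing config sections: {', '.join(missing)}")
    volume = config["music"]["default_volume"]
    if not 0.0 <= volume <= 1.0:
        raise ValueError("music.default_volume must be between 0.0 and 1.0")
    return config


def hotwords(config: dict) -> List[str]:
    """Split the pipe-separated hotword string into individual phrases."""
    raw = config["speech"]["hotword"]
    return [word.strip() for word in raw.split("|") if word.strip()]
```

Called on the contents of the file above, `hotwords(load_config(...))` would yield the English and Mandarin wake phrases as two separate strings.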

Usage

Start the Assistant

cd /home/pi/voice-assistant
python3 main.py

Voice Commands

General Commands

  • "Hey Assistant, what time is it?" / "你好助手,现在几点?"
  • "Hey Assistant, how are you?" / "你好助手,你好吗?"
  • "Hey Assistant, tell me a joke" / "你好助手,讲个笑话"

Music Commands

  • "Hey Assistant, play [song name]" / "你好助手,播放 [歌曲名]"
  • "Hey Assistant, pause" / "你好助手,暂停"
  • "Hey Assistant, resume" / "你好助手,继续"
  • "Hey Assistant, stop" / "你好助手,停止"
  • "Hey Assistant, next track" / "你好助手,下一首"
  • "Hey Assistant, volume up" / "你好助手,音量加大"

OpenClaw Commands

  • "Hey Assistant, ask Claw: [your question]" / "你好助手,问 Claw:[你的问题]"
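Forwarding a question to OpenClaw first requires separating the wake word and the "ask Claw" prefix from the question itself. A rough sketch of that step, assuming the transcript arrives as text: `strip_hotword` and `claw_question` are hypothetical names, and how the question is then sent to OpenClaw is left out.

```python
import re
from typing import Optional

HOTWORDS = ("hey assistant", "你好助手")

# Accepts "ask Claw" or "问 Claw", an optional ASCII or full-width separator,
# then the question text.
CLAW_PATTERN = re.compile(
    r"^\s*(?:ask\s+claw|问\s*claw)\s*[::,,]?\s*(.+?)\s*$",
    re.IGNORECASE | re.DOTALL,
)


def strip_hotword(text: str) -> Optional[str]:
    """Return the text after the wake word, or None if no wake word leads it."""
    stripped = text.strip()
    for word in HOTWORDS:
        if stripped[:len(word)].lower() == word:
            return stripped[len(word):].lstrip(" ,,")
    return None


def claw_question(utterance: str) -> Optional[str]:
    """Extract the question from a full "Hey Assistant, ask Claw: ..." utterance."""
    command = strip_hotword(utterance)
    if command is None:
        return None
    match = CLAW_PATTERN.match(command)
    return match.group(1) if match else None
```

Utterances without the wake word, or without a question after "Claw", return None so the caller can fall through to other command handlers.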

Project Structure

voice-assistant/
├── main.py                 # Main entry point
├── config.json             # Configuration file
├── assistant.py            # Core assistant logic
├── speech_recognizer.py    # Speech recognition (en/zh)
├── tts_engine.py           # Text-to-speech engine
├── music_player.py         # MP3 playback control
├── openclaw_client.py      # OpenClaw API client
├── hotword_detector.py     # Wake word detection
├── requirements.txt        # Python dependencies
└── samples/                # Sample audio files

Language Detection

The system automatically detects the spoken language:

  • English keywords → English response
  • Chinese keywords → Mandarin response
  • Mixed input → Respond in dominant language
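The "dominant language" rule for mixed input can be approximated by counting Chinese characters against English words. This is a stand-in heuristic for illustration, not the project's keyword matching and not the `langdetect` package; `dominant_language` and the English default on ties are assumptions.

```python
import re

CHINESE_CHAR = re.compile(r"[\u4e00-\u9fff]")   # CJK Unified Ideographs
ENGLISH_WORD = re.compile(r"[A-Za-z]+")


def dominant_language(text: str, default: str = "en") -> str:
    """Return "zh" or "en" depending on which language contributes more units.

    A Chinese character and an English word each count as one unit, since a
    single character often carries about as much meaning as a short word.
    """
    chinese = len(CHINESE_CHAR.findall(text))
    english = len(ENGLISH_WORD.findall(text))
    if chinese == english:
        return default
    return "zh" if chinese > english else "en"
```

Counting words rather than letters keeps a single English song title from outweighing a Mandarin command such as "播放 Yesterday".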

Music Library

Organize your MP3 files:

/home/pi/Music/
├── artist1/
│   ├── song1.mp3
│   └── song2.mp3
├── artist2/
│   └── song3.mp3
└── playlist/
    └── favorites.mp3
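With this layout, a spoken song name can be resolved by indexing file names and picking the closest match. A minimal sketch using only the standard library: `index_library` and `find_track` are hypothetical helpers, and the 0.6 similarity cutoff is an arbitrary starting point.

```python
import difflib
from pathlib import Path
from typing import Dict, Optional


def index_library(root: Path) -> Dict[str, Path]:
    """Map lowercased file stems to paths, searching all subfolders.

    Only files ending in lowercase ".mp3" are matched on case-sensitive
    filesystems; tracks sharing a stem overwrite each other in this sketch.
    """
    return {path.stem.lower(): path for path in sorted(root.rglob("*.mp3"))}


def find_track(query: str, index: Dict[str, Path],
               cutoff: float = 0.6) -> Optional[Path]:
    """Return the track whose name is closest to the query, if close enough."""
    matches = difflib.get_close_matches(query.lower(), list(index),
                                        n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None
```

Fuzzy matching tolerates small recognition errors, at the cost of occasionally picking a similarly named track.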

Advanced Features

Custom Hotword

Train your own hotword using Porcupine or Snowboy. Note that Snowboy is no longer maintained by its original developer.

Offline Speech Recognition

Use Vosk or PocketSphinx for offline recognition.

Multi-room Audio

Stream audio to multiple devices via Snapcast.

Voice Profiles

Recognize different users and personalize responses.

Troubleshooting

Microphone not detected

arecord -l  # List capture (recording) devices
alsamixer   # Check microphone levels
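The same check can be run from Python at startup so the assistant fails early when no microphone is present. This sketch assumes `arecord -l` prints one `card N: name [description], device M: ...` line per capture device; `parse_capture_devices` and `list_capture_devices` are hypothetical helpers.

```python
import re
import subprocess
from typing import List, Tuple

# Matches lines such as "card 1: <name> [<description>], device 0: ..."
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def parse_capture_devices(listing: str) -> List[Tuple[int, int, str]]:
    """Extract (card, device, name) tuples from `arecord -l` output."""
    devices = []
    for line in listing.splitlines():
        match = CARD_LINE.match(line)
        if match:
            devices.append((int(match.group(1)), int(match.group(4)),
                            match.group(2)))
    return devices


def list_capture_devices() -> List[Tuple[int, int, str]]:
    """Run `arecord -l` and parse its output (requires alsa-utils)."""
    result = subprocess.run(["arecord", "-l"], capture_output=True,
                            text=True, check=False)
    return parse_capture_devices(result.stdout)
```

An empty list from `list_capture_devices()` suggests the Voice HAT driver is not loaded or the microphone board is disconnected.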

Poor speech recognition

  • Speak clearly and closer to the microphone
  • Reduce background noise
  • Check internet connection for cloud recognition

Music playback issues

# Test audio output
speaker-test -t wav

# Check volume
alsamixer

Next Steps

  • Add voice profile recognition
  • Implement offline speech recognition
  • Add Spotify/Apple Music integration
  • Create web UI for music library management
  • Add support for additional languages (Spanish, French, etc.)
  • Implement voice commands for industrial control

AI Now Inc - Del Mar Show Demo Unit
Contact: Laboratory Assistant Claw 🏭
Version: 1.0.0
