Install - Voicebot

Choose Your Path

Quick Install (Recommended)

For end users who want to get started immediately. The installer automatically downloads the required models.

curl -fsSL https://github.com/madcato/voicebot/releases/latest/download/install.sh | sh
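If you prefer to read the installer before it runs, one option is to save it to a file instead of piping it straight into sh. The sketch below is only an illustration: `fetch_script` is a hypothetical helper name, not part of Voicebot.

```shell
# Download a script to a local file so it can be reviewed before running.
# fetch_script is a hypothetical helper, not something Voicebot ships.
fetch_script() {
  url=$1
  dest=$2
  # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
  curl -fsSL "$url" -o "$dest" && echo "saved: $dest"
}
```

For example: run `fetch_script https://github.com/madcato/voicebot/releases/latest/download/install.sh install.sh`, read the file with `less install.sh`, then start it with `sh install.sh`.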

Developer Build

For developers who want to build from source and customize configuration.

git clone https://github.com/madcato/voicebot.git
cd voicebot
cp .env.example .env
# Edit .env with your settings
cargo build --release

System Requirements

Operating System

macOS 12.0+ (Monterey or later)

Apple Silicon (M-series) recommended for optimal performance

Rust Toolchain

Stable Rust required

rustup install stable

Optional Dependencies

Kokoro TTS: brew install espeak-ng
MCP Servers: brew install node
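To see which of these tools are already on your PATH, a small check like the following can help. It is a minimal sketch: `check_tools` is a hypothetical helper, and the tool names you pass in are assumed to match the binaries that rustup and Homebrew install (`cargo`, `espeak-ng`, `node`).

```shell
# Report which of the given command-line tools are missing from PATH.
# Prints one line per tool and returns non-zero if any is missing.
# check_tools is a hypothetical helper, not part of Voicebot.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `check_tools cargo espeak-ng node`. Only `cargo` is needed for the standard build; the other two matter only if you use Kokoro TTS or MCP servers.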

Step-by-Step Developer Setup

Clone the Repository

git clone https://github.com/madcato/voicebot.git
cd voicebot

Configure Environment

Copy the example configuration and customize:

cp .env.example .env
nano .env

Minimum required settings:

Variable	Default	Description
`WHISPER_MODEL`	-	Path to Whisper .bin model
`LLM_URL`	`http://127.0.0.1:8000`	LLM server URL
`LLM_MODEL`	`local-model`	Model name/path

Example .env configuration:

WHISPER_MODEL=./models/ggml-small.bin
WHISPER_COREML=0
LLM_URL=http://127.0.0.1:8000
LLM_MODEL=mlx-community/Qwen3-8B-4bit
TTS_PROVIDER=avspeech
AVSPEECH_VOICE="Jorge (Enhanced)"
AVSPEECH_RATE=0.55
VOICEBOT_LANGUAGE=es
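Before building, it can be worth confirming that the settings from the table above are actually present in .env. The following is a rough sketch, not a Voicebot tool: `check_env` is a hypothetical helper that only looks for non-empty `NAME=value` lines and does not validate the values themselves.

```shell
# Verify that each named variable has a non-empty assignment in an env file.
# Usage: check_env FILE NAME...
# check_env is a hypothetical helper, not part of Voicebot.
check_env() {
  env_file=$1
  shift
  rc=0
  if [ ! -f "$env_file" ]; then
    echo "no such file: $env_file"
    return 1
  fi
  for name in "$@"; do
    # Commented-out lines and empty assignments (NAME=) do not count.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "missing: $name"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: `check_env .env WHISPER_MODEL LLM_URL LLM_MODEL`. Since the table lists defaults for `LLM_URL` and `LLM_MODEL`, a "missing" report for those two may be harmless, whereas `WHISPER_MODEL` has no default.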

Start the LLM Server

Voicebot requires an external LLM server. We recommend mlx-lm for Apple Silicon.

Option A: Using the helper script

./scripts/start-mlx-lm.sh mlx-community/Qwen3-8B-4bit

Option B: Manual start

mlx_lm.server \
  --model mlx-community/Qwen3-8B-4bit \
  --host 127.0.0.1 --port 8000 \
  --prompt-cache-size 1 \
  --chat-template-args '{"enable_thinking": false}'

Option C: oMLX (alternative)

./scripts/start-omlx.sh ~/models
# Set in .env: LLM_URL=http://127.0.0.1:8001
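Whichever option you choose, the model can take a while to load, so it may help to wait until the server answers before starting Voicebot. The sketch below polls a URL with curl. `wait_for_url` is a hypothetical helper, and the `/v1/models` path in the usage note assumes the server exposes an OpenAI-style API; adjust it if yours differs.

```shell
# Poll a URL until it answers or the attempt budget runs out.
# Usage: wait_for_url URL [ATTEMPTS]   (one second between attempts)
# wait_for_url is a hypothetical helper, not part of Voicebot.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: quiet, -o /dev/null: discard body
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
  done
  echo "timeout: $url"
  return 1
}
```

For example: `wait_for_url http://127.0.0.1:8000/v1/models && cargo run --release`. With Option C, use port 8001 to match the `LLM_URL` set in .env.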

Build & Run

Standard build (AVSpeech TTS - macOS default)

cargo build --release
cargo run --release

With Kokoro TTS (high-quality, ONNX-based)

cargo build --features kokoro --release
TTS_PROVIDER=kokoro cargo run --features kokoro --release

With Terminal UI

cargo build --features tui --release
cargo run --features tui --release

With HTTP Control API + SSE

cargo build --features control --release
CONTROL_PORT=9001 cargo run --features control --release

Model Setup

The installer automatically downloads required models. For manual setup:

Model	Purpose	Source
Whisper STT	Speech-to-text	HuggingFace (ggml-small.bin)
Silero VAD	Voice activity detection	sherpa-onnx (ggml-silero-vad.bin)
Kokoro TTS	Text-to-speech (Linux)	Kokoro GitHub release

Manual Model Downloads

# Create the target directory first; wget -O does not create missing directories
mkdir -p ./models

# wget is not bundled with macOS; install it with: brew install wget

# Whisper STT model
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin -O ./models/ggml-small.bin

# Silero VAD model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx -O ./models/ggml-silero-vad.bin

# Optional: Kokoro TTS models
wget https://github.com/leloykun/kokoro/releases/download/v1.0/kokoro-v1.0.onnx -O ./models/kokoro-v1.0.onnx
wget https://github.com/leloykun/kokoro/releases/download/v1.0/voices-v1.0.bin -O ./models/voices-v1.0.bin
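A failed download can leave an empty file behind under the name given to `-O`, so a quick check that each model file exists and is non-empty may save some debugging later. This is a minimal sketch: `check_models` is a hypothetical helper, and it only tests file size, not whether the contents are a valid model.

```shell
# Check that each expected model file exists and is not empty.
# Usage: check_models DIR FILE...
# check_models is a hypothetical helper, not part of Voicebot.
check_models() {
  dir=$1
  shift
  rc=0
  for f in "$@"; do
    # -s is true only for files that exist and have a size greater than zero
    if [ -s "$dir/$f" ]; then
      echo "ok: $f"
    else
      echo "missing or empty: $f"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: `check_models ./models ggml-small.bin ggml-silero-vad.bin`, adding `kokoro-v1.0.onnx` and `voices-v1.0.bin` if you use Kokoro TTS.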

Troubleshooting

"No audio device found"

# List available devices
cargo run -- --list-devices

# Then set in .env:
AUDIO_INPUT_DEVICE="Microphone"
AUDIO_OUTPUT_DEVICE="Speaker"

# For multiple matches, use index suffix:
AUDIO_INPUT_DEVICE="Poly Sync 20-M#0"

TTS not working

AVSpeech: Check voices with say -v '?' (quote the ? so that zsh does not treat it as a glob pattern)
Kokoro: Ensure models exist and espeak-ng is installed
Check feature flags: --features avspeech or --features kokoro

High latency

Reduce VAD_SILENCE_MS to 150-200ms
Use CoreML STT: WHISPER_COREML=1
Verify LLM server has Metal acceleration
Check performance logs: RUST_LOG=performance=debug cargo run

Next Steps

Once installed, learn how Voicebot works or check out the contributing guidelines.