A fast, fully local AI Voicechat using WebSockets
- WebSocket server that allows for simple remote access
- Default web UI w/ VAD using ricky0123/vad, Opus support using symblai/opus-encdec
- Modular/swappable SRT, LLM, TTS servers
- SRT: whisper.cpp, faster-whisper, or HF Transformers whisper
- LLM: llama.cpp or any OpenAI API compatible server
- TTS: coqui-tts, StyleTTS2, Piper, MeloTTS
Demo video: voicechat2.webm (unmute to hear the audio)
On a 7900-class AMD RDNA3 card, voice-to-voice latency is in the 1 second range:
- distil-whisper/distil-large-v2
- bartowski/Meta-Llama-3.1-8B-Instruct-GGUF (Q4_K_M)
- tts_models/en/vctk/vits (Coqui TTS default VITS models)
On a 4090, using Faster Whisper with faster-distil-whisper-large-v2, we can cut the latency down to as low as 300ms:
Demo video: voicechat2-fw.webm
You can of course run any model or swap out any of the SRT, LLM, or TTS components as you like. For example, you can run whisper.cpp for SRT, or use the StyleTTS2 server in the test folder as an alternative TTS. For a bit more about this project, see my Hackster.io writeup.
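Because the LLM piece just needs to speak the OpenAI chat completions API, swapping LLM backends is mostly a matter of pointing voicechat2 at a different endpoint. As a quick sanity check against whatever backend you pick, a plain request like this should work (the host, port, and model name here are assumptions for a locally running llama.cpp server, not settings voicechat2 requires):

```bash
# Assumed example: an OpenAI API compatible server (e.g. llama.cpp's llama-server) on localhost:8080
# The model name is a placeholder; llama.cpp ignores it, other backends may want their own value
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Meta-Llama-3-8B-Instruct",
        "messages": [{"role": "user", "content": "Say hello in one short sentence."}],
        "max_tokens": 32
      }'
```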
These installation instructions are for Ubuntu LTS and assume you've setup your ROCm or CUDA already.
I recommend you use conda or (my preference) mamba for environment management. It will make your life easier.
```bash
sudo apt update
# Not strictly required, but helpers we use
sudo apt install byobu curl wget
# Audio processing
sudo apt install espeak-ng ffmpeg libopus0 libopus-dev

# Create env
mamba create -y -n voicechat2 python=3.11

# Setup
mamba activate voicechat2
git clone https://github.com/lhl/voicechat2
cd voicechat2
pip install -r requirements.txt

# Build llama.cpp (pick the make line that matches your GPU)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# AMD (ROCm) version
make GGML_HIPBLAS=1 -j
# Nvidia (CUDA) version
make GGML_CUDA=1 -j

# Grab your preferred GGUF model
wget https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf

# Go back to the voicechat2 directory if you're continuing with the next step
cd ..
```
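The launch script below will start everything for you, but if you want to sanity-check the llama.cpp build and the downloaded GGUF on their own first, you can run llama.cpp's bundled OpenAI API compatible server by hand. The flags here (GPU offload, context size, host/port) are illustrative defaults, not settings voicechat2 requires:

```bash
# Illustrative only: serve the downloaded GGUF with llama.cpp's OpenAI API compatible server
# -ngl offloads layers to the GPU, -c sets the context size; adjust both for your hardware
cd llama.cpp
./llama-server -m Meta-Llama-3-8B-Instruct-Q4_K_M.gguf -ngl 99 -c 8192 --host 127.0.0.1 --port 8080
```

Once it's up, a chat completions request like the curl example above should return a completion.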
Some extra convenience scripts for launching:
- run-voicechat2.sh - on your GPU machine, tries to launch all servers in separate byobu sessions; update the MODEL variables
- remote-tunnel.sh - connect your GPU machine to a jump machine
- local-tunnel.sh - connect to the GPU machine via a jump machine
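Conceptually, the tunnel pair amounts to standard SSH port forwarding along these lines (the hostnames and the port are placeholders; use whatever port your voicechat2 web server actually listens on):

```bash
# On the GPU machine: publish the voicechat2 port on the jump host (remote forward)
ssh -N -R 8000:localhost:8000 user@jump-host

# On your local machine: pull that port back through the jump host (local forward)
ssh -N -L 8000:localhost:8000 user@jump-host

# Then browse to http://localhost:8000 locally
```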
A project released after voicechat2 that uses a similar modular approach but is local device oriented
- https://github.com/eustlb/speech-to-speech
- No license?
The demo shows a fair amount of latency (~10s), and it isn't the closest match to what we're doing, since it uses WebRTC rather than WebSockets like voicechat2 (HF Transformers, Ollama)
A console-based local client (HF Transformers, Ollama, Coqui TTS, PortAudio)
This is a very responsive console-based local-client app that also has VAD and interruption support, plus a really clever hook! (whisper.cpp, llama.cpp, piper, espeak)
Another console-based local client, more of a proof of concept, but with a blog writeup.
- https://github.com/vndee/local-talking-llm
- https://blog.duy.dev/build-your-own-voice-assistant-and-run-it-locally/
- MIT
Another console-based local client (FastConformer, HF Transformers, StyleTTS2, espeak)
KoljaB has a number of interesting projects around console-based local clients like RealtimeSTT, RealtimeTTS, Linguflex, etc. (faster_whisper, llama.cpp, Coqui XTTS)
- https://github.com/KoljaB/LocalAIVoiceChat
- NC (Coqui Model License)
This is not a local voicechat client, but it does have a neat WebRTC front-end, so it might be worth poking around in (Vite/React, Tailwind, Radix)