funaudiollm-app repo

Welcome to the funaudiollm-app repository! This project hosts two applications built on advanced audio understanding and speech generation models:

Voice Chat: This application provides an interactive, natural voice chat experience, making it easier to adopt AI-driven dialogue in a variety of settings.

Voice Translation: Break down language barriers with real-time voice translation. This application translates speech on the fly, so that speakers of different languages can communicate fluidly.

For details, visit the FunAudioLLM Homepage, the CosyVoice Paper, and the FunAudioLLM Technical Report.

For CosyVoice, visit the CosyVoice repo and the CosyVoice space.

For SenseVoice, visit the SenseVoice repo and the SenseVoice space.

Install

Clone and install

  • Clone the repo and submodules
git clone --recursive URL
# If cloning the submodules fails because of network errors, rerun the following commands until they succeed
cd funaudiollm-app
git submodule update --init --recursive
  • Prepare the environments in the submodules by following the instructions in the CosyVoice and SenseVoice repos. If you have already set up these resources elsewhere, you can instead edit the resource path configuration in app.py (lines 15-18) to point to them.

  • Install the Python dependencies:

pip install -r requirements.txt
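
The clone step above asks you to rerun the submodule update until it succeeds. A minimal sketch of automating that retry is shown here; the helper name `run_until_success`, the attempt limit, and the delay are illustrative assumptions, not part of this repository.

```python
import subprocess
import time
from typing import Callable, Sequence


def _exit_code(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def run_until_success(
    cmd: Sequence[str],
    max_attempts: int = 5,          # assumed limit, adjust as needed
    delay_seconds: float = 2.0,     # assumed pause between attempts
    runner: Callable[[Sequence[str]], int] = _exit_code,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Rerun cmd until it exits with status 0.

    Returns the number of attempts used, or raises RuntimeError
    once max_attempts have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(cmd) == 0:
            return attempt
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"{' '.join(cmd)} failed {max_attempts} times")
```

From the funaudiollm-app directory, calling `run_until_success(["git", "submodule", "update", "--init", "--recursive"])` would repeat the update command from the instructions above. The `runner` and `sleep` parameters exist only so the retry logic can be exercised without network access.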

Basic Usage

Prepare

A DashScope API token.

A PEM file.
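
Before launching either app, it can help to confirm that both prerequisites are in place. The sketch below is an assumption-laden illustration: the function name `missing_prerequisites` is hypothetical, and because this README does not say where the PEM file should live, its path is passed in by the caller rather than guessed.

```python
from pathlib import Path
from typing import List, Mapping

# Placeholder value used in the launch commands of this README.
PLACEHOLDER_TOKEN = "YOUR-DS-API-TOKEN"


def missing_prerequisites(env: Mapping[str, str], pem_path: str) -> List[str]:
    """Return a list of human-readable problems; empty means ready to launch."""
    problems = []
    token = env.get("DS_API_TOKEN", "")
    if not token or token == PLACEHOLDER_TOKEN:
        problems.append("DS_API_TOKEN is unset or still the placeholder")
    # pem_path is supplied by the caller; its expected location is not
    # documented here, so no default is assumed.
    if not Path(pem_path).is_file():
        problems.append(f"PEM file not found: {pem_path}")
    return problems
```

A typical call would pass `os.environ` and the path to your own PEM file, then print any returned problems before starting app.py.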

Voice chat

cd voice_chat
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt

Then open https://YOUR-IP-ADDRESS:60001/ in a browser.

Voice translation

cd voice_translation
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt

Then open https://YOUR-IP-ADDRESS:60002/ in a browser.
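
The two launch commands differ only in the directory and the port. As a rough sketch, the same launch can be scripted from the repository root. The helper names `launch_env`, `app_url`, and `launch` are hypothetical; the directories, ports, and environment variables are taken from the commands above. Unlike those commands, this sketch does not use sudo.

```python
import os
import subprocess
from typing import Dict, Mapping

# App directory -> port, as listed in the usage instructions.
APPS = {"voice_chat": 60001, "voice_translation": 60002}


def launch_env(base_env: Mapping[str, str], token: str, gpu_ids: str = "0") -> Dict[str, str]:
    """Copy base_env and set the two variables the launch commands pass."""
    env = dict(base_env)
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    env["DS_API_TOKEN"] = token
    return env


def app_url(app: str, host: str) -> str:
    """Build the HTTPS URL at which the given app is reached."""
    return f"https://{host}:{APPS[app]}/"


def launch(app: str, token: str, gpu_ids: str = "0") -> subprocess.Popen:
    """Start app.py inside the app directory, appending stdout to its log.txt."""
    with open(os.path.join(app, "log.txt"), "a") as log:
        # The child process inherits the log file descriptor, so the
        # parent can close its own copy once the process has started.
        return subprocess.Popen(
            ["python", "app.py"],
            cwd=app,
            env=launch_env(os.environ, token, gpu_ids),
            stdout=log,
        )
```

For example, `launch("voice_chat", token)` followed by `print(app_url("voice_chat", "YOUR-IP-ADDRESS"))` would start the chat app and print the address to open.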