An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".

francislata/unicats

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Overview

This project is an unofficial implementation of the UniCATS paper.

Because the official implementation of the CTX-vec2wav model has been released, this repository uses the same setup. This keeps the project consistent and compatible with future updates.

Note: Please refer to the official implementations of CTX-text2vec and CTX-vec2wav.

Setup

To get started, run the following from the repository's root directory:

pip install -e .

Dataset

This project uses the LibriTTS dataset at a 24 kHz sampling rate. To follow the same dataset splits as in the paper, please follow the steps in this guide.
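Before training, it can be worth confirming that the downloaded audio really is at 24 kHz. The following is a minimal sketch, not part of this repository: the function name `find_mismatched_wavs` and the directory layout are hypothetical, and it assumes the files are uncompressed PCM WAV files that Python's standard `wave` module can read.

```python
import wave
from pathlib import Path

# Sampling rate stated in the Dataset section above.
EXPECTED_SAMPLE_RATE = 24_000


def find_mismatched_wavs(root, expected_rate=EXPECTED_SAMPLE_RATE):
    """Return (path, rate) pairs for WAV files under `root` whose
    sampling rate differs from `expected_rate`.

    Hypothetical helper for illustration; assumes PCM WAV input.
    """
    mismatched = []
    for path in sorted(Path(root).rglob("*.wav")):
        # Only the header is inspected; no audio frames are decoded.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatched.append((path, rate))
    return mismatched
```

Calling `find_mismatched_wavs("path/to/LibriTTS")` (the path is a placeholder) returns an empty list when every file matches the expected rate.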

Credits

Citation

@article{du2023unicats,
  title={UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding},
  author={Du, Chenpeng and Guo, Yiwei and Shen, Feiyu and Liu, Zhijun and Liang, Zheng and Chen, Xie and Wang, Shuai and Zhang, Hui and Yu, Kai},
  journal={arXiv preprint arXiv:2306.07547},
  year={2023}
}
