PaddleSpeech CLI Model List

zxcd edited this page Jan 10, 2023 · 8 revisions

ASR

| Model | Code-Switch | Language | Sample Rate |
| --- | --- | --- | --- |
| conformer_wenetspeech | False | zh | 16k |
| conformer_online_wenetspeech | False | zh | 16k |
| conformer_u2pp_online_wenetspeech | False | zh | 16k |
| conformer_online_multicn | False | zh | 16k |
| conformer_aishell | False | zh | 16k |
| conformer_online_aishell | False | zh | 16k |
| transformer_librispeech | False | en | 16k |
| deepspeech2online_wenetspeech | False | zh | 16k |
| deepspeech2offline_aishell | False | zh | 16k |
| deepspeech2online_aishell | False | zh | 16k |
| deepspeech2offline_librispeech | False | en | 16k |
| conformer_talcs | True | zh_en | 16k |
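
The Language and Sample Rate columns above correspond to options of the `paddlespeech asr` command. The following is a minimal sketch, not part of PaddleSpeech: `ASR_MODELS` and `build_asr_command` are hypothetical names, the dictionary holds only a subset of the table, and the flag names `--model`, `--lang`, `--sample_rate`, and `--input` are assumptions to be checked against `paddlespeech asr --help` for the installed version. It also assumes that 16k means 16000 Hz.

```python
# Subset of the ASR table above: model name -> (language, sample rate in Hz).
# Assumption: "16k" in the table means 16000 Hz.
ASR_MODELS = {
    "conformer_wenetspeech": ("zh", 16000),
    "transformer_librispeech": ("en", 16000),
    "deepspeech2offline_aishell": ("zh", 16000),
}


def build_asr_command(model, wav_path):
    """Return an argv list for `paddlespeech asr` (flag names are assumed)."""
    if model not in ASR_MODELS:
        raise KeyError(f"unknown ASR model: {model}")
    lang, sample_rate = ASR_MODELS[model]
    return [
        "paddlespeech", "asr",
        "--model", model,
        "--lang", lang,
        "--sample_rate", str(sample_rate),
        "--input", wav_path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no installation is needed.
    print(" ".join(build_asr_command("transformer_librispeech", "audio.wav")))
```

The list can be passed to `subprocess.run` once the flags have been confirmed; returning a list rather than a string avoids shell-quoting issues with file paths.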

TTS

| Task | Model | Dataset | Lang | Speaker |
| --- | --- | --- | --- | --- |
| AM | FastSpeech2 | CSMSC | zh | single, female |
| AM | FastSpeech2 | AISHELL3 | zh | multi-speaker |
| AM | FastSpeech2 | LJSpeech | en | single, female |
| AM | FastSpeech2 | VCTK | en | multi-speaker |
| AM | FastSpeech2-cnndecoder | CSMSC | zh | single, female |
| AM | FastSpeech2-mix | - | mix | - |
| AM | FastSpeech2-male | - | zh | - |
| AM | SpeedySpeech | CSMSC | zh | single, female |
| AM | Tacotron2 | CSMSC | zh | single, female |
| AM | Tacotron2 | LJSpeech | en | single, female |
| VOC | Parallel WaveGAN | CSMSC | zh | single, female |
| VOC | Parallel WaveGAN | AISHELL3 | zh | multi-speaker |
| VOC | Parallel WaveGAN | LJSpeech | en | single, female |
| VOC | Parallel WaveGAN | VCTK | en | multi-speaker |
| VOC | Parallel WaveGAN-male | - | zh | - |
| VOC | Multi Band MelGAN | CSMSC | zh | single, female |
| VOC | Style MelGAN | CSMSC | zh | single, female |
| VOC | HiFiGAN | CSMSC | zh | single, female |
| VOC | HiFiGAN | LJSpeech | en | single, female |
| VOC | HiFiGAN | AISHELL3 | zh | multi-speaker |
| VOC | HiFiGAN | VCTK | en | multi-speaker |
| VOC | WaveRNN | CSMSC | zh | single, female |

CLS

| Task | Model | Dataset | Sample Rate |
| --- | --- | --- | --- |
| CLS | panns_cnn6 | Audioset | 32k |
| CLS | panns_cnn10 | Audioset | 32k |
| CLS | panns_cnn14 | Audioset | 32k |
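
The CLS models list a 32k sample rate, while most other models on this page list 16k, so it can help to check a recording before passing it to a model. The following is a minimal sketch using only the Python standard library. `EXPECTED_RATE_HZ` and `wav_matches_rate` are hypothetical names, the mapping of 16k and 32k to 16000 Hz and 32000 Hz is an assumption, and the `wave` module reads only uncompressed PCM WAV files.

```python
import wave

# Assumption: the table labels "16k" and "32k" mean 16000 Hz and 32000 Hz.
EXPECTED_RATE_HZ = {"16k": 16000, "32k": 32000}


def wav_matches_rate(wav_path, rate_label):
    """Return True if the WAV file's sample rate equals the labelled rate."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate() == EXPECTED_RATE_HZ[rate_label]
```

For example, `wav_matches_rate("clip.wav", "32k")` would indicate whether a hypothetical file `clip.wav` already matches the rate listed for the panns models or needs resampling first.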

Speaker Verification

| Task | Model | Dataset | Sample Rate |
| --- | --- | --- | --- |
| Speaker Verification | ECAPA-TDNN | VoxCeleb | 16k |

ST

| Task | Model | Dataset |
| --- | --- | --- |
| en_to_zh | fat_st_ted | Ted-En-Zh |

Text

| Task | Model | Dataset | Lang |
| --- | --- | --- | --- |
| punc | ernie_linear_p7_wudao | IWSLT2012-Zh | zh |
| punc | ernie_linear_p3_wudao | IWSLT2012-Zh | zh |
| punc | ernie_linear_p3_wudao_fast | IWSLT2012-Zh | zh |

KWS

| Task | Model | Dataset | Lang |
| --- | --- | --- | --- |
| KWS | MDTC | HeySnips | en |

SSL

| Model | Language | Sample Rate |
| --- | --- | --- |
| wav2vec2 | en | 16k |
| wav2vec2ASR_librispeech | en | 16k |
| wav2vec2 | zh | 16k |
| wav2vec2ASR_aishell1 | zh | 16k |

Whisper

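| Model | Size | Multilingual | Language | Sample Rate |
| --- | --- | --- | --- | --- |
| whisper | large | multilingual | - | 16k |
| whisper | base | - | en | 16k |
| whisper | base | multilingual | - | 16k |
| whisper | medium | - | en | 16k |
| whisper | medium | multilingual | - | 16k |
| whisper | small | - | en | 16k |
| whisper | small | multilingual | - | 16k |
| whisper | tiny | - | en | 16k |
| whisper | tiny | multilingual | - | 16k |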