Intermediate Speech Representations for LibriSpeech

SarinaMeyer released this 14 Mar 14:11

c51e64a

This release contains the intermediate representations produced by the pipeline for the LibriSpeech train-clean-360, dev, and test data of the VoicePrivacy Challenge (VPC) 2024: linguistic content (phonetic transcriptions), prosody (pitch, energy, duration), and speaker embeddings (GST, trained jointly with the TTS model). Using these precomputed representations instead of computing them from scratch significantly reduces the run time of the pipeline.
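The idea of reusing precomputed representations can be illustrated with a generic load-or-compute helper. This is only a minimal sketch, not the pipeline's actual loading code: the function name `load_or_compute`, the use of `pickle`, and the file path in the usage note are all hypothetical, and the real file layout and format of the release assets are not described here.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_compute(cache_file: Path, compute: Callable[[], Any]) -> Any:
    """Return the stored object if cache_file exists; otherwise compute and store it.

    Hypothetical helper for illustration; pickle is an assumed storage format.
    """
    if cache_file.exists():
        # Precomputed representation found: skip the expensive extraction step.
        with cache_file.open("rb") as f:
            return pickle.load(f)

    # Nothing stored yet: compute from scratch and save for later runs.
    result = compute()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result
```

For example, `load_or_compute(Path("cache/prosody/utt1.pkl"), extract_prosody)` would call the (hypothetical) `extract_prosody` function only when no stored file is present. Note that `pickle` should only be used to load files from a trusted source.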

