This release contains all models for the latest version of our pipeline, which supports generating artificial speaker embeddings with a GAN, prosody cloning, and prosody modification using offsets.
Place the unzipped folders in a models directory located directly under the repository root. The resulting structure should look as follows:
speaker-anonymization
└─ models
   └─ anonymization
      └─ gan_style-embed
         └─ settings.json
         └─ style-embed_wgan.pt
   └─ asr
      └─ asr_branchformer_tts-phn_en.zip
   └─ tts
      └─ Aligner
         └─ aligner.pt
      └─ Embedding
         └─ embedding_function.pt
      └─ FastSpeech2_Multi
         └─ prosody_cloning.pt
      └─ HiFiGAN_combined
         └─ best.pt
Note: Do not unzip the ASR models; keep them as zip archives. They will be unzipped at runtime.
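
To confirm that the files ended up in the right place, a small check like the following may help. It is only a sketch and not part of the pipeline: the function name find_missing_models and the EXPECTED_FILES list are hypothetical, and the paths are copied from the structure listed above, assuming the script is run from the speaker-anonymization root.

```python
from pathlib import Path

# Expected files relative to the repository root, taken from the layout above.
# The ASR model is expected as a zip archive, not as an unzipped folder.
EXPECTED_FILES = [
    "models/anonymization/gan_style-embed/settings.json",
    "models/anonymization/gan_style-embed/style-embed_wgan.pt",
    "models/asr/asr_branchformer_tts-phn_en.zip",
    "models/tts/Aligner/aligner.pt",
    "models/tts/Embedding/embedding_function.pt",
    "models/tts/FastSpeech2_Multi/prosody_cloning.pt",
    "models/tts/HiFiGAN_combined/best.pt",
]


def find_missing_models(root):
    """Return the expected model files that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    missing = find_missing_models(".")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected model files are in place.")
```

An empty result means every file from the listing was found. A missing entry under models/asr usually means the archive was unzipped or renamed.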