Multilingual model with Spanish #19
Comments
@AlejandroLanaspa Could you please try the two models below in the Android app? These are multilingual models. |
Below is the result with the above models using the minimal example:
n_vocab:50257 mel.n_len3000 mel.n_mel:80
[_extra_token_50258][_extra_token_50261][_extra_token_50359][BEG] Für mich sind alle Menschen gleich unabhängig von Geschlecht, sexuelle Orientierung, Religion, Hautfarbe oder Geo-Kordinaten der Geburt.[SOT]
mycroft@OpenVoiceOS-e3830c:~/whisper $ minimal models/whisper-base.tflite de_speech_thorsten_sample03_8s.wav
n_vocab:50257 mel.n_len3000 mel.n_mel:80
[_extra_token_50258][_extra_token_50261][_extra_token_50358][BEG] For me, all people are equally independent of gender, sex, orientation, religion, hate, or gender coordinates of birth.[SOT]
mycroft@OpenVoiceOS-e3830c:~/whisper $ minimal models/whisper-small.tflite de_speech_thorsten_sample03_8s.wav
n_vocab:50257 mel.n_len3000 mel.n_mel:80
[_extra_token_50258][_extra_token_50261][_extra_token_50359][BEG] Für mich sind alle Menschen gleich, unabhängig von Geschlecht, sexueller Orientierung, Religion, Hautfarbe oder Geo-Koordinaten der Geburt.[SOT] |
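A note on reading the output above: if the `_extra_token_N` labels are token IDs of the original multilingual Whisper tokenizer (an assumption, and one that holds only for releases before large-v3), the prefix of each run can be decoded as in the sketch below. The table and the `describe_prefix` helper are illustrative and are not part of the `minimal` example.

```python
# Special-token IDs of the original multilingual Whisper tokenizer.
# Assumption: the _extra_token_N labels in the log are these token IDs.
SPECIAL_TOKENS = {
    50257: "<|endoftext|>",
    50258: "<|startoftranscript|>",
    50259: "<|en|>",
    50260: "<|zh|>",
    50261: "<|de|>",
    50262: "<|es|>",
    50358: "<|translate|>",
    50359: "<|transcribe|>",
    50363: "<|notimestamps|>",
}


def describe_prefix(token_ids):
    """Map decoder prefix token IDs to readable names; unknown IDs stay numeric."""
    return [SPECIAL_TOKENS.get(t, f"<unknown:{t}>") for t in token_ids]


# Prefixes copied from the whisper-base and whisper-small runs in the log.
base_run = describe_prefix([50258, 50261, 50358])
small_run = describe_prefix([50258, 50261, 50359])
print(base_run)
print(small_run)
```

Under that assumption, both runs detect German (`<|de|>`), but the whisper-base run carries the translate task token while the whisper-small run carries the transcribe token, which would be consistent with the base output being English and the small output being German.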
Please make sure to use https://github.com/usefulsensors/openai-whisper/blob/main/models/filters_vocab_multilingual.h instead of the English vocab binary. |
Thanks for the quick response. Any ideas? |
Add something like the below to native_lib.cpp in the Android app as well.
Also, please replace filters_vocab_gen.bin with https://github.com/usefulsensors/openai-whisper/blob/main/models/filters_vocab_multilingual.bin
|
We tested with German and it is working. I will try other languages and let you know. |
Thanks for the trick.
As for filters_vocab_gen.bin, I was already replacing it with filters_vocab_multilingual.bin (renamed to filters_vocab_gen.bin). It still does not seem to recognize me speaking in Spanish :/ |
@AlejandroLanaspa Could you please share a Spanish sample? I will test it and upload a new tflite model that supports Spanish. |
Here is a sample https://datasets-server.huggingface.co/assets/common_voice/--/es/train/99/audio/audio.mp3 Others accessible here https://huggingface.co/datasets/common_voice/viewer/es/train Thank you very much, and also for the rest of your work, awesome materials!!! |
I created two tflite models, one for the encoder and one for the decoder, and together they support multiple languages. |
I have been trying to follow https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb
to generate a multilingual model that I can use in the Android app with Spanish detection.
However, when doing so, I kept getting the error "TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'", which I solved by adding forced_decoder_ids to the model. This works in the notebook; however, when I try to use the model in the Android app, I keep getting the following error message:
I generated the tflite model by changing the following command in the notebook:
What am I doing wrong?
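For reference, here is a minimal sketch of what forced_decoder_ids for Spanish transcription might look like, assuming the notebook uses the Hugging Face transformers convention of (position, token_id) pairs and the original multilingual Whisper token IDs (pre large-v3). The constants and the `forced_decoder_ids` helper are illustrative; in transformers, `WhisperProcessor.get_decoder_prompt_ids(language=..., task=...)` should produce the equivalent list.

```python
# First entries of Whisper's language list in tokenizer order.
# Assumption: truncated here for brevity; the real list is much longer.
LANGUAGES = ["en", "zh", "de", "es"]

FIRST_LANGUAGE_TOKEN = 50259  # <|en|>; later languages follow consecutively
TASK_TOKENS = {"translate": 50358, "transcribe": 50359}
NO_TIMESTAMPS_TOKEN = 50363


def forced_decoder_ids(language, task="transcribe"):
    """Build (position, token_id) pairs that pin the language and task.

    Position 0 holds the start-of-transcript token, so forcing starts at 1.
    """
    language_token = FIRST_LANGUAGE_TOKEN + LANGUAGES.index(language)
    return [
        (1, language_token),
        (2, TASK_TOKENS[task]),
        (3, NO_TIMESTAMPS_TOKEN),
    ]


spanish_ids = forced_decoder_ids("es")
print(spanish_ids)
```

Forcing the language token this way fixes the model to one language at conversion time, so a model exported with these IDs would transcribe Spanish rather than detect the language on its own.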