
tflite models - segmentation fault #12

Open
cogmeta opened this issue Dec 17, 2022 · 17 comments

@cogmeta

cogmeta commented Dec 17, 2022

I tried using tflite models built with the notebook https://github.com/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb.

But I am getting a segmentation fault when trying them with the stream example. Any ideas why that might be happening?

@cogmeta
Author

cogmeta commented Dec 17, 2022

n_vocab:50257
audio_sdl_init: found 1 capture devices:
audio_sdl_init: - Capture device #0: 'MacBook Pro Microphone'
audio_sdl_init: attempt to open capture device 0 : 'MacBook Pro Microphone' ...
audio_sdl_init: obtained spec for input device (SDL Id = 2):
audio_sdl_init: - sample rate: 16000
audio_sdl_init: - format: 33056 (required: 33056)
audio_sdl_init: - channels: 1 (required: 1)
audio_sdl_init: - samples per frame: 1024
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Segmentation fault: 11

@nyadla-sys
Contributor

Could you please follow the steps outlined here:
https://github.com/usefulsensors/openai-whisper/tree/main/stream

Sorry, I don't have a MacBook Pro to verify this myself.

@cogmeta
Author

cogmeta commented Dec 17, 2022

Oh, it works perfectly well with ../models/whisper.tflite, but not with the models created by the notebook. I wanted to try creating a tflite model from the medium model.

@nyadla-sys
Contributor

Use the following Colab notebook, but swap out openai/whisper-tiny for openai/whisper-medium:
https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb
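
For reference, here is a minimal sketch of that conversion step, assuming the notebook follows the usual pattern of wrapping Hugging Face Transformers' TFWhisperForConditionalGeneration.generate() in a traced function and converting with tf.lite.TFLiteConverter; the wrapper class, output paths, and the token cap below are illustrative, not the notebook's exact cells:

```python
import tensorflow as tf
from transformers import TFWhisperForConditionalGeneration

# Load the medium checkpoint instead of openai/whisper-tiny.
model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")

class GenerateModel(tf.Module):
    """Wraps generate() in a fixed-signature tf.function so TFLite can trace it."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    # Whisper expects an 80-bin, 3000-frame log-mel spectrogram.
    @tf.function(input_signature=[tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features")])
    def serving(self, input_features):
        # Token cap baked into the exported graph; see the max_tokens discussion later in this thread.
        outputs = self.model.generate(input_features, max_new_tokens=448)
        return {"sequences": outputs}

saved_dir = "whisper_medium_saved"  # hypothetical output directory
generate_model = GenerateModel(model)
tf.saved_model.save(generate_model, saved_dir,
                    signatures={"serving_default": generate_model.serving})

converter = tf.lite.TFLiteConverter.from_saved_model(saved_dir)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,   # TFLite builtin ops
    tf.lite.OpsSet.SELECT_TF_OPS,     # fall back to TF ops the converter can't lower
]
with open("whisper-medium.tflite", "wb") as f:
    f.write(converter.convert())
```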

@cogmeta
Author

cogmeta commented Dec 17, 2022

I followed the exact same steps. Trying to use the model results in a segmentation fault even on an Ubuntu machine, so it does not look like a Mac issue.

@nyadla-sys
Contributor

I may need to construct a new multilingual vocab bin, but it may be worthwhile to test the model below with the existing vocab bin.
Use the following GitHub repository, but replace openai/whisper-tiny with openai/whisper-medium.en.

@nyadla-sys
Contributor

Before building the stream example, replace ~/openai-whisper/stream/filters_vocab_gen.h with filters_vocab_multilingual.h:
cp ~/openai-whisper/models/filters_vocab_multilingual.h ~/openai-whisper/stream/filters_vocab_gen.h
Then follow the build steps and run the command below to use the whisper-medium.tflite model with the stream example:
./stream ../models/whisper-medium.tflite

@nyadla-sys
Contributor

Use the whisper-medium.tflite that you generated.

@nyadla-sys
Contributor

I am not able to upload the model as it is around 700 MB in size.

@nyadla-sys
Contributor

I could add whisper-medium.tflite

@cogmeta
Author

cogmeta commented Dec 17, 2022

Thanks! Will try it out.

@cogmeta
Author

cogmeta commented Dec 17, 2022

./stream ../models/whisper-medium.tflite

n_vocab:50257
audio_sdl_init: found 1 capture devices:
audio_sdl_init: - Capture device #0: 'MacBook Pro Microphone'
audio_sdl_init: attempt to open capture device 0 : 'MacBook Pro Microphone' ...
audio_sdl_init: obtained spec for input device (SDL Id = 2):
audio_sdl_init: - sample rate: 16000
audio_sdl_init: - format: 33056 (required: 33056)
audio_sdl_init: - channels: 1 (required: 1)
audio_sdl_init: - samples per frame: 1024
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: gather index out of bounds
ERROR: Node number 35 (GATHER) failed to invoke.
ERROR: Node number 8319 (WHILE) failed to invoke.
Error at /Users/prashantsasatte/openai-whisper/tensorflow_src/tensorflow/lite/examples/stream/stream.cc:366

@nyadla-sys
Contributor

Can you try with the minimal build, replacing filters_vocab_gen.bin with filters_vocab_multilingual.bin and renaming it to filters_vocab_gen.bin?

@nyadla-sys
Contributor

I think it is producing more tokens than the 223 I restricted it to during model generation. Please change max_tokens to 384 and generate the model again.
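
For illustration only, assuming the export follows the pattern sketched earlier in this thread, the cap enters through the traced generate() call, so the .tflite file has to be regenerated after raising it:

```python
import tensorflow as tf
from transformers import TFWhisperForConditionalGeneration

model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")

# The decoding cap is baked into the traced graph at export time, so the
# model must be re-exported and re-converted after raising it (223 -> 384 here).
@tf.function(input_signature=[tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features")])
def serving(input_features):
    return {"sequences": model.generate(input_features, max_new_tokens=384)}
```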

@cogmeta
Author

cogmeta commented Dec 18, 2022

Tried with the minimal build; same error... I have not yet looked into the code, but will do that. Thanks.

I am actually looking to implement a streaming gRPC server.

@nyadla-sys
Contributor

I have regenerated the whisper-medium model with a 448-token limit in an attempt to solve your issue.

@nyadla-sys
Contributor

Please try the latest whisper-medium model.
