How to convert a custom Whisper model (OpenAI or HF format) to the TensorRT-based backend? #58
Comments
For some reference, I did refer to the TensorRT-LLM repo's whisper example, but upon loading the model from a path like this:
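(The original snippet is missing from the thread; a minimal sketch of what the load call would look like, assuming the standard WhisperS2T API, with a placeholder local path:)

```python
import whisper_s2t

# Attempting to load a locally built TRT model by pointing
# model_identifier at its directory instead of an official model name
# (the path is hypothetical).
model = whisper_s2t.load_model(
    model_identifier="/path/to/my_whisper_trt_model",
    backend="TensorRT-LLM",
)
```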
I get the following error:
@aleksandr-smechov hey, thanks for replying. Could you please let me know how to generate the extra files? The build step above just gives me the whisper model in TensorRT-LLM format. Is there a way in the WhisperS2T code to generate the trt_build_args JSON file?
I completely removed that requirement from the WhisperS2T code personally, but you can "fake" it by running WhisperS2T normally, finding the cached directory where these files are stored, and adjusting the JSON to your needs. Also remember to rename the encoder and decoder engines from the official example to the names WhisperS2T expects.
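A quick sketch of that "fake it" approach, assuming the standard WhisperS2T API (the cache location mentioned in the comment is an assumption and may differ per install):

```python
import whisper_s2t

# Run a stock model once so WhisperS2T downloads/builds the engines and
# writes trt_build_args.json into its cache directory.
model = whisper_s2t.load_model(model_identifier="large-v2", backend="TensorRT-LLM")

# The cached files (engines + trt_build_args.json) typically land under
# the user cache, e.g. somewhere in ~/.cache (exact layout may differ).
# Copy that JSON next to your custom engines and edit the paths inside it.
```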
@aleksandr-smechov, thanks a ton for your help. I will try out what you said.
Hey, I did as you said and used the cached JSON, but the problem is I get the following error:
despite the args being set. The following is the code explicitly setting the args:
The following is the trt_model_args.json file:
@aleksandr-smechov what could possibly be wrong that the following error is triggered, despite me explicitly adding the args and the args being present in the JSON file?
@StephennFernandes I believe I encountered the same issue before and overcame it by adding these args here. |
@aleksandr-smechov thanks for the heads-up, it really means a lot. I was able to fix this issue by refactoring in two places, but now the model is stuck and hangs with a new issue. 1. Editing the decoder_model_config in model.py and explicitly adding the two args:
2. I had to pull the tokenizer.json file from HF transformers into the dir where my TensorRT-LLM model files were saved. After editing all this, the model is now stuck/hangs, and the following are the terminal logs:
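A sketch of that tokenizer step, assuming the fine-tuned model is also hosted on the Hugging Face Hub (the repo id and target directory are placeholders):

```python
import shutil
from huggingface_hub import hf_hub_download

# Fetch tokenizer.json from the (hypothetical) HF repo of the fine-tuned
# model and drop it next to the TRT engine files.
tok_path = hf_hub_download(
    repo_id="your-username/whisper-finetuned",
    filename="tokenizer.json",
)
shutil.copy(tok_path, "/path/to/trt_model_dir/tokenizer.json")
```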
For some additional context, I am running all of this on an NVIDIA A6000, and I am using TensorRT version 9.2.0.5.
@aleksandr-smechov @shashikg could this be a version mismatch? I built the whisper model to TensorRT using TensorRT version 9.2.0.5, while WhisperS2T expects its own TRT version. I tried building my whisper model on the official WhisperS2T docker image, but I get the following error when building whisper to TRT format.
@StephennFernandes that's correct, you'd need to build the TRT model using the same version of TensorRT-LLM as WhisperS2T uses.
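One quick way to confirm the versions line up, assuming tensorrt_llm is importable in both environments:

```python
# Run this inside both the build environment and the WhisperS2T
# environment; the reported versions must match.
import tensorrt_llm

print(tensorrt_llm.__version__)
```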
I tried, but I am unable to build. I am facing the following error.
Does WhisperS2T have a different build structure/format? I mean, a totally different build script? The official build script for whisper from the TensorRT-LLM repo doesn't work; the following error is from that build script.
@aleksandr-smechov so I made a dir, placed the .pt model file, the HF tokenizer, and the trt_build_args.json file into it (editing the trt_build_args.json output_dir and model_path paths to point at the current output dir), and launched the script as above. But now the inference code still crashes with a new error.
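A small sketch of that JSON edit, assuming trt_build_args.json carries the model_path and output_dir keys mentioned above (the directory layout and checkpoint name are hypothetical):

```python
import json
from pathlib import Path

# Directory holding the .pt checkpoint, tokenizer.json, and trt_build_args.json.
model_dir = Path("/path/to/custom_whisper_trt")
args_file = model_dir / "trt_build_args.json"

args = json.loads(args_file.read_text())
args["model_path"] = str(model_dir / "model.pt")  # hypothetical checkpoint name
args["output_dir"] = str(model_dir)
args_file.write_text(json.dumps(args, indent=2))
```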
Now I have even built the model on the same official WhisperS2T docker image, so it doesn't seem like a TRT versioning issue. The following is the entire stack trace of the error:
I tried moving the TRT model files to the cache dir where "whisper-v3" is saved internally. Upon replacing them and running the code as if I were running the regular model, the inference works. But loading from a path doesn't.
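A sketch of that workaround, assuming the cached "whisper-v3" directory can be located by running the stock model once (the cache path below is an assumption, not a documented location):

```python
import shutil
from pathlib import Path

# Hypothetical locations: find the real cache dir by running the stock
# v3 model once and watching where WhisperS2T stores its files.
cache_dir = Path.home() / ".cache" / "whisper_s2t" / "models" / "trt" / "whisper-v3"
custom_dir = Path("/path/to/custom_whisper_trt")

# Overwrite the cached engines/tokenizer/JSON with the custom ones.
for f in custom_dir.iterdir():
    if f.is_file():
        shutil.copy(f, cache_dir / f.name)
```

After that, loading the model by its regular identifier should pick up the custom engines from the cache.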
Hi, I'm also very interested in how to integrate a custom fine-tuned Whisper into whisper_s2t with TensorRT-LLM. Thanks a lot.
> I tried moving the TRT model files to the cache dir where "whisper-v3" is saved internally. Upon replacing them and running the code as if I were running the regular model, the inference works. But loading from a path doesn't.

I can't clearly figure out what the issue could be here; it seems the issue only gets triggered when the model is loaded from a path.
Hi @StephennFernandes awesome to hear that it's working for you. As you mentioned, it might be a path issue. I did some major refactoring for my library so it didn't come up as an issue. |
Is it possible to update the TensorRT version to support newer models?
Running into this same issue trying to convert v3-turbo.
@eschmidbauer any luck trying to convert v3-turbo?
No. The TensorRT-LLM support in WhisperS2T appears to be pinned to an older version, so it would need updating first.
Hey @shashikg, great repo, and cheers to the insane effort that went into building it.
I have a fine-tuned whisper model (in both the original OpenAI and HF formats) that I want to use with the TensorRT backend via WhisperS2T. While I figured out how to load the official whisper models, I was wondering how I could convert custom whisper models to TensorRT and load them using WhisperS2T.
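For reference, a minimal sketch of the load call that works for official models (API per the WhisperS2T README); the open question is how to do the same for a fine-tune:

```python
import whisper_s2t

# Loading an official model with the TensorRT-LLM backend works out of
# the box.
model = whisper_s2t.load_model(model_identifier="large-v2", backend="TensorRT-LLM")
```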