
Read EOS token from model runtime information for speculative_decoding_lm #353

Merged: 22 commits into openvinotoolkit:master on Apr 11, 2024

Conversation

anzr299
Contributor

@anzr299 anzr299 commented Apr 9, 2024

Extension to issue #277: added the functionality to read the EOS token from the model runtime information in speculative_decoding_lm.
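The idea behind the change can be sketched as follows: instead of hardcoding the EOS token id, prefer the value stored in the runtime information exported alongside the converted tokenizer, and fall back to a default when the entry is absent. This is a minimal Python sketch, not the PR's actual C++ code; the `rt_info` dict, the `eos_token_id` key, and the fallback value are illustrative assumptions.

```python
# Hypothetical sketch of the PR's approach: read the EOS token id from the
# tokenizer's runtime-info map, falling back to a hardcoded default.

HARDCODED_EOS = 2  # previous behaviour: a fixed, model-specific EOS id


def read_eos_token_id(rt_info: dict, default: int = HARDCODED_EOS) -> int:
    """rt_info mimics the runtime-info map of the converted tokenizer model
    (the key name here is an assumption for illustration)."""
    value = rt_info.get("eos_token_id")
    if value is None:
        # Older exports without the entry keep the old hardcoded behaviour.
        return default
    return int(value)


# A tokenizer export that carries the entry wins over the default:
print(read_eos_token_id({"eos_token_id": "151643"}))
# An export without the entry falls back to the hardcoded id:
print(read_eos_token_id({}))
```

The fallback keeps the sample working with tokenizer models converted before the runtime-info entry was introduced, while newer conversions supply the correct id per model.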

anzr299 and others added 22 commits March 20, 2024 21:39
Made changes to accommodate the dynamic EOS token
Getting the rt_info from the tokenizer instead of the LLM
Made changes according to the review. This is the latest commit.
Fixed the error related to the tokenizer read_model
Added comments and organised the code better
Changed the Beam Search header to accept the EOS token as a parameter and removed the comment about the EOS token not being implemented
Added the EOS token to the parameters
@anzr299
Contributor Author

anzr299 commented Apr 9, 2024

@ilya-lavrenov this is the PR corresponding to the change we discussed.

@anzr299
Contributor Author

anzr299 commented Apr 9, 2024

@ilya-lavrenov The README does not contain proper instructions for running speculative_decoding_lm. In the example, it uses the "Llama-2-7b-chat-hf" model as the main model directory:
/build/Release/speculative_decoding_lm ./TinyLlama-1.1B-Chat-v1.0/pytorch/dldt/FP16/ ./Llama-2-7b-chat-hf/pytorch/dldt/FP16/ "Why is the Sun yellow?"
The previous steps do not mention installing this model, and it also requires extra permission to use (meta-llama/Llama-2-70b-chat-hf).

@ilya-lavrenov ilya-lavrenov merged commit e84defc into openvinotoolkit:master Apr 11, 2024
10 checks passed
@pavel-esir
Contributor

The previous steps do not mention installing this model, and it also requires extra permission to use (meta-llama/Llama-2-70b-chat-hf).

Thanks for noticing that. I will update the README in the following PRs.

@anzr299 anzr299 deleted the patch-2 branch January 8, 2025 09:05