Universal Speculative Decoding CandidateGenerator
#35029
Open
keyboardAnt wants to merge 72 commits into huggingface:main from keyboardAnt:usd
+581
−16
aa7e01a
move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new t…
keyboardAnt f6b7f20
refactor
keyboardAnt 0ded37c
NOTHING. add space to rerun github actions tests
keyboardAnt d48b69b
remove it...
keyboardAnt b47e33a
`UniversalSpeculativeDecodingGenerator`
keyboardAnt 8a99129
Use `UniversalSpeculativeDecodingGenerator` when `generation_config.d…
keyboardAnt 4649bd2
assistant tokenizes only the target's new suffix
keyboardAnt f199c94
formatting
keyboardAnt 19c0057
fix code
jmamou acf5a4b
fix code
jmamou 3712117
formatting
keyboardAnt 63f2f46
add `TestGenerateWithDifferentModels`
keyboardAnt 6ac33f1
`TestGenerateWithDifferentModels` parameterize on `do_sample`
keyboardAnt 6938311
`AssistantVocabMapping` & `AssistantVocabMappingCache`
keyboardAnt 5a0db3b
formatting
keyboardAnt 92f8ad3
`AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_l…
keyboardAnt 7c8708e
improve `_get_assistant_to_target_input_ids` & formatting
keyboardAnt 880d0ae
renaming
keyboardAnt d9b5e74
WIP: debugging `min_new_tokens`
keyboardAnt 25974d5
fix get_target_ids
jmamou b8636ab
`UniversalSpeculativeDecodingGenerator`
keyboardAnt 1ef46b7
assistant tokenizes only the target's new suffix
keyboardAnt f8e94eb
formatting
keyboardAnt 439db84
fix code
jmamou 643901d
fix code
jmamou 77097ff
formatting
keyboardAnt d08b4f0
`TestGenerateWithDifferentModels` parameterize on `do_sample`
keyboardAnt f242dc1
`AssistantVocabMapping` & `AssistantVocabMappingCache`
keyboardAnt ede1176
formatting
keyboardAnt 511ee96
`AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_l…
keyboardAnt 5e47945
improve `_get_assistant_to_target_input_ids` & formatting
keyboardAnt 25a4349
renaming
keyboardAnt 95fe744
WIP: debugging `min_new_tokens`
keyboardAnt 0ad88b2
fix get_target_ids
jmamou bc5fa61
fix device issue
jmamou 41a5670
fix get_assistant_input_ids
jmamou 44f7ba7
add `TestAssistedCandidateGeneratorDifferentTokenizers`
keyboardAnt 57aafcc
formatting
keyboardAnt 6f95c33
`AssistantVocabTranslatorCache` refactor & tests
keyboardAnt 078f763
revert changes in `src/transformers/generation/logits_process.py`
keyboardAnt faac2fc
refactor `AssistedCandidateGenerator`
keyboardAnt 76a2dd3
refactor `AssistedCandidateGeneratorDifferentTokenizers`
keyboardAnt 43e96e7
formatting
keyboardAnt e63cb9d
refactor `UniversalSpeculativeDecodingGenerator`
keyboardAnt 8aa6020
fix negative value for max_new_tokens
jmamou 2169973
fix generation length target + attention_mask vs. assistant + attent
jmamou c6da827
fix device
jmamou 2cf9e8e
fix negative max_new_tokens bug
jmamou a1c0d05
fix UAG
jmamou d830091
minor
jmamou 19d0cce
formatting
keyboardAnt 5b8217d
`AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init
keyboardAnt 9b0126a
resolve conflict & formatting
keyboardAnt 578d0b3
rerun CI tests
keyboardAnt 7db2695
remove space...
keyboardAnt fb69900
remove old code
keyboardAnt e40c775
fix candidate_input_ids device
jmamou b5ce873
minor
jmamou bfccdea
Merge pull request #4 from keyboardAnt/fix_device
keyboardAnt d34d7ea
formatting
keyboardAnt 9d4d9f9
Fix prepare + apply (#7)
jmamou 4e92e9c
Add unittests for Universal Assisted generation
gauravj14 3fe2d31
Merge branch 'main' into usd
jmamou a350b1c
fix style
jmamou e047adf
update tests
jmamou 011f595
Remove unused import and fix `test_speculation_depth` test
gauravjain14 2652490
exclude special and reserved tokens from tokenizer for UAG
gauravjain14 701edbb
mv `test_universal_assisted_generation.py` to `generation/test_candid…
gauravjain14 7088978
Merge pull request #8 from keyboardAnt/unit_tests_usd
gauravjain14 3b89341
Remove unused imports and fix style using `make style` (#9)
gauravjain14 e43dba8
formatting
keyboardAnt a529795
Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)
gauravjain14
Hmm, converting only the new tokens and concatenating them with the old assistant ids here means that the total assistant ids might sometimes not be the actual tokenization of the input text, doesn't it? We are hitting token boundaries, so we may see some discrepancies. I see that in UAG we have a small window that shifts the target ids before re-encoding them.
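To make the token-boundary concern concrete, here is a minimal sketch with a toy greedy longest-match tokenizer. Everything in it (`greedy_tokenize`, `ASSISTANT_VOCAB`, the example strings) is hypothetical and is not code from this PR; it only shows how tokenizing a new suffix on its own can differ from tokenizing the full text.

```python
def greedy_tokenize(text, vocab):
    """Toy tokenizer (assumption, not a real tokenizer): at each position,
    take the longest vocab entry that matches."""
    tokens, i = [], 0
    max_len = max(len(t) for t in vocab)
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return tokens


# Hypothetical assistant vocabulary containing a merged token "ab".
ASSISTANT_VOCAB = {"a", "b", "ab", "c"}

prefix, new_suffix = "ca", "b"

# Tokenize the old prefix and the new suffix separately, then concatenate.
incremental = greedy_tokenize(prefix, ASSISTANT_VOCAB) + greedy_tokenize(new_suffix, ASSISTANT_VOCAB)

# Tokenize the whole text at once.
full = greedy_tokenize(prefix + new_suffix, ASSISTANT_VOCAB)

print(incremental)  # ['c', 'a', 'b']
print(full)         # ['c', 'ab']
```

In this toy case the merge `ab` spans the boundary between the old text and the new suffix, so the concatenated ids are not the canonical tokenization of the full string.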
@zucchini-nlp
Unlike UAG, USD has no discrepancies, as all tokens validated by the target are guaranteed to be present in the draft vocab.
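As a rough illustration of the reply above, the sketch below maps draft token ids to target token ids through shared token strings. It is an assumption-laden toy, not the PR's `AssistantToTargetTranslator`: the function name, the `-1` sentinel for unmapped tokens, and both vocabularies are made up for the example.

```python
def build_assistant_to_target_map(assistant_vocab, target_vocab):
    """Map each assistant token id to the target id of the same token string.

    Assistant tokens that do not exist in the target vocab map to -1
    (a hypothetical sentinel chosen for this sketch).
    """
    return {
        a_id: target_vocab.get(token, -1)
        for token, a_id in assistant_vocab.items()
    }


# Hypothetical vocabularies: token string -> token id.
target_vocab = {"hello": 0, "world": 1, "!": 2}
assistant_vocab = {"world": 0, "hello": 1, "??": 2}

mapping = build_assistant_to_target_map(assistant_vocab, target_vocab)

# Draft tokens that exist in both vocabularies translate one-to-one,
# with no re-encoding of text and hence no token-boundary effects.
draft_ids = [1, 0]  # "hello", "world" in the assistant vocab
target_ids = [mapping[i] for i in draft_ids]

print(mapping)     # {0: 1, 1: 0, 2: -1}
print(target_ids)  # [0, 1]
```

Under these assumptions, translation happens at the token level rather than through text, which is why no boundary discrepancy arises for tokens shared by both vocabularies.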