-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unittests for Universal Assisted generation #8
Conversation
Proposing to include some unittests to ensure functionality. I am, however, encountering some errors in these dummy examples. Any inputs into what this test might be missing? The following is the error I am seeing -
|
Thanks @gauravjain14 |
It seems the drafter sampled a token that the translator does not include. Perhaps that token is not in the target vocabulary? |
@gauravjain14 |
3d7a709
to
e18a060
Compare
Rebased on @jmamou's changes in #7. Removed some tests that seemed unnecessary. All tests pass. I have disabled this test for now -
Let me know what y'all think about it. If we should have it I can enable it. Disabled due to the context length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gauravjain14, thank you, it looks good! I added minor comments.
@classmethod | ||
def setUpClass(cls): | ||
# Setup main and assistant models | ||
cls.main_model = AutoModelForCausalLM.from_pretrained( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it takes about 33 seconds on a T4 machine. I think we can just add the tag @slow
as mentioned in the comment. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using smaller models? There are a few examples of fast models used in existing Hugging Face tests.
@slow
tests run less frequently, so I suggest striving for faster tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gauravj14 @gauravjain14
Models for testing: https://huggingface.co/hf-internal-testing. For example, hf-internal-testing/tiny-random-gpt2
as used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt about moving the content of this file to tests/generation/test_candidate_generator.py
?
Here's a quick update - I am running the test 'test_mismatched_vocabularies
The test -
|
According to the target tokenizer, |
Got it. Thanks for the quick response on that. So, how do you propose we handle this for now? Should we skip special tokens in the target vocab or you think this will be a quick fix? |
Let's skip special tokens for now. Note that UAG does not handle that case also. |
8e3d814
to
1a7f420
Compare
1a7f420
to
2652490
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I only added two small comments.
…ate_generator.py`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gauravjain14, Please see the minor comment below.
import threading | ||
import unittest | ||
import weakref | ||
from unittest.mock import MagicMock | ||
|
||
from zmq import device |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gauravjain14 please remove this line
Introduce Unit tests for Universal Assisted Generation
tests/test_universal_assisted_generation.py
is intended to test the functionality introduced by universal assisted generation.Note: All but
test_basic_generation
have been disabled for now.Who can review?
@keyboardAnt @jmamou