Support for nanotron #11
dataset = Dataset.from_list(
    [{k: str(v) for k, v in asdict(detail).items()} for detail in task_details]
)
# try:
If you remove the high-level try/except, please add narrower try/except blocks to guard against the other possible failures.
Are we sure we want to silently catch errors, or should we rather let the run fail?
No, because we still want the results to be saved locally. That way we can upload them by hand instead of having to redo the whole eval.
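To make the trade-off in this reply concrete, here is a minimal sketch of the "save locally first, then try to upload" pattern. It is not the code from this PR: `save_then_upload`, `upload_fn`, and the JSON output format are hypothetical names and choices made for illustration, and the real upload step would be whatever the evaluation tracker uses.

```python
import json
import logging
from pathlib import Path
from typing import Callable

logger = logging.getLogger(__name__)


def save_then_upload(details: list[dict], out_path: Path, upload_fn: Callable[[Path], None]) -> bool:
    """Write results locally first, then attempt the upload.

    Returns True if the upload succeeded, False if it failed (the local file is kept either way).
    `upload_fn` is a hypothetical callable standing in for the real hub upload.
    """
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Stringify every value, mirroring the `str(v)` conversion in the diff above.
    rows = [{k: str(v) for k, v in d.items()} for d in details]
    out_path.write_text(json.dumps(rows, indent=2))

    try:
        upload_fn(out_path)
    except Exception:
        # Log the full traceback instead of failing the run:
        # the local file can still be uploaded by hand later.
        logger.exception("Upload failed; results are kept at %s", out_path)
        return False
    return True
```

Catching the exception only around the upload call, and logging it with a traceback, keeps the failure visible while still protecting the locally saved results, which addresses both concerns raised in this thread.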
class GenerativeTaskDatasetNanotron(DynamicBatchDataset):
    def __getitem__(self, index) -> Request:
Why do you need your own class? (Is it only to return the index with the item?)
Nathan's requirement
base_model does not use the index for each sample, which means we need to adapt the dataset for nanotron.
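As an illustration of what "returning the index with the item" can look like, here is a minimal sketch. `IndexedDataset` is a hypothetical stand-in, not the `GenerativeTaskDatasetNanotron` class from the diff, and the use case shown (restoring the original order after samples were processed in a different order) is an assumption; the thread does not state why nanotron needs the index.

```python
from typing import Any, Sequence, Tuple


class IndexedDataset:
    """Hypothetical wrapper: yields (index, item) so outputs can be mapped back to inputs."""

    def __init__(self, items: Sequence[Any]):
        self.items = items

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, index: int) -> Tuple[int, Any]:
        # Return the position alongside the sample itself.
        return index, self.items[index]


# Example: process samples longest-first (as a length-sorted batcher might),
# then use the carried index to restore the original order.
ds = IndexedDataset(["a long prompt", "short", "medium one"])
by_length = sorted((ds[i] for i in range(len(ds))), key=lambda pair: len(pair[1]), reverse=True)
outputs = [(idx, text.upper()) for idx, text in by_length]
restored = [out for _, out in sorted(outputs)]
```

Carrying the index with each sample means the consumer does not need to know how the samples were reordered or split; it only has to sort the outputs by index at the end.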
Yes, but I'm unsure why we need to grab the index for brr.
    config_cls: Type = Config,
    model_config_cls: Optional[Type] = None,
    model_cls: Optional[Type] = None,
):
Documentation should be added
Support for Nanotron models