Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for nanotron #11

Merged
merged 16 commits into from
Feb 7, 2024
Merged

Support for nanotron #11

merged 16 commits into from
Feb 7, 2024

Conversation

thomwolf
Copy link
Member

@thomwolf thomwolf commented Feb 5, 2024

Support for Nanotron models

@NathanHB NathanHB changed the title Fixing quality Fixing quality and greedy untill for brrr Feb 5, 2024
@thomwolf thomwolf changed the title Fixing quality and greedy untill for brrr Support for nanotron Feb 6, 2024
pyproject.toml Show resolved Hide resolved
src/lighteval/utils.py Show resolved Hide resolved
src/lighteval/utils.py Outdated Show resolved Hide resolved
src/lighteval/utils.py Show resolved Hide resolved
src/lighteval/models/nanotron_model.py Outdated Show resolved Hide resolved
dataset = Dataset.from_list(
[{k: str(v) for k, v in asdict(detail).items()} for detail in task_details]
)
# try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the high level try catch, please add other try catches to prevent the other possible failures

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure we want to silently catch mistake or should we not rather let the run fail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we still want the results to be saved locally. That way we can upload them by hand instead of having to redo the whole eval.



class GenerativeTaskDatasetNanotron(DynamicBatchDataset):
def __getitem__(self, index) -> Request:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need your own class? (Is it only to return the index with the item?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nathan's requirement

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base_model does not use the index for each sample, that means that we need to accommodate the dataset to nanotron.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but I'm unsure why we need to grab the index for brr

.pre-commit-config.yaml Show resolved Hide resolved
run_evals_accelerate.py Show resolved Hide resolved
src/lighteval/main_nanotron.py Outdated Show resolved Hide resolved
src/lighteval/utils.py Outdated Show resolved Hide resolved
config_cls: Type = Config,
model_config_cls: Optional[Type] = None,
model_cls: Optional[Type] = None,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation should be added

clefourrier and others added 5 commits February 7, 2024 15:04
* add management with env config

* use markdown table generator

* doc of the s3 cleaning function

* fix name
@NathanHB NathanHB merged commit 8aaf51c into main Feb 7, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants