
[FT] Adding caching for each dataset run #417

Open
JoelNiklaus opened this issue Dec 2, 2024 · 2 comments
Labels
feature request New feature/request

Comments

@JoelNiklaus
Contributor

Issue encountered

When running large evals with many dataset configurations, it is very painful to rerun everything if something fails.

Solution/Feature

It would be great if intermediate results could be cached, for example the computed metrics of each dataset.

@JoelNiklaus JoelNiklaus added the feature request New feature/request label Dec 2, 2024
@punitvara

punitvara commented Dec 11, 2024

I am looking to contribute to the HF repo. I am trying to understand the code base and see if I can add this feature.

@clefourrier
Member

@punitvara We would want results to be saved after each batch run, and to be reused if an evaluation is launched with the exact same parameter configuration.
TBH, it's not a trivial PR to work on - if you're unfamiliar with lighteval, I would suggest working on #324, #325, or maybe #355 to get to know the code base first
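A minimal sketch of the idea described above, assuming a file-based cache keyed by a hash of the exact parameter configuration (the function names and cache layout here are hypothetical, not lighteval's actual API):

```python
import hashlib
import json
from pathlib import Path

# Hypothetical sketch: persist computed metrics after each run, keyed by a
# stable hash of the parameter configuration, so that relaunching an
# evaluation with the exact same configuration reuses the saved results.

CACHE_DIR = Path("eval_cache")

def config_key(config: dict) -> str:
    # Canonical JSON (sorted keys) gives the same hash for the same parameters.
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def run_with_cache(config: dict, compute_metrics) -> dict:
    CACHE_DIR.mkdir(exist_ok=True)
    cache_file = CACHE_DIR / f"{config_key(config)}.json"
    if cache_file.exists():
        # Same configuration was evaluated before: reuse the stored metrics.
        return json.loads(cache_file.read_text())
    metrics = compute_metrics(config)        # the expensive evaluation step
    cache_file.write_text(json.dumps(metrics))  # save results after this run
    return metrics
```

On a failure partway through a multi-dataset run, only the configurations without a cache entry would need to be recomputed on the next launch.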
