Multi eval dataset logging #603
Conversation
Implementation looks good, thanks! Please remove the accidentally committed files, and add a simple unit test.
LGTM with some minor comments
Can we add a unit test that tests two eval loaders with two different datasets?
Force-pushed the multi-eval-dataset-logging branch from ebb8a30 to 87a92bf.
LGTM! Just a minor comment about using `tmp_path` instead.
Force-pushed the multi-eval-dataset-logging branch from 9416cfb to 3687be2.
Previously, only one eval dataloader was supported in llm-foundry (although multiple ICL eval tasks and the gauntlet were supported). Now users can add custom eval datasets by turning `eval_dataloader` into a list of dataloaders in their YAML. It would look like this:

Before:
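A minimal sketch of the single-dataloader form, assuming llm-foundry's `eval_loader` config key; the dataset fields are placeholders, not taken from this PR:

```yaml
# Before: a single eval dataloader (all values are hypothetical)
eval_loader:
  name: finetuning
  dataset:
    hf_name: my-org/my-eval-dataset  # placeholder dataset
    split: validation
  drop_last: false
  num_workers: 8
```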
After:
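A minimal sketch of the list form; the exact label key (`label` here) and all dataset values are assumptions, not confirmed by this PR:

```yaml
# After: a list of labeled eval dataloaders (all values are hypothetical)
eval_loader:
- label: short_answer_eval   # label keeps this loader's metrics separate in the logger
  name: finetuning
  dataset:
    hf_name: my-org/short-answer-eval  # placeholder dataset
    split: validation
  drop_last: false
  num_workers: 8
- label: long_form_eval
  name: finetuning
  dataset:
    hf_name: my-org/long-form-eval  # placeholder dataset
    split: validation
  drop_last: false
  num_workers: 8
```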
Users must specify a label for each eval dataloader so that each dataloader's metrics are logged separately. This functionality was tested on wandb, mlflow, and tensorboard, as shown in the screenshots below.
[Screenshots: separately labeled eval metrics in wandb, mlflow, and tensorboard]