Long context evals using hugging face hosted datasets #709
Conversation
* relax atol and add retries to reduce flakiness in lion8b timing test
Hi @maxisawesome!
It might be worth passing the Hugging Face variables into the `get_icl_task_dataloader` function. Maybe add `hf_loading_vars=icl_cfg.get('hf_loading_vars', {})` and `hf_parsing_map=icl_cfg.get('hf_parsing_map', {})` at line 304 originally and at line 358 in your new commit. These allow you to pass parameters into Hugging Face's `load_dataset` function. In particular, this was helpful for specifying which split of the Hugging Face dataset I'd like to evaluate, e.g. `hf_loading_vars = {'split': 'train'}`.
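For concreteness, here is a minimal sketch of what forwarding those variables could look like, assuming the keyword names above and an OmegaConf-style `icl_cfg`; the import path, the non-HF arguments, and the example values are illustrative assumptions, not the exact code in this branch:

```python
# Hypothetical sketch: forward the HF-specific options from the ICL task config
# into the dataloader builder so they reach datasets.load_dataset.
# The non-HF arguments and example values below are assumptions for illustration.
from composer.datasets.in_context_learning_evaluation import get_icl_task_dataloader
from omegaconf import OmegaConf
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')

# Stand-in for one entry of the icl_tasks config.
icl_cfg = OmegaConf.create({
    'icl_task_type': 'question_answering',
    'dataset_uri': 'hf://maxisawesome/long_context_eval',
    'batch_size': 8,
    'max_seq_len': 2048,
    'hf_loading_vars': {'split': 'test'},
    'hf_parsing_map': {},
})

dataloader = get_icl_task_dataloader(
    icl_cfg.icl_task_type,
    icl_cfg.dataset_uri,
    tokenizer,
    batch_size=icl_cfg.batch_size,
    max_seq_len=icl_cfg.max_seq_len,
    pad_tok_id=tokenizer.eos_token_id,
    # The new keyword arguments suggested above, forwarded to load_dataset
    # and to the per-dataset parsing logic:
    hf_loading_vars=icl_cfg.get('hf_loading_vars', {}),   # e.g. {'split': 'train'}
    hf_parsing_map=icl_cfg.get('hf_parsing_map', {}),
)
```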
Outdated. Main content of this branch was merged here: #925
Not ready to merge!
Adds long context eval tasks: naive support for LEval QA tasks as well as generated long context tasks (padded up to 2k, 4k, and 8k token context length). Both have been uploaded to HF datasets to avoid checking large files into our repo.
LEval:
Supported tasks are in `scripts/eval/yamls/leval_tasks.yaml`. Adds very basic HF dataset parsing; I wrote a specific function for LEval tasks. Eventually, this per-dataset parsing, which will be required for most arbitrary HF tasks, should likely live in Composer. Otherwise, the yaml logic is as follows (an example entry is sketched after the list):
* llm-foundry will remove the `hf://` prefix from `dataset_uri` and load that dataset (here, it will load maxisawesome/long_context_eval).
* Everything under `hf_vars` will be passed into the `load_dataset` func as keyword args.
* llm-foundry will concatenate together the HF dataset columns listed under `hf_cols.inputs` as the context for the model.
* llm-foundry will concatenate together the HF dataset columns listed under `hf_cols.outputs` as the expected answer.
* If `pivot_col` is specified under `hf_cols`, llm-foundry will treat each row in the dataset as `pivot_col` being the main context, `inputs` being the instruction, and `outputs` being the desired answer.

(For clarity, many LEval tasks are set up so that one row consists of one col with 15 questions, one col with a single document, and one col with 15 answers. The current form of this setup is not the final version, just a temporary working solution.)
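To make the fields above concrete, a hypothetical task entry might look like the following. The field names come from the description above, but the exact schema and the values (label, split, column names) are illustrative and may differ from what is in `scripts/eval/yamls/leval_tasks.yaml`:

```yaml
# Hypothetical icl_tasks entry; schema and values are illustrative only.
icl_tasks:
  - label: leval_qa_example                             # made-up task label
    icl_task_type: question_answering
    dataset_uri: hf://maxisawesome/long_context_eval    # hf:// is stripped, the rest is loaded from HF
    hf_vars:                                            # forwarded verbatim to datasets.load_dataset as kwargs
      split: test
    hf_cols:
      pivot_col: document                               # per-row main context (e.g. the single long document)
      inputs:                                           # concatenated to form the instruction for the model
        - question
      outputs:                                          # concatenated to form the expected answer
        - answer
```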
Previous notes for generated tasks:
Caveats:
* Generation scripts for these datasets are not included.