Tessa/callibration script #937
base: main
Conversation
lgtm! I kinda hate checking in notebooks but I do think it's better than a script in this case.
Given that this notebook/script is mostly for y'all, has lots of hardcoded stuff, etc., let's note in the README that the calibration scripts are experimental and subject to change at any time.
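Something along the lines of "Note: the calibration scripts in this folder are experimental and subject to change at any time." would be enough.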
@@ -0,0 +1,10 @@
# Callibration
Suggested change:
- # Callibration
+ # Calibration
throughout
A good benchmark is one that clearly shows which models are better and which are worse. We test our benchmark tasks by using a series of progressively more advanced models to see whether the benchmarks effectively differentiate between them, and at which number of shots they perform best.

To run the code:
* Select an independant variable and a series of models that correspond to the settings of that variable
Suggested change:
- * Select an independant variable and a series of models that correspond to the settings of that variable
+ * Select an independent variable and a series of models that correspond to the settings of that variable
throughout
I suggest clearing the cell output before committing the notebook.
Easiest thing might be to just add the pre-commit hook from composer for this:
```yaml
- repo: https://github.com/kynan/nbstripout
  rev: 0.5.0
  hooks:
  - id: nbstripout
    types:
    - "jupyter"
    args:
    # Strip all the metadata that vscode or colab may add to a notebook
    - --strip-empty-cells
    - --extra-keys
    - >
      metadata.colab metadata.interpreter metadata.accelerator
      metadata.kernelspec metadata.language_info.version
      cell.metadata.heading_collapsed metadata.name metadata.nbconvert_exporter
      metadata.version metadata.vscode
```
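If we go this route, a one-time `pre-commit install` at the repo root wires the hook into local commits, and `pre-commit run nbstripout --all-files` can strip output from notebooks that are already checked in (assuming the hook id shown above).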
integrations:
- integration_type: git_repo
  git_repo: mosaicml/llm-foundry
  git_branch: main
should probably be pinned to a release.
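For example, one way to pin it (the tag below is a placeholder; use whichever llm-foundry release is current):

```yaml
integrations:
- integration_type: git_repo
  git_repo: mosaicml/llm-foundry
  git_branch: v0.x.y  # placeholder: pin to a tagged release instead of main
```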
Let's move this to the eval/yamls folder.
Would you mind adding the MCLI name of a test run you launched so I can go back and look at it? Additionally, a screenshot of the resulting notebook would be good, so that when I come back to this later I can confirm that I got the correct results.
Great work! Can you just address Daniel's comments as well as update the description as I requested?
Thx Tessa!!!
Here is code we use to test our benchmark tasks by using a series of progressively more advanced models to see whether the benchmarks effectively differentiate between them, and at which number of shots they perform best.
* Edit base_callibration.yaml to reflect the models you want to see
* Run the analyze_output notebook, which collates the results from wandb
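For illustration, the model entries in base_callibration.yaml might look roughly like the following. This is a hypothetical sketch only, with placeholder names and fields; the actual schema is whatever the YAML in this PR defines:

```yaml
# Hypothetical sketch, not the actual base_callibration.yaml schema.
# One entry per setting of the independent variable (here, model size),
# ordered weakest to strongest so a good benchmark should separate them.
models:
- name: tiny-model      # placeholder name
  size: 125M
- name: medium-model    # placeholder name
  size: 1.3B
- name: large-model     # placeholder name
  size: 7B
```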