
Basic evaluate CLI command / codepath #2188

Merged: 4 commits into main on Dec 16, 2024
Conversation

djsaunde (Contributor) commented Dec 13, 2024

Description

Title.

Motivation and Context

I'd like to use this codepath to validate my differential transformers work. That is, when converting a base model into one where the attention layers are swapped out for (zero-initialized) differential attention layers, we expect the loss to be identical to the pre-conversion loss.
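That conversion invariant can be expressed as a tiny comparison helper. This is an illustrative sketch, not axolotl code; the function name and tolerance are assumptions.

```python
import math

def losses_match(loss_before: float, loss_after: float, rel_tol: float = 1e-5) -> bool:
    """Check that eval loss is unchanged after a (zero-init) attention swap.

    Hypothetical helper for illustration only; not part of axolotl.
    """
    return math.isclose(loss_before, loss_after, rel_tol=rel_tol)

# With zero-initialized differential attention, the converted model should
# reproduce the base model's loss up to floating-point noise.
print(losses_match(1.8818285465240479, 1.8818285465240479))  # prints True
```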

This will be generally useful for evaluating models post-training, comparing models apples-to-apples, evaluating on different datasets, and so on.

How has this been tested?

Only basic testing so far, in Runpod on an A40 instance with the axolotlai/axolotl-cloud:main-latest image.
For example, with this simple config:

base_model: HuggingFaceTB/SmolLM2-135M
datasets:
  - path: mhenrichsen/alpaca_2k_test
    type: alpaca
test_datasets:
  - path: mhenrichsen/alpaca_2k_test
    type: alpaca
    split: train
sdp_attention: true
gradient_accumulation_steps: 1
learning_rate: 1e-4
max_steps: 1
micro_batch_size: 1
sequence_len: 2048
special_tokens:
  pad_token: <|endoftext|>

We get:

# axolotl evaluate ../smollm.yaml 
...
[2024-12-13 21:50:17,302] [INFO] [axolotl.eval.evaluate:88] [PID:72580] [RANK:0] Starting evaluation...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2000/2000 [00:56<00:00, 35.47it/s]
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:101] [PID:72580] [RANK:0] Evaluation completed!
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:102] [PID:72580] [RANK:0] Metrics:
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:104] [PID:72580] [RANK:0] eval_loss: 1.8818285465240479
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:104] [PID:72580] [RANK:0] eval_model_preparation_time: 0.0047
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:104] [PID:72580] [RANK:0] eval_runtime: 56.8085
[2024-12-13 21:51:14,254] [INFO] [axolotl.eval.evaluate:104] [PID:72580] [RANK:0] eval_samples_per_second: 35.206
[2024-12-13 21:51:14,255] [INFO] [axolotl.eval.evaluate:104] [PID:72580] [RANK:0] eval_steps_per_second: 35.206
[2024-12-13 21:51:14,255] [INFO] [axolotl.eval.evaluate:116] [PID:72580] [RANK:0] Evaluation results saved to model-out/eval_results.txt
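The final log line reports the metrics being written to model-out/eval_results.txt. A minimal sketch of that save step might look like the following; the function name and exact file format here are assumptions, not axolotl's actual implementation.

```python
from pathlib import Path

def save_eval_results(metrics: dict, output_dir: str) -> Path:
    """Write evaluation metrics to <output_dir>/eval_results.txt, one per line.

    Illustrative only; axolotl's real save logic may differ.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "eval_results.txt"
    with path.open("w") as f:
        for name, value in sorted(metrics.items()):
            f.write(f"{name}: {value}\n")
    return path

metrics = {"eval_loss": 1.8818285465240479, "eval_runtime": 56.8085}
path = save_eval_results(metrics, "model-out")
print(path.name)  # prints: eval_results.txt
```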

Types of changes

Adding a new top-level module axolotl.evaluate as well as a corresponding CLI module axolotl.cli.evaluate.
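For context, a subcommand of this shape could be wired up as follows. This is a minimal hypothetical sketch using argparse; axolotl's actual CLI machinery and flags may differ, and only the config-path argument is taken from the example above.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Minimal sketch of an `evaluate` subcommand; axolotl's real CLI differs."""
    parser = argparse.ArgumentParser(prog="axolotl")
    sub = parser.add_subparsers(dest="command", required=True)
    evaluate = sub.add_parser("evaluate", help="Evaluate a model from a YAML config")
    evaluate.add_argument("config", help="Path to the axolotl YAML config")
    return parser

# Mirrors the invocation shown above: axolotl evaluate ../smollm.yaml
args = build_parser().parse_args(["evaluate", "../smollm.yaml"])
print(args.command, args.config)  # prints: evaluate ../smollm.yaml
```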

Test coverage is still outstanding. (Done.)

I'd also like to add evaluation on the training dataset (if passed). (Done.)

@djsaunde djsaunde self-assigned this Dec 13, 2024
@djsaunde djsaunde added the enhancement New feature or request label Dec 13, 2024
@djsaunde djsaunde requested review from winglian and NanoCode012 and removed request for winglian and NanoCode012 December 16, 2024 17:03
@djsaunde djsaunde marked this pull request as ready for review December 16, 2024 18:40
winglian (Collaborator) left a comment:
Minor nit on the print statement. Would be good to figure out a way to document that some of the evaluate/train code is duplicated so that we know to make changes in the other down the line. Otherwise lgtm

src/axolotl/cli/__init__.py (review thread outdated, resolved)
@djsaunde djsaunde merged commit f865464 into main Dec 16, 2024
8 of 10 checks passed
djsaunde added a commit that referenced this pull request Dec 16, 2024
* basic evaluate CLI command / codepath

* tests for evaluate CLI command

* fixes and cleanup

* review comments; slightly DRYing up things

---------

Co-authored-by: Dan Saunders <[email protected]>
djsaunde added further commits referencing this pull request on Dec 17, Dec 18, and Dec 20, 2024, each with the same commit message as above.