Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for auto packing ratio #683

Merged
merged 26 commits into from
Nov 5, 2023
Merged

Conversation

irenedea
Copy link
Contributor

@irenedea irenedea commented Oct 19, 2023

Adds 'auto' packing ratio for finetuning.

If 'auto' is specified, packing will be profiled for the dataloader configuration across a maximum of 20 packing ratios until the highest packing ratio with 0 waste is found.

If there are multiple ranks, we take the minimum 'auto' packing ratio across all ranks.

Testing

  • Manual tests
    • 1 epoch on mosaicml/dolly_hhrlhf, auto packing had 1.3% waste overall
    • No packing run are the same before and after PR
      - before: finetune-auto-pack-test-baseline2-zptNpN , after: finetune-ratio-no-pack-branch-test-8N5I1y
    • Packing ratio of 3.0 is same before and after PR
      - before: finetune-ratio-3-pack-test-kW8gp6 , after: finetune-ratio-3-pack-with-auto-test-YVP00E
  • Unit tests
    • Tests packing on small tensors
    • Tests packing on small tensors with leftovers and waste calculation
    • Tests auto packing ratio selection for single and multi ranks
    • Tests auto packing with dataloader
    • Adds 'auto' packing_ratio parameterization to existing dataloader tests

@irenedea irenedea force-pushed the packing-collator branch 2 times, most recently from 206000d to 36dc1a0 Compare October 20, 2023 22:42
@irenedea irenedea requested review from dakinggg and alextrott16 and removed request for dakinggg October 21, 2023 19:14
@irenedea irenedea marked this pull request as ready for review October 21, 2023 19:15
Copy link
Contributor

@alextrott16 alextrott16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for adding this!!
Will hold off on approval so this can get another set of eyes!

Also left a couple comments (nothing major) along with a suggestion for an alternative way to search for the optimal packing ratio. I think re-using my brute force approach might not be the best way to go. If you agree, that will require more of an overhaul of this code, unfortunately. But potentially better to do that than simply inherit my hasty decision :)

llmfoundry/data/packing.py Outdated Show resolved Hide resolved
llmfoundry/data/packing.py Show resolved Hide resolved
scripts/train/finetune_example/mpt-7b-arc-easy--gpu.yaml Outdated Show resolved Hide resolved
tests/test_dataloader.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@dakinggg dakinggg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

Two more manual tests I would like to see before merging:
(1) a short training run without packing, before and after this PR. These should be identical.
(2) a short training run with a set packing ratio, before and after this PR. These should be identical.

llmfoundry/data/denoising.py Show resolved Hide resolved
mcli/mcli-llama2-finetune.yaml Outdated Show resolved Hide resolved
scripts/train/finetune_example/mpt-7b-arc-easy--gpu.yaml Outdated Show resolved Hide resolved
tests/test_dataloader.py Outdated Show resolved Hide resolved
llmfoundry/data/finetuning/dataloader.py Show resolved Hide resolved
llmfoundry/data/packing.py Show resolved Hide resolved
llmfoundry/data/packing.py Outdated Show resolved Hide resolved
llmfoundry/data/packing.py Outdated Show resolved Hide resolved
llmfoundry/data/packing.py Show resolved Hide resolved
llmfoundry/data/packing.py Show resolved Hide resolved
Copy link
Collaborator

@dakinggg dakinggg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for one determinism related comment

llmfoundry/data/packing.py Outdated Show resolved Hide resolved
llmfoundry/data/packing.py Show resolved Hide resolved
Co-authored-by: Daniel King <[email protected]>
Copy link
Collaborator

@dakinggg dakinggg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with one more question relating to the randomness

llmfoundry/data/packing.py Show resolved Hide resolved
@irenedea irenedea merged commit ca8e6b5 into mosaicml:main Nov 5, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants