
Merge #8

Merged
merged 25 commits into from
Apr 10, 2024

Conversation

seungduk-yanolja

winglian and others added 25 commits April 1, 2024 04:54
* add lisa support

* fix default and fix attribute traversal for layers

* improve lisa callback logging

* fix LISA by ensuring params are not frozen during __init__

* example config for lisa

---------

Co-authored-by: Aman Karmani <[email protected]>
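The LISA commits above (freeze/unfreeze handling, ensuring params are not frozen during `__init__`) can be sketched roughly as layerwise sampling: freeze everything, then periodically unfreeze a few randomly chosen layers. This is an illustrative stand-in, not axolotl's callback; `Layer`, `lisa_step`, and `n_active` are assumed names.

```python
import random

class Layer:
    """Stand-in for one transformer layer's parameter group."""
    def __init__(self, name):
        self.name = name
        self.requires_grad = True  # not frozen at init (per the fix above)

def lisa_step(layers, n_active=2, rng=random):
    # Freeze every layer, then sample a subset to train this interval
    for layer in layers:
        layer.requires_grad = False
    for layer in rng.sample(layers, n_active):
        layer.requires_grad = True
```

Calling `lisa_step` at a fixed step interval gives the rotating-active-layers behavior the commits describe.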
* feat: add deepspeed 3 with cpu offload

* make bf16 explicit, add param only offload variant

---------

Co-authored-by: Wing Lian <[email protected]>
* can configure name of split of pretraining dataset

* streaming data and dataset map

* text column customized

* allow text_column to be set in pretrain

* pretrain type

* load a bit of the dataset

* fix dataset where splits have separate configs

* ok name param here is the config

* whitespace
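The pretraining-dataset commits above (streaming map, customizable `text_column`) amount to lazily remapping a configurable source column onto the standard "text" key. A minimal sketch, assuming illustrative names rather than axolotl's actual helper:

```python
def stream_text(rows, text_column="text"):
    """Lazily yield each row's configured text field under the "text" key,
    the way a streaming dataset map would."""
    for row in rows:
        yield {"text": row[text_column]}
```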
…lotl-ai-cloud#1461)

* Added pip install ninja to accelerate installation of flash-attn

* doc: cleanup
* feat: update doc contents

* chore: move batch vs ga docs

* feat: update lambdalabs instructions

* fix: refactor dev instructions
…oud#1465)

* feat: validate sample packing requires flash_attention

* fix: check for sdp_attn per suggestion

* feat: add FA to tests
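The validation described above (sample packing requires flash attention, with SDP attention accepted per the review suggestion) could look roughly like this; the function name and config keys are assumptions for illustration:

```python
def check_sample_packing(cfg: dict) -> None:
    # Sketch of the described check: sample packing needs an attention
    # implementation that supports it (flash attention or SDP attention).
    if cfg.get("sample_packing") and not (
        cfg.get("flash_attention") or cfg.get("sdp_attention")
    ):
        raise ValueError(
            "sample_packing requires flash_attention or sdp_attention"
        )
```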
DoRA with quantized layers is supported with PEFT 0.10.0
…) [skip ci]

It should be `qlora` instead of `lora`
* print out dependency versions for easier debugging

* improve readability
…xolotl-ai-cloud#1504)

* Correctly handle splits for datasets.arrow_dataset.Dataset objects

The `load_tokenized_prepared_datasets` function's logic for loading a dataset from a local path always checks whether a split is in the dataset. The problem is that if the dataset was loaded with `load_from_disk` and is an Arrow-based `Dataset`, *there is no* split information. Evaluating `split in ds` then apparently scans all the rows and columns of the Arrow dataset object looking for, e.g., 'train' (assuming `split == 'train'`), which causes the program to hang.

See https://chat.openai.com/share/0d567dbd-d60b-4079-9040-e1de58a4dff3 for context.
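The membership-test difference can be shown with minimal stand-ins (assumptions, not the real `datasets` classes): `in` on a dict-like `DatasetDict` is a cheap key lookup, while `in` on a bare `Dataset` falls through to scanning the data.

```python
class DatasetDictStub(dict):
    """Dict of split name -> rows; `split in ds` is a cheap key lookup."""

class DatasetStub:
    """Arrow-style dataset: rows only, no split names."""
    def __init__(self, rows):
        self.rows = rows
    def __contains__(self, item):
        # `in` scans the data itself, which is why `"train" in ds`
        # can hang on a large on-disk dataset
        return any(item in row.values() for row in self.rows)

def select_split(ds, split="train"):
    # The described fix: only probe for a split on the dict-like object
    if isinstance(ds, DatasetDictStub):
        return ds[split] if split in ds else ds
    return ds  # a bare Dataset has no splits to select
```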

* chore: lint

---------

Co-authored-by: Wing Lian <[email protected]>
* WIP: Support table logging for mlflow, too

Create a `LogPredictionCallback` for both "wandb" and "mlflow" if
specified.

In `log_prediction_callback_factory`, build a generic table and specialize it only when the newly added `logger` argument is set to "wandb" or "mlflow".

See axolotl-ai-cloud#1505
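A rough sketch of that factory shape; the function names echo the commit text, but the bodies are illustrative stubs standing in for the real `wandb.Table` / mlflow table-logging calls, not axolotl's implementation.

```python
def build_table(prompts, predictions):
    # Generic, logger-agnostic table shared by both backends
    return {"columns": ["prompt", "prediction"],
            "rows": list(zip(prompts, predictions))}

def log_prediction_callback_factory(logger: str):
    def callback(prompts, predictions):
        table = build_table(prompts, predictions)
        if logger == "wandb":
            return ("wandb", table)   # would wrap in a wandb table here
        elif logger == "mlflow":
            return ("mlflow", table)  # would hand off to mlflow here
        raise ValueError(f"unsupported logger: {logger}")
    return callback
```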

* chore: lint

* add additional clause for mlflow as it's optional

* Fix circular imports

---------

Co-authored-by: Dave Farago <[email protected]>
Co-authored-by: Wing Lian <[email protected]>
…ve (axolotl-ai-cloud#1483)

* deprecated wandb.save

* also use wandb.save for axolotl yaml

* chore: lint

---------

Co-authored-by: Wing Lian <[email protected]>
@seungduk-yanolja seungduk-yanolja merged commit 7ea80ff into main Apr 10, 2024
4 checks passed