Pydantic 2.x cfg #1239
Conversation
Force-pushed from 076d50a to d464d1b
Force-pushed from 3b032b8 to 50199ea
This is my first review. I'll take a look later on the rest.
src/axolotl/cli/__init__.py (Outdated)

```python
capabilities = GPUCapabilities(
    bf16=is_torch_bf16_gpu_available(), n_gpu=os.environ.get("WORLD_SIZE", 1)
)
```
Should this be moved into validate or normalize config?
I'm avoiding setting this in the validation because there are downstream use cases where a user might want to make sure their configuration works on a configured GPU cluster before firing off the training.
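A minimal sketch of what such a capabilities model could look like, based on the call site quoted below (field names and defaults are assumed, not the merged implementation):

```python
import os

from pydantic import BaseModel


class GPUCapabilities(BaseModel):
    # Hypothetical shape inferred from the snippet in this thread;
    # the real model may carry more fields.
    bf16: bool = False
    n_gpu: int = 1


# Constructed off-cluster, this lets a user describe target hardware and
# validate a config before launching training. Pydantic coerces the string
# value of WORLD_SIZE to an int for the n_gpu field.
caps = GPUCapabilities(bf16=True, n_gpu=os.environ.get("WORLD_SIZE", 1))
```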
```python
@model_validator(mode="before")
@classmethod
def check_sample_packing_w_xformers(cls, root):
    if root.get("sample_packing") and root.get("xformers_attention"):
```
Can we have a few methods that check for when sample_packing is on but not actually active due to an unsupported model type?
Pseudo code:

```python
def check_sample_packing_active(cls, root):
    # pseudo: "any(llama/mistral/qwen)" meaning any supported model-type flag
    if root.get("sample_packing") and not any(
        root.get(f"is_{m}_derived_model") for m in ("llama", "mistral", "qwen")
    ):
        raise ValueError("sample_packing not compatible with current model type")
    return root
```
I don't know that we can reliably detect the model type at this step of the validation in order to raise a ValueError.
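One way to sidestep model detection, sketched here as an assumption rather than the merged code, is to key the check off the user-set `is_*_derived_model` flags that are already part of the config schema:

```python
from pydantic import BaseModel, model_validator


class SamplePackingCheck(BaseModel):
    # Hypothetical subset of the config; flag names taken from the YAML
    # options quoted elsewhere in this thread.
    sample_packing: bool = False
    is_llama_derived_model: bool = False
    is_mistral_derived_model: bool = False
    is_qwen_derived_model: bool = False

    @model_validator(mode="after")
    def check_sample_packing_supported(self):
        supported = (
            self.is_llama_derived_model
            or self.is_mistral_derived_model
            or self.is_qwen_derived_model
        )
        if self.sample_packing and not supported:
            raise ValueError("sample_packing not compatible with current model type")
        return self
```

This only works when the user has set the flags truthfully, which matches the caveat above about not being able to detect the model reliably at validation time.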
Force-pushed from ea381f0 to 032eced
```diff
@@ -543,7 +543,7 @@ is_mistral_derived_model:
 is_qwen_derived_model:

 # optional overrides to the base model configuration
-model_config:
+model_config_overrides:
```
I just noticed this, but we would need to deprecate the old name (raise a ValueError for it).
We can't actually deprecate it with Pydantic, because `model_config` is an internal variable name for Pydantic models.
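For context, Pydantic 2 reserves `model_config` for the model's own `ConfigDict`, so it cannot be declared as a field. One possible workaround (a sketch under that assumption, untested against this codebase) is to reject the legacy key in a `mode="before"` validator, where the raw input dict is still visible:

```python
from typing import Optional

from pydantic import BaseModel, model_validator


class AxolotlInputConfig(BaseModel):
    # Hypothetical field; the real config model has many more.
    model_config_overrides: Optional[dict] = None

    @model_validator(mode="before")
    @classmethod
    def reject_legacy_model_config(cls, data):
        # "model_config" can't be a field name, but the raw input mapping
        # can still be inspected before field validation runs.
        if isinstance(data, dict) and "model_config" in data:
            raise ValueError(
                "`model_config` is deprecated; use `model_config_overrides`"
            )
        return data
```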
remember to return in validator
missing return
add missing relora attributes
fix test for DictDefault change
fix sys template for mistral from fastchat change in PR 2872
fix test for batch size warning
Force-pushed from d33a8f7 to 7f688b6
* WIP conversion to use pydantic for config validation
* wip, more fields, add capabilities
* wip
* update pydantic validation to match existing tests
* tweak requirements
* setup deprecated params pydantic model
* more validations
* wrap up rest of the validations
* flesh out the rest of the options from the readme into pydantic
* fix model validators as class methods: remember to return in validator; missing return; add missing relora attributes; fix test for DictDefault change; fix sys template for mistral from fastchat change in PR 2872; fix test for batch size warning
* more missing attributes for cfg
* updates from PR feedback
* fix validation for datasets and pretrain datasets
* fix test for lora check
Description

This PR migrates the validation to use Pydantic validators. We keep all of the existing tests with some modifications. I've attempted to capture all the `cfg.*` attributes I could find and make sure they are represented in the Pydantic config models. I've also added a GPUCapabilities model to abstract away the underlying hardware checks, so that a config can be checked offline before sending it off for actual training.

For now, we run the validation and convert the result back to the DictDefault that we currently use for the `cfg`. This is to minimize the blast radius of this change to strictly validation. We can consider down the line how to swap out the various uses of `cfg`, how the attributes are accessed, and whether that is compatible with Pydantic models.

Motivation and Context
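The validate-then-convert round trip described in the summary might look roughly like this (a sketch: the field subset is hypothetical, and a plain dict stands in for axolotl's DictDefault):

```python
from typing import Optional

from pydantic import BaseModel


class AxolotlInputConfig(BaseModel):
    # Tiny illustrative subset of the real config model.
    base_model: str
    sample_packing: bool = False
    micro_batch_size: Optional[int] = None


def validate_config(raw_cfg: dict) -> dict:
    # Validate with Pydantic, then hand back a plain mapping so the rest of
    # the codebase keeps its existing cfg access patterns unchanged.
    validated = AxolotlInputConfig.model_validate(raw_cfg)
    return validated.model_dump()


cfg = validate_config({"base_model": "meta-llama/Llama-2-7b-hf"})
```

Keeping the Pydantic model at the boundary like this is what limits the blast radius: downstream code never sees a model instance, only the familiar dict shape.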
How has this been tested?
Existing unit tests.
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)