Commit

warn about not pre-processing
winglian committed Jan 20, 2024
1 parent a787d57 commit 334f02c
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions src/axolotl/utils/trainer.py
@@ -107,6 +107,10 @@ def drop_long_seq(sample, sequence_len=2048):


def process_datasets_for_packing(cfg, train_dataset, eval_dataset, tokenizer):
    if cfg.is_preprocess:
        LOG.warning(
            "Processing datasets during training can lead to VRAM instability. Please pre-process your dataset"
        )
    drop_long = partial(drop_long_seq, sequence_len=cfg.sequence_len)
    with zero_first(is_main_process()):
        if cfg.group_by_length:
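For readers who want to try the guard in isolation, here is a minimal standalone sketch. It is not the code from this commit: `warn_if_not_preprocessed`, `WARNING_TEXT`, and the `SimpleNamespace` stand-in for `cfg` are hypothetical names used only for illustration. The sketch follows the commit title and warns when `is_preprocess` is missing or falsy, whereas the hunk above tests `cfg.is_preprocess` without a `not`. Which polarity is intended is an assumption here.

```python
import logging
from types import SimpleNamespace

LOG = logging.getLogger("preprocess_sketch")

# Message text copied from the hunk above.
WARNING_TEXT = (
    "Processing datasets during training can lead to VRAM instability. "
    "Please pre-process your dataset"
)


def warn_if_not_preprocessed(cfg, logger=LOG):
    """Hypothetical helper: warn when cfg.is_preprocess is missing or falsy.

    Returns True if a warning was emitted, False otherwise.
    """
    # Assumption: a falsy or absent flag means the dataset was not
    # prepared ahead of time and is being processed during training.
    if not getattr(cfg, "is_preprocess", False):
        logger.warning(WARNING_TEXT)
        return True
    return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Stand-in config objects; a real training config has many more fields.
    warn_if_not_preprocessed(SimpleNamespace(is_preprocess=False))  # warns
    warn_if_not_preprocessed(SimpleNamespace(is_preprocess=True))   # silent
```

Returning a boolean keeps the helper easy to check without capturing log output.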

0 comments on commit 334f02c
