Merge branch 'main' into fix-mpt
irenedea authored Oct 7, 2023
2 parents 56b82cc + 7fb084a commit 34b75e5
Showing 5 changed files with 310 additions and 240 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/docker.yaml
@@ -15,6 +15,8 @@ jobs:
 base_image: mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04
 - name: '2.0.1_cu118'
 base_image: mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04
+- name: '2.1.0_cu121'
+base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04

 steps:
 - name: Maximize Build Space on Worker
4 changes: 4 additions & 0 deletions .github/workflows/pr-cpu.yaml
@@ -27,6 +27,10 @@ jobs:
 container: mosaicml/pytorch:2.0.1_cpu-python3.10-ubuntu20.04
 markers: 'not gpu'
 pytest_command: 'coverage run -m pytest'
+- name: 'cpu-2.1.0'
+container: mosaicml/pytorch:2.1.0_cpu-python3.10-ubuntu20.04
+markers: 'not gpu'
+pytest_command: 'coverage run -m pytest'
 name: ${{ matrix.name }}
 if: github.repository_owner == 'mosaicml'
 with:
6 changes: 5 additions & 1 deletion .github/workflows/pr-gpu.yaml
@@ -24,7 +24,11 @@ jobs:
 markers: 'gpu'
 pytest_command: 'coverage run -m pytest'
 - name: 'gpu-2.0.1'
-container: mosaicml/pytorch:2.0.1_cu117-python3.10-ubuntu20.04
+container: mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04
 markers: 'gpu'
 pytest_command: 'coverage run -m pytest'
+- name: 'gpu-2.1.0'
+container: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04
+markers: 'gpu'
+pytest_command: 'coverage run -m pytest'
 name: ${{ matrix.name }}
9 changes: 8 additions & 1 deletion llmfoundry/optim/lion8b.py
@@ -4,6 +4,7 @@
 from typing import Any, Callable, Dict, Iterable, Optional, Tuple

 import torch
+from packaging import version


 class DecoupledLionW_8bit(torch.optim.Optimizer):
@@ -53,7 +54,7 @@ class DecoupledLionW_8bit(torch.optim.Optimizer):
 by retaining information across optimizer steps.
 Raises:
-NotImplemenetedError - If any of `quantize`, `compress_state_dict`,
+NotImplementedError - If any of `quantize`, `compress_state_dict`,
 or `error_correction` are `True` and either a) there is no CUDA
 device, or b) step() is executed on a non-CUDA parameter.
 """
@@ -67,6 +68,12 @@ def __init__(self,
 compress_state_dict: bool = False,
 error_correction: bool = False,
 _fused: bool = True): # XXX this flag is mostly for testing...
+if version.parse(torch.__version__) >= version.parse(
+'2.1.0') and error_correction:
+raise RuntimeError(
+'DecoupledLionW_8bit with error correction requires PyTorch <2.1.0'
+)
+
 if lr < 0.0:
 raise ValueError('Invalid learning rate: {}'.format(lr))
 if not 0.0 <= betas[0] <= 1.0:
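
Note on the version gate added to `__init__` above: the check compares parsed versions rather than raw strings. The sketch below isolates that comparison so it can be exercised without PyTorch installed. The helper names `error_correction_supported` and `check_error_correction` are hypothetical and are not part of `lion8b.py`; the version strings passed in stand in for `torch.__version__`.

```python
from packaging import version

# Hypothetical constant for illustration; the commit inlines the '2.1.0' literal.
_MAX_EXCLUSIVE = version.parse('2.1.0')


def error_correction_supported(torch_version: str) -> bool:
    """Return True when the given version string parses as older than 2.1.0."""
    return version.parse(torch_version) < _MAX_EXCLUSIVE


def check_error_correction(torch_version: str, error_correction: bool) -> None:
    """Mirror the gate in the diff: raise only if the flag is set on 2.1.0 or newer."""
    if error_correction and not error_correction_supported(torch_version):
        raise RuntimeError(
            'DecoupledLionW_8bit with error correction requires PyTorch <2.1.0')


# Local build tags such as '+cu118' do not change which side of the gate a release is on.
check_error_correction('2.0.1+cu118', error_correction=True)   # passes silently
check_error_correction('2.1.0+cu121', error_correction=False)  # flag off, passes

try:
    check_error_correction('2.1.0+cu121', error_correction=True)
except RuntimeError as err:
    print(err)
```

Two properties of `packaging.version` are worth keeping in mind here. Comparison is numeric per component, so a version like `2.10.0` sorts after `2.2.0`, which a plain string comparison would get wrong. Pre-releases sort before the final release, so a string such as `2.1.0.dev20230901` parses as older than `2.1.0` and would not trip this gate.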