v0.26.0 - MS-AMP Support, Critical Regression Fixes, and More

@muellerzr muellerzr released this 11 Jan 14:55
· 405 commits to main since this release

Support for MS-AMP

This release adds support for MS-AMP (the Microsoft Automatic Mixed Precision library) to Accelerate as an alternative backend for FP8 training on appropriate hardware. It is the default backend of choice. Read more in the docs. Introduced in #2232 by @muellerzr.

Core

The prior release introduced a new sampler for the DataLoader. Although results showed no statistical differences across seeds, repeating the same seed would result in a different end accuracy, which worried some users. We have now disabled this behavior by default, since it required some additional setup, and brought back the original implementation. To use the new sampling technique (which can provide more accurate repeated results), pass use_seedable_sampler=True to the Accelerator. We will be propagating this up to the Trainer soon.
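To illustrate the idea behind a seedable sampler, here is a minimal standard-library sketch. It is not Accelerate's implementation; the class name SeedableSampler and its set_epoch method are hypothetical names chosen for this example. The shuffle order depends only on the seed and the epoch, so repeated runs see the data in the same order.

```python
import random


class SeedableSampler:
    """Hypothetical sketch: yields a permutation of range(num_samples)
    that is fully determined by (seed, epoch)."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # Changing the epoch changes the shuffle, but reproducibly.
        self.epoch = epoch

    def __iter__(self):
        # A private RNG keeps the order independent of global random state.
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_samples))
        rng.shuffle(indices)
        return iter(indices)

    def __len__(self):
        return self.num_samples


# Two samplers built with the same seed produce the same order.
first = list(SeedableSampler(8, seed=42))
second = list(SeedableSampler(8, seed=42))
print(first == second)
```

In Accelerate itself, the behavior described above is enabled by passing use_seedable_sampler=True to the Accelerator, as noted in this section.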

Big Model Inference

  • NPU support was added thanks to @statelesshz in #2222
  • When generating an automatic device_map, we've made it possible to not return grouped key results if desired in #2233
  • We now handle corner cases better when users pass device_map="cuda", etc., thanks to @younesbelkada in #2254
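As a rough illustration of what "grouped" versus ungrouped device_map keys can look like, the sketch below collapses sibling entries that share a device into their parent key. This is an assumption-laden example, not Accelerate's code: group_device_map is a hypothetical helper, and it assumes the flat map lists every leaf module under each parent.

```python
def group_device_map(flat_map):
    """Hypothetical helper: merge keys under a common prefix when every
    entry below that prefix is assigned to the same device."""
    grouped = dict(flat_map)

    # Collect every proper dotted prefix that appears in the keys.
    prefixes = set()
    for name in flat_map:
        parts = name.split(".")
        for i in range(1, len(parts)):
            prefixes.add(".".join(parts[:i]))

    # Try the shortest prefixes first so the largest groups win.
    for prefix in sorted(prefixes, key=lambda p: p.count(".")):
        children = [k for k in grouped if k.startswith(prefix + ".")]
        devices = {grouped[k] for k in children}
        if children and len(devices) == 1 and prefix not in grouped:
            for k in children:
                del grouped[k]
            grouped[prefix] = devices.pop()
    return grouped


flat = {
    "model.embed": 0,
    "model.layers.0.attn": 0,
    "model.layers.0.mlp": 0,
    "model.layers.1.attn": 1,
    "model.layers.1.mlp": "cpu",
    "lm_head": "cpu",
}
# "model.layers.0.*" collapses to one key; "model.layers.1.*" stays split.
print(group_device_map(flat))
```

An ungrouped result keeps one entry per submodule, which can be easier to inspect or post-process when individual placements matter.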

FSDP and DeepSpeed

  • Many improvements to the docs have been made thanks to @stas00. Along with this, we've made it easier to adjust the sharding strategy and other config values thanks to @pacman100 in #2288

  • A regression introduced in Accelerate 0.23.0 made learning much slower on multi-GPU setups than on a single GPU. #2304 has now fixed this thanks to @pacman100

  • The DeepSpeed integration now also handles auto values better when making a configuration in #2313

Bits and Bytes

  • Params4bit added to bnb classes in set_module_tensor_to_device() by @poedator in #2315

Device Agnostic Testing

For developers, we've made it much easier to run the tests on different devices with no change to the code, thanks to @statelesshz in #2123 and #2235.

Bug Fixes

Major Contributors

  • @statelesshz for their work on device-agnostic testing and NPU support
  • @stas00 for many doc fixes for DeepSpeed and FSDP

General Changelog

New Contributors

Full Changelog: v0.25.0...v0.26.0