Releases · foundation-model-stack/fms-acceleration
v0.5.0.1
Release
- foak v0.4.1: patch version. Fixes for CI builds.
What's Changed
- Decouple Filter MP Rules function from cuda imports by @fabianlim in #117
Full Changelog: v0.5.0...v0.5.0.1
New Mixture-of-Experts Plugin
Release
- framework v0.5.0: minor version. Updated to manage the new moe plugin.
- moe v0.1.0: new plugin. Released mixture-of-experts plugin with ScatterMoE kernels (illustrative routing sketch after this list).
- peft v0.3.5: patch version. Fixed Autocast warnings (#113).
- foak v0.4.0: minor version. Add support for Liger fused-ce (#93); fixes for Fused Ops (dropout and activation).
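For orientation, ScatterMoE-style kernels accelerate the expert dispatch that a plain PyTorch mixture-of-experts layer would otherwise do with a per-expert Python loop. The sketch below is a naive top-2 routing reference written purely for illustration; it is not the plugin's implementation, and `NaiveTop2MoE` is a made-up name.

```python
# Naive top-2 MoE routing in plain PyTorch -- the reference behaviour that
# fused ScatterMoE-style kernels reproduce without the per-expert Python loop.
# Purely illustrative; not the fms-acceleration implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NaiveTop2MoE(nn.Module):
    def __init__(self, hidden: int, ffn: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(hidden, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, ffn), nn.SiLU(), nn.Linear(ffn, hidden))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])                  # (T, hidden)
        probs = F.softmax(self.router(tokens), dim=-1)       # (T, E)
        weights, experts = probs.topk(2, dim=-1)             # each token goes to 2 experts
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):            # slow gather/scatter loop
            for slot in range(2):
                mask = experts[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])
        return out.reshape_as(x)
```

The fused kernels keep the same routing semantics but perform the scatter/gather of tokens to experts inside the kernel rather than in the Python loop above.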
What's Changed
- Fix Dropout in Fused LoRA Operations by @fabianlim in #102
- Add ExpertParallel Mixture-of-Experts Plugin by @fabianlim in #99
- Disable MLP Fused Ops if Not SwiGLU, Deprecate Fast Quantized Peft Plugin, Update Benchmarks by @fabianlim in #106
- fix: requirements file path in error by @willmj in #111
- fix: Deprecation Warnings in AutoCast API by @Abhishek-TAMU in #113
- feat: add liger kernel with fused cross entropy loss by @anhuong in #93
- feat: Checkpoint utils safetensors by @willmj in #116
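On the Autocast deprecation fix (#113): recent PyTorch releases deprecate the CUDA-specific `torch.cuda.amp.autocast` spelling in favour of the device-agnostic `torch.amp.autocast`. A minimal before/after illustration (not code taken from the PR):

```python
import torch

x = torch.randn(4, 4, device="cuda")
w = torch.randn(4, 4, device="cuda")

# Old spelling -- emits deprecation warnings on recent PyTorch releases:
with torch.cuda.amp.autocast(dtype=torch.bfloat16):
    y_old = x @ w

# Device-agnostic spelling that the fix migrates to:
with torch.amp.autocast(device_type="cuda", dtype=torch.bfloat16):
    y_new = x @ w
```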
New Contributors
- @Abhishek-TAMU made their first contribution in #113
- @anhuong made their first contribution in #93
Full Changelog: v0.4.0.4...v0.5.0
v0.4.0.4
Release
- peft v0.3.4: patch version. Address #90 for AutoGPTQ when certain parameters require resizing.
- foak v0.3.3: patch version. Address bug introduced in #90 where the grad accum hooks were overwritten.
What's Changed
- Fix Issue with Resizing Parameters on the Meta Device in Low CPU Mem Mode by @fabianlim in #96
- model: Add granite GPTQ model by @willmj in #95
New Contributors
Full Changelog: v0.4.0.3...v0.4.0.4
Quickfix: Properly Apply Retie Weights Fix for AutoGPTQ
Release
- peft v0.3.3: patch version. Properly fix #90 for AutoGPTQ.
What's Changed
- Apply Retie Weights Fix Regardless of Transformers and TRL version for AutoGPTQ by @fabianlim in #94
Full Changelog: v0.4.0.2...v0.4.0.3
v0.4.0.2
Release
- peft v0.3.2: patch version. Updated accelerate.yaml for v1. Address all low CPU mem issues for quant models (see the loading note after this list).
- foak v0.3.2: patch version. Updated datatype support matrix. Fix error introduced in #86. Address all low CPU mem issues for quant models.
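For context on the low CPU memory issues mentioned above: "low CPU mem" loading builds the model skeleton on the meta device and materialises weights shard by shard instead of allocating a full copy in host RAM up front, and quantized (GPTQ) checkpoints need extra handling on that path, which is what #90 and #92 patch. The snippet below only shows the standard transformers flag behind this mode, not the plugin's own handling; the checkpoint id is a placeholder.

```python
# Illustration of the standard "low CPU memory" loading flag in transformers.
# With low_cpu_mem_usage=True the model is first instantiated on the meta
# device and weights are filled in as checkpoint shards are read, rather than
# allocating a full copy in host RAM up front.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/model-name",            # placeholder checkpoint id
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
```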
What's Changed
- Quickfix: Accelerate YAML and LoRA Fused Ops by @fabianlim in #92
- Fix Low CPU Memory Mode Issues for Quantized Peft by @fabianlim in #90
Full Changelog: v0.4.0.1...v0.4.0.2
Benchmarks for PaddingFree and Granite. Fix for LowCPUMemMode for Quant.
Release
- aadp no version bump: updates on PaddingFree bench only.
- peft v0.3.1: patch version. Fixes for low_cpu_mem_mode for issues introduced since transformers 4.45. Also provide a fallback if target_modules=None (sketch after this list).
- foak v0.3.1: patch version. Support for bias, needed for Granite models.
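Regarding the `target_modules=None` fallback: a common convention is to infer LoRA targets from the model's linear layers when none are named. The helper below is a hypothetical illustration of that idea (the function name is made up); it is not the plugin's code.

```python
# Hypothetical illustration of a target_modules=None fallback: scan the model
# for nn.Linear layers and use their deduplicated suffix names, excluding the
# LM head. Mirrors the common convention; not the plugin's code.
import torch.nn as nn

def find_linear_module_names(model: nn.Module, exclude=("lm_head",)) -> list[str]:
    names = set()
    for full_name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            suffix = full_name.rsplit(".", 1)[-1]
            if suffix not in exclude:
                names.add(suffix)
    return sorted(names)

# target_modules = peft_config.target_modules or find_linear_module_names(model)
```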
What's Changed
- Update Benches: Orca by @fabianlim in #85
- Update Benchmarks and Documentation for GraniteCausalLM by @fabianlim in #86
- Fixes to Accelerated Peft by @fabianlim in #89
Full Changelog: v0.4.0...v0.4.0.1
v0.4.0
Release
- framework v0.4, minor version: ModelPatcher now allows multiple reload targets that point to the same file.
- aadp v0.1.1, patch: Fix on flash_attn_forward patching for transformers < 4.44.
- peft v0.3.0: minor version. Very minor fixes.
- foak v0.3.0: minor version. Provide FastKernelsAccelerationPlugin, which supersedes FastQuantizedPeftAccelerationPlugin. This new plugin also works for full-FT as well as regular peft.
What's Changed
- Allow Kernels for Full FT and Non-Quantized PEFT by @fabianlim in #79
Full Changelog: v0.3.0.1...v0.4.0
Patch Fix: Wrong Assertion in Accelerated Peft
Release
- peft v0.2.1: fix wrong assertion on target modules in peft_config.
What's Changed
Full Changelog: v0.3.0...v0.3.0.1
Acceleration Patcher, new AttentionAndDistributedPacking Plugin (previously ilab), Benchmarking Fixes
Release
- framework v0.3, minor version: Acceleration Patcher now provided in framework.
- aadp v0.1, new_plugin: replacement of the ilab plugin.
- peft v0.2.0: minor version bump. Support all-linear in target_modules (example after this list).
- foak v0.2.1: patch release: formatting fixes.
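The `all-linear` value for `target_modules` is the Hugging Face peft shorthand (peft >= 0.8) that targets every linear layer except the output head, so configs no longer need to enumerate module names per architecture. A minimal peft-side example (the hyperparameter values are arbitrary):

```python
# Minimal peft config using the "all-linear" shorthand that the accelerated-peft
# plugin now accepts: LoRA is attached to every linear layer except the LM head.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```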
What's Changed
- Rectify Missing Dataloader Preparation Call in PaddingFree Plugin Method by @achew010 in #63
- Rename Plugin to AttentionAndDistributedPacking by @achew010 in #64
- Add Benchmarking Compatibility to PaddingFree Plugin by @achew010 in #66
- Benchmarking: Add Response Field to Use Chat Templates Without Response Template by @fabianlim in #68
- Add Acceleration Patcher and MultiPack Plugin by @fabianlim in #67
- Fix formatter by @achew010 in #74
- Allow PaddingFree to work with DataCollatorForCompletionOnlyLM by @fabianlim in #78
- fixed bug in peft installation for gptqmodel by @achew010 in #81
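For #68 and #78: the collator in question is TRL's `DataCollatorForCompletionOnlyLM`, which masks everything before a response template out of the loss. A small usage sketch, assuming an example template string and tokenizer (neither is taken from the benchmark scenarios):

```python
# Sketch of the TRL collator referenced in #78; the response template string
# and tokenizer are examples, not the benchmark's actual configuration.
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
collator = DataCollatorForCompletionOnlyLM(
    response_template="### Response:",  # tokens before this are excluded from the loss
    tokenizer=tokenizer,
)
```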
Full Changelog: v0.2.0...v0.3.0
Model Patcher moved to Framework, Instruct Lab Plugin
Release
- framework v0.2, minor version: ModelPatcher now moved into framework. Also, bench now supports pretokenized datasets.
- ilab v0.1, new_plugin: new plugin with padding-free support (native after transformers 4.44); a toy packing sketch appears after the update note below.
- peft v0.1.1: patch bump. Minor changes.
- foak v0.2: minor patch bump with ModelPatcher moved out from it.
Update: we decided to remove the ilab plugin and replace it with an attention-and-distributed-packing plugin in the upcoming releases.
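As referenced in the ilab entry above, "padding free" means examples are concatenated into a single row with `position_ids` that restart at each boundary, so the Flash Attention path can keep the sequences separate without wasting compute on padding tokens. A toy packing illustration (hypothetical helper, not the plugin's collator):

```python
# Toy illustration of padding-free packing: concatenate examples into a single
# row and restart position_ids at each boundary. Hypothetical helper, not the
# plugin's data collator.
import torch

def pack_padding_free(examples: list[list[int]]):
    input_ids = torch.tensor([tok for ex in examples for tok in ex])
    position_ids = torch.cat([torch.arange(len(ex)) for ex in examples])
    return {"input_ids": input_ids.unsqueeze(0), "position_ids": position_ids.unsqueeze(0)}

batch = pack_padding_free([[1, 2, 3, 4], [5, 6], [7, 8, 9]])
# position_ids -> [[0, 1, 2, 3, 0, 1, 0, 1, 2]]: the restarts mark sequence
# boundaries, which the Flash Attention path uses to avoid cross-sequence attention.
```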
What's Changed
- Refactored Model Patcher Class by @achew010 in #55
- Address Package Bound and Triton Issues for Torch 2.4 by @fabianlim in #58
- Introduce Padding-Free Plugin to FMS-Acceleration by @achew010 in #57
- Allow Bench To Configure Data Processing Pipeline Per Scenario by @fabianlim in #60
- Fix Mistakes with FA Padding Free by @fabianlim in #62
- Additional README Changes for PR #57 by @achew010 in #61
Full Changelog: v0.1.2.0...v0.2.0