Release Notes: Optimum v1.24.0
We’re excited to announce the release of Optimum v1.24.0. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.
🚀 New Features & Enhancements
- ORTQuantizer now supports models with ONNX subfolders.
- ONNX Runtime IO Binding support for all supported Transformers models (no models left behind).
- SD3 and Flux model support added to ORTDiffusionPipeline, enabling the latest diffusion-based models.
- Transformers v4.47 and v4.48 compatibility, ensuring seamless integration with the latest advancements in Hugging Face's ecosystem.
- ONNX export support extended to various models, including Decision Transformer, ModernBERT, Megatron-BERT, Dinov2, OLMo, and many more (see details).
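As an illustration of the ORTQuantizer change, here is a minimal sketch of quantizing a model whose ONNX file lives in a subfolder. It assumes the repository stores its weights at `onnx/model.onnx`, that dynamic AVX512-VNNI quantization suits the target CPU, and that the feature-extraction model class is the right one for the checkpoint. The function name `quantize_from_subfolder` and its defaults are hypothetical and not part of Optimum.

```python
def quantize_from_subfolder(model_id, subfolder="onnx", file_name="model.onnx", save_dir="quantized"):
    """Load an ONNX model stored in a subfolder and quantize it dynamically.

    `model_id` is a Hub repo id or local directory (supplied by the caller).
    The subfolder and file name defaults are assumptions about the repo layout.
    """
    # Imported lazily so this module loads even where optimum is not installed.
    from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    # Point the loader at the ONNX file inside the subfolder.
    model = ORTModelForFeatureExtraction.from_pretrained(
        model_id, subfolder=subfolder, file_name=file_name
    )

    # Build the quantizer directly from the loaded ORTModel.
    quantizer = ORTQuantizer.from_pretrained(model)

    # Dynamic quantization config; the choice of target is an assumption.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

    # Writes the quantized model under save_dir and returns that path.
    return quantizer.quantize(save_dir=save_dir, quantization_config=qconfig)
```

Calling `quantize_from_subfolder("your-org/your-model")` (a placeholder id) would download the model, so it is best tried against a small checkpoint first.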
🔧 Key Fixes & Optimizations
- Dropped support for Python 3.8
- Bug fixes in ModelPatcher, SDXL refiner export, and device checks for improved reliability.
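Because Python 3.8 is no longer supported, projects that pin this release may want to fail early on older interpreters. The following is a minimal sketch, assuming Python 3.9 is the new minimum (the notes only state that 3.8 was dropped). `MIN_PYTHON` and `python_is_supported` are hypothetical names, not part of Optimum.

```python
import sys

# Assumed minimum version after the Python 3.8 drop.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("This environment needs Python 3.9 or newer.")
```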
👥 New Contributors
A huge thank you to our first-time contributors:
Your contributions make Optimum better! 🎉
For a detailed list of all changes, please check out the full changelog.
🚀 Happy optimizing!
What's Changed
- Onnx granite by @gabe-l-hart in #2043
- Drop python 3.8 by @echarlaix in #2086
- Update Dockerfile base image by @echarlaix in #2089
- add transformers 4.36 tests by @echarlaix in #2085
- [fix] Allow ORTQuantizer over models with subfolder ONNX files by @tomaarsen in #2094
- SD3 and Flux support by @IlyasMoutawwakil in #2073
- Remove datasets as required dependency by @echarlaix in #2087
- Add ONNX Support for Decision Transformer Model by @ra9hur in #2038
- Generate guidance for flux by @IlyasMoutawwakil in #2104
- Unbundle inputs generated by DummyTimestepInputGenerator by @JingyaHuang in #2107
- Pass the revision to SentenceTransformer models by @bndos in #2105
- Rembert onnx support by @mlynatom in #2108
- fix bug ModelPatcher returns empty outputs by @LoSealL in #2109
- Fix workflow to mark issues as stale by @echarlaix in #2110
- Remove doc-build by @echarlaix in #2111
- Downgrade stale bot to v8 and fix permissions by @echarlaix in #2112
- Update documentation color from google tpu section by @echarlaix in #2113
- Fix workflow to mark PRs as stale by @echarlaix in #2116
- Enable transformers v4.47 support by @echarlaix in #2119
- Add ONNX export support for MGP-STR by @xenova in #2099
- Add ONNX export support for OLMo and OLMo2 by @xenova in #2121
- Pass on model_kwargs when exporting a SentenceTransformers model by @sjrl in #2126
- Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by @xenova in #2001
- move check_dummy_inputs_allowed to common export utils by @eaidova in #2114
- Remove CI macos runners by @echarlaix in #2129
- Enable GPTQModel by @jiqing-feng in #2064
- Skip private model loading for external contributors by @echarlaix in #2130
- fix sdxl refiner export by @eaidova in #2133
- Export to ExecuTorch: Initial Integration by @guangy10 in #2090
- Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by @LRL-ModelCloud in #2146
- Update docker files by @echarlaix in #2102
- Limit diffusers version by @IlyasMoutawwakil in #2150
- Add ONNX export support for ModernBERT by @xenova in #2131
- Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by @LRL-ModelCloud in #2138
- fix device check by @jiqing-feng in #2136
- Replace check_if_xxx_greater with is_xxx_version by @echarlaix in #2152
- Add tf available and version by @echarlaix in #2154
- Add ONNX export support for PatchTST by @xenova in #2101
- fix infer task from model_name if model from sentence transformer by @eaidova in #2151
- Unpin diffusers and pass onnx exporters tests by @IlyasMoutawwakil in #2153
- Uncomment modernbert config by @IlyasMoutawwakil in #2155
- Skip optimum-benchmark when loading namespace modules by @IlyasMoutawwakil in #2159
- Fix PR doc upload by @regisss in #2161
- Move executorch to optimum-executorch by @echarlaix in #2165
- Adding Onnx Support For Megatron-Bert by @pragyandev in #2169
- Transformers 4.48 by @IlyasMoutawwakil in #2158
- Update ort CIs (slow, gpu, train) by @IlyasMoutawwakil in #2024
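
Several of the entries above add ONNX export support for new architectures. As a rough sketch of how such an export can be triggered from Python, the helper below wraps `main_export` from `optimum.exporters.onnx`. The wrapper name `export_to_onnx` is hypothetical, and letting the task be inferred with `"auto"` is an assumption that may not hold for every checkpoint.

```python
def export_to_onnx(model_id, output_dir, task="auto"):
    """Export a Hub or local checkpoint to ONNX and return the output directory.

    `model_id` and `output_dir` are supplied by the caller; with task="auto"
    the exporter tries to infer the task from the checkpoint.
    """
    # Imported lazily so this module loads even where optimum is not installed.
    from optimum.exporters.onnx import main_export

    main_export(model_name_or_path=model_id, output=output_dir, task=task)
    return output_dir
```

For example, `export_to_onnx("your-org/your-model", "onnx_out")` (a placeholder id) would write the exported files to `onnx_out`, provided the architecture is one the exporter supports.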