Release Notes: Optimum v1.24.0

We’re excited to announce the release of Optimum v1.24.0. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.

🚀 New Features & Enhancements

ORTQuantizer now supports models with ONNX subfolders.
ONNX Runtime IO Binding support for all supported Transformers models (no models left behind).
SD3 and Flux model support added to ORTDiffusionPipeline, enabling the latest diffusion models.
Compatibility with Transformers v4.47 and v4.48.
ONNX export support extended to more architectures, including Decision Transformer, ModernBERT, Megatron-BERT, DinoV2, OLMo, and others (see What's Changed below).
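
As a rough illustration of the IO Binding item above, the sketch below exports a Transformers checkpoint to ONNX and loads it with ONNX Runtime. The helper names `ort_load_kwargs` and `load_ort_classifier` are hypothetical and not part of Optimum. The choice to enable IO Binding only on the CUDA provider is an assumption made for this example, not something the release notes prescribe.

```python
def ort_load_kwargs(use_gpu: bool) -> dict:
    """Build keyword arguments for ORTModel.from_pretrained (illustrative helper)."""
    return {
        # Convert the PyTorch checkpoint to ONNX on the fly.
        "export": True,
        # Execution provider that ONNX Runtime should use.
        "provider": "CUDAExecutionProvider" if use_gpu else "CPUExecutionProvider",
        # Assumption for this sketch: only turn IO Binding on for GPU inference.
        "use_io_binding": use_gpu,
    }


def load_ort_classifier(model_id: str, use_gpu: bool = False):
    """Load a sequence-classification model through ONNX Runtime."""
    # Imported lazily so this module can be imported without optimum installed.
    from optimum.onnxruntime import ORTModelForSequenceClassification

    return ORTModelForSequenceClassification.from_pretrained(
        model_id, **ort_load_kwargs(use_gpu)
    )
```

Calling `load_ort_classifier("<model id>", use_gpu=True)` requires `optimum[onnxruntime-gpu]` and a CUDA-capable device. On CPU the same call works with `use_gpu=False`.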

🔧 Key Fixes & Optimizations

Dropped support for Python 3.8.
Bug fixes in ModelPatcher, SDXL refiner export, and device checks for improved reliability.
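
Because Python 3.8 is no longer supported, downstream projects may want to fail early on older interpreters. The following is a minimal sketch of such a guard. It assumes the new minimum is Python 3.9, which the notes above imply but do not state explicitly. `MIN_PYTHON` and `python_supported` are hypothetical names.

```python
import sys

# Assumption: with 3.8 dropped, the lowest supported version is 3.9.
MIN_PYTHON = (3, 9)


def python_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_supported():
    # Warn rather than raise, so the check itself stays harmless in this sketch.
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is older than the assumed minimum 3.9")
```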

👥 New Contributors

A huge thank you to our first-time contributors. Your contributions make Optimum better! 🎉

For a detailed list of all changes, see What's Changed below.

🚀 Happy optimizing!

What's Changed

Onnx granite by @gabe-l-hart in #2043
Drop python 3.8 by @echarlaix in #2086
Update Dockerfile base image by @echarlaix in #2089
add transformers 4.36 tests by @echarlaix in #2085
[fix] Allow ORTQuantizer over models with subfolder ONNX files by @tomaarsen in #2094
SD3 and Flux support by @IlyasMoutawwakil in #2073
Remove datasets as required dependency by @echarlaix in #2087
Add ONNX Support for Decision Transformer Model by @ra9hur in #2038
Generate guidance for flux by @IlyasMoutawwakil in #2104
Unbundle inputs generated by DummyTimestepInputGenerator by @JingyaHuang in #2107
Pass the revision to SentenceTransformer models by @bndos in #2105
Rembert onnx support by @mlynatom in #2108
fix bug ModelPatcher returns empty outputs by @LoSealL in #2109
Fix workflow to mark issues as stale by @echarlaix in #2110
Remove doc-build by @echarlaix in #2111
Downgrade stale bot to v8 and fix permissions by @echarlaix in #2112
Update documentation color from google tpu section by @echarlaix in #2113
Fix workflow to mark PRs as stale by @echarlaix in #2116
Enable transformers v4.47 support by @echarlaix in #2119
Add ONNX export support for MGP-STR by @xenova in #2099
Add ONNX export support for OLMo and OLMo2 by @xenova in #2121
Pass on model_kwargs when exporting a SentenceTransformers model by @sjrl in #2126
Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by @xenova in #2001
move check_dummy_inputs_allowed to common export utils by @eaidova in #2114
Remove CI macos runners by @echarlaix in #2129
Enable GPTQModel by @jiqing-feng in #2064
Skip private model loading for external contributors by @echarlaix in #2130
fix sdxl refiner export by @eaidova in #2133
Export to ExecuTorch: Initial Integration by @guangy10 in #2090
Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by @LRL-ModelCloud in #2146
Update docker files by @echarlaix in #2102
Limit diffusers version by @IlyasMoutawwakil in #2150
Add ONNX export support for ModernBERT by @xenova in #2131
Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by @LRL-ModelCloud in #2138
fix device check by @jiqing-feng in #2136
Replace check_if_xxx_greater with is_xxx_version by @echarlaix in #2152
Add tf available and version by @echarlaix in #2154
Add ONNX export support for PatchTST by @xenova in #2101
fix infer task from model_name if model from sentence transformer by @eaidova in #2151
Unpin diffusers and pass onnx exporters tests by @IlyasMoutawwakil in #2153
Uncomment modernbert config by @IlyasMoutawwakil in #2155
Skip optimum-benchmark when loading namespace modules by @IlyasMoutawwakil in #2159
Fix PR doc upload by @regisss in #2161
Move executorch to optimum-executorch by @echarlaix in #2165
Adding Onnx Support For Megatron-Bert by @pragyandev in #2169
Transformers 4.48 by @IlyasMoutawwakil in #2158
Update ort CIs (slow, gpu, train) by @IlyasMoutawwakil in #2024
