v0.0.22: Mixtral support, pipeline for sentence transformers, compatibility with Compel

@JingyaHuang released this 07 May 16:51
· 88 commits to main since this release

What's Changed

Training

Inference

TGI

  • Set up TGI environment values with the ones used to build the model by @oOraph in #529
  • TGI benchmark with llmperf by @dacorvo in #564
  • Improve tgi env wrapper for neuron by @oOraph in #589

Caveat

Models traced with inline_weights_to_neff=False currently show higher than expected latency during inference, because the weights are not automatically moved to Neuron devices. The issue will be fixed in #584. Please avoid setting inline_weights_to_neff=False in this release.
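
One way to follow this advice in a script is to sanitize the export arguments before compiling a model. The sketch below is only an illustration: the helper `enforce_inline_weights` is hypothetical and not part of optimum-neuron, and the only name taken from this release note is the `inline_weights_to_neff` argument. It makes no assumption about the library's default value and sets the flag explicitly.

```python
import warnings


def enforce_inline_weights(export_kwargs: dict) -> dict:
    """Return a copy of export_kwargs with inline_weights_to_neff set to True.

    Hypothetical helper (not part of optimum-neuron): it guards against the
    v0.0.22 caveat by overriding an explicit inline_weights_to_neff=False.
    """
    checked = dict(export_kwargs)  # copy so the caller's dict is not mutated
    if checked.get("inline_weights_to_neff") is False:
        warnings.warn(
            "inline_weights_to_neff=False causes higher inference latency in "
            "v0.0.22; overriding it to True.",
            stacklevel=2,
        )
    # Set the flag explicitly rather than relying on the library default.
    checked["inline_weights_to_neff"] = True
    return checked


if __name__ == "__main__":
    # The other keys are placeholder export arguments for illustration only.
    user_kwargs = {"inline_weights_to_neff": False, "batch_size": 1}
    print(enforce_inline_weights(user_kwargs))
```

The returned dictionary can then be passed on to whichever export or tracing call the script already uses. Once the fix from #584 is released, the guard can be dropped.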

Other changes

New Contributors

Full Changelog: v0.0.21...v0.0.22