
Intel® Extension for PyTorch* v2.0.0+cpu Release Notes

Released by @jingxu10 on 22 Mar 2023 (commit 4a00387)

We are pleased to announce the release of Intel® Extension for PyTorch* 2.0.0+cpu, which accompanies PyTorch 2.0. This release mainly brings in our latest optimizations for NLP and support for PyTorch 2.0's hero API, torch.compile, as one of its backends, together with a set of bug fixes and small optimizations. We want to sincerely thank our dedicated community for your contributions. As always, we encourage you to try this release and give us feedback so that we can further improve the product.

Highlights

  • Fast BERT optimization (Experimental): Intel introduced a new technique to speed up BERT workloads. Intel® Extension for PyTorch* integrates this implementation, which benefits the BERT model, especially in training. A new API, ipex.fast_bert, is provided to try this optimization; a usage sketch follows below. More detailed information can be found at Fast Bert Feature.
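
    A minimal sketch of the new API, assuming a Hugging Face BERT model; the model source and the dtype argument are illustrative assumptions, not part of this release note:

import torch
import intel_extension_for_pytorch as ipex
from transformers import BertModel  # assumption: any Hugging Face BERT model

model = BertModel.from_pretrained("bert-base-uncased")
model.eval()
# Apply the experimental Fast BERT optimization (the dtype shown is an assumption)
model = ipex.fast_bert(model, dtype=torch.bfloat16)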

  • MHA optimization with Flash Attention: Intel optimized the MHA module with the Flash Attention technique, inspired by the Stanford paper. This reduces memory consumption for LLMs and also improves inference performance for models such as BERT, Stable Diffusion, etc. A sketch of the attention computation this optimization targets follows below.
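
    For context, a minimal sketch of standard scaled dot-product attention, the computation Flash Attention accelerates; the naive form below materializes the full seq_len x seq_len score matrix, which is the memory cost Flash Attention avoids by computing the result block by block (shapes are illustrative):

import math
import torch

# (batch, heads, seq_len, head_dim); illustrative shapes only
q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# Naive attention: the scores tensor is seq_len x seq_len per head,
# which Flash Attention never materializes in full
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
out = torch.softmax(scores, dim=-1) @ v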

  • Work with torch.compile as a backend (Experimental): PyTorch 2.0 introduces a new feature, torch.compile, to speed up PyTorch execution. We've enabled Intel® Extension for PyTorch* as a backend of torch.compile, which leverages this new PyTorch API's graph-capture capability and provides additional optimizations based on the captured graphs.
    Using this new feature is straightforward:

import torch
import intel_extension_for_pytorch as ipex
...  # define or load your model here
model = ipex.optimize(model)                  # apply Intel® Extension for PyTorch* optimizations
model = torch.compile(model, backend='ipex')  # compile with the 'ipex' backend
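
    Once compiled, the model is invoked as usual; a minimal inference sketch, where the input shape is an illustrative assumption:

with torch.no_grad():
    output = model(torch.randn(1, 3, 224, 224))  # illustrative input tensor
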
  • Bug fixes and other optimizations

    • Added support for RMSNorm, which is widely used in Hugging Face's T5 model #1341
    • Optimized InstanceNorm #1330
    • Fixed LSTM quantization #1414 #1473
    • Fixed a correctness issue when unpacking non-contiguous Linear weights #1419
    • Updated oneDNN #1488

Known Issues

Please check the Known Issues webpage.