Releases: tenstorrent/tt-buda
v0.19.3
Summary
Changelog for Release v0.19.3 (notable changes since v0.18.2).
Generality
- Implemented support for 10 additional model variants, including 3 new architectures:
  - Phi2
  - Qwen1.5-0.5B
  - YOLOX
Performance
Grayskull (e75, e150)
- Performance improvements on e75 include HRNet (41%)
- Notable performance regressions on e150 include OpenPose (-16%)
Wormhole (n150)
- Performance restored on Whisper, Stable Diffusion
Wormhole (n300)
- Generative AI models not supported on n300 (dual-chip)
- Average performance improvement of 8% on n300 (dual-chip)
- Performance improvements on n300 (single-chip) include Falcon-7B (14%)
- Notable performance regressions on n300 (single-chip) include FLAN-T5 (-20%), U-Net (-14%)
- Accuracy regressions on n300 (dual-chip) include YOLOv5, OpenPose, U-Net
Multicard Systems (TT-LoudBox / TT-QuietBox)
- Enabled support for CNN models on 4-chip MMIO and 8-chip configurations
- Generative AI models not supported
General (Improvements/Features/Etc.)
- Introduced functionality for Wormhole n300 (dual-chip), including the TT-LoudBox and TT-QuietBox configurations
- Across-the-board bug fixes and enhancements to compiler stability
- Enhanced customer-facing documentation for improved clarity and accessibility
Known issues
- N/A
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.29 | fw_v80.10.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.29 | fw_v80.10.0.0 |
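The matrix above can be turned into a quick pre-install check. What follows is a minimal sketch, assuming only the two rows listed for this release; the names `SUPPORTED` and `is_supported` are hypothetical and not part of PyBuda. It compares an Ubuntu release and a Python version against the table and does not probe the driver or firmware.

```python
# Rows copied from the v0.19.3 compatibility matrix above.
# SUPPORTED and is_supported are illustrative names, not a PyBuda API.
SUPPORTED = {
    "22.04": "3.10.12",
    "20.04": "3.8.10",
}


def is_supported(ubuntu_release: str, python_version: str) -> bool:
    """Return True if the (Ubuntu release, Python version) pair is a row in the matrix."""
    return SUPPORTED.get(ubuntu_release) == python_version


if __name__ == "__main__":
    import platform

    # platform.python_version() returns a string such as "3.10.12".
    # The Ubuntu release is passed by hand here; reading it from
    # /etc/os-release is left out to keep the sketch short.
    print(is_supported("22.04", platform.python_version()))
```

The check is an exact match on purpose: the notes list specific tested versions and say nothing about neighbouring patch releases.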
What's Changed
- Uplift PyBuda changes (week29) by @vmilosevic in #39
Full Changelog: v0.18.2...v0.19.3
v0.18.2
Summary
Changelog for Release v0.18.2 (notable changes since v0.12.3).
Generality
- Implemented support for 5 additional model variants, including 2 new architectures:
  - SSD300 ResNet50
  - YOLOv6, with the following variants:
    - yolov6n
    - yolov6s
    - yolov6m
    - yolov6l
Performance
Grayskull (e75, e150)
- Enhanced performance observed in DeiT (11%)
- Notable performance regressions on e75 include HRNet (-27%), MobileNetV3 (-18%), ResNet (-11%)
- Regression in Whisper (to be patched in a future release)
Wormhole (n150)
- Achieved broad performance improvements averaging 6%
- Significantly enhanced performance observed in MobileNetV2 (46%), YOLOv5 (20%), VoVNet (19%)
- Notable performance regressions include Whisper (-78%), Stable Diffusion (-50%) (to be patched in a future release)
Wormhole (n300)
- Introduced performance benchmarks for the n300 (single-chip)
- n300 dual-chip and multi-card data parallel performance to be included in next release
General (Improvements/Features/Etc.)
- Introduced functionality for Wormhole n300, including the LoudBox and QuietBox configurations
- Limited to single-chip and 4x chip functionality, with dual-chip and 8x chip ethernet support planned for next release
- Across-the-board bug fixes and enhancements to compiler stability
- Enhanced customer-facing documentation for improved clarity and accessibility
Known issues
- Models running on 1x1 grid size might face issues on Intel CPUs
- Regression on Whisper model utilizing enc_dec implementation
- Ethernet issues leading to hangs on n300 dual-chip data parallel mode
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.28 | fw_v80.10.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.28 | fw_v80.10.0.0 |
What's Changed
- Fix typos. by @jasondavies in #25
- Fix user_guide.rst doc python example issues by @jaebaek in #27
- Mistype fix by @gritukan in #30
- Uplift 2024-06-27 by @vmilosevic in #34
New Contributors
- @jasondavies made their first contribution in #25
- @jaebaek made their first contribution in #27
- @gritukan made their first contribution in #30
Full Changelog: v0.12.3...v0.18.2
v0.17.0-alpha
Summary
Changelog for Alpha Release v0.17.0-alpha (notable changes since v0.15.0-alpha).
Changelog
Generality
- Implemented support for 23 additional model variants, including 1 new architectures for 2 frameworks:
  - ONNX
    - DLA:
      - dla34
      - dla46_c
      - dla46x_c
      - dla60x_c
      - dla60
      - dla60x
      - dla102
      - dla102x
      - dla102x2
      - dla169
  - PyTorch
    - SegFormer (Image Classification):
      - nvidia/mit-b0
      - nvidia/mit-b1
      - nvidia/mit-b2
      - nvidia/mit-b3
      - nvidia/mit-b4
      - nvidia/mit-b5
    - SegFormer (Semantic Segmentation):
      - nvidia/segformer-b0-finetuned-ade-512-512
      - nvidia/segformer-b1-finetuned-ade-512-512
      - nvidia/segformer-b2-finetuned-ade-512-512
      - nvidia/segformer-b3-finetuned-ade-512-512
      - nvidia/segformer-b4-finetuned-ade-512-512
    - PerceiverIO:
      - deepmind/vision-perceiver-fourier
      - deepmind/vision-perceiver-learned
General (Improvements/Features/Etc.)
- Across-the-board bug fixes and enhancements to compiler stability
Notes
Please be aware that this is an Alpha release, and while we’ve worked diligently, stability across all models and features cannot be guaranteed. Thank you for your understanding.
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Full Changelog: v0.15.0-alpha...v0.17.0-alpha
v0.15.0-alpha
Summary
Changelog for Alpha Release v0.15.0-alpha (notable changes since v0.12.3).
Changelog
Generality
- Implemented support for 10 additional model variants, including 1 new architecture for PyTorch framework:
  - DLA: dla34, dla46_c, dla46x_c, dla60x_c, dla60, dla60x, dla102, dla102x, dla102x2, dla169
General (Improvements/Features/Etc.)
- Across-the-board bug fixes and enhancements to compiler stability
- Upgrade of diffusers library from v0.14.0 to v0.27.2
Notes
Please be aware that this is an Alpha release, and while we've worked diligently, stability across all models and features cannot be guaranteed. Thank you for your understanding.
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Full Changelog: v0.12.3...v0.15.0-alpha
v0.12.3
Summary
Changelog for Release v0.12.3 (notable changes since v0.10.5).
Generality
- Implemented support for 5 additional model variants, including 2 new architectures:
  - Perceiver IO
  - HarDNet, with the following variants:
    - HarDNet 39 Depthwise Separable
    - HarDNet 68 Depthwise Separable
    - HarDNet 68
    - HarDNet 85
Performance
Grayskull (e75, e150)
- Achieved broad performance improvements averaging 30%
- Significantly enhanced performance observed in BERT (102%), ViT (77%), HRNet (65%)
- Notable performance regressions include FLAN-T5 (-23%), YOLOv5 (-12%), Whisper (-2%)
Wormhole (n150)
- Achieved broad performance improvements averaging 52%
- Significantly enhanced performance observed in ResNet-50 (126%), VoVNet (111%), Inception-v4 (111%)
- No performance degradation observed on any model
General (Improvements/Features/Etc.)
- Across-the-board bug fixes and enhancements to compiler stability
- Enhanced customer-facing documentation for improved clarity and accessibility
- Support for Ubuntu 22.04 as default OS (still supporting Ubuntu 20.04 as secondary)
- Fixed issues where multiple users are running different workloads on the same system
- DeBuda is now a standalone wheel, shipped as part of other release component wheels
Known issues
- Models running on 1x1 grid size might face issues on Intel CPUs
- Running RouteUI using the pybuda wheel might error out with missing library issues. If that happens, this workaround should solve the problem:
export LD_LIBRARY_PATH=<python_env>/lib/python3.8/site-packages/budabackend/build/lib/
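The path in the workaround hard-codes `python3.8`. A minimal sketch of deriving the same directory for whichever interpreter is running, assuming the wheel keeps the `budabackend/build/lib` layout under site-packages on other Python versions (these notes do not confirm that); `budabackend_lib_dir` is a hypothetical helper, not part of PyBuda.

```python
import sysconfig
from pathlib import PurePosixPath
from typing import Optional


def budabackend_lib_dir(site_packages: Optional[str] = None) -> str:
    """Build the directory used in the LD_LIBRARY_PATH workaround.

    Hypothetical helper: the budabackend/build/lib suffix comes from the
    release note; applying it to other Python versions is an assumption.
    """
    if site_packages is None:
        # "platlib" is the site-packages directory where wheels with
        # compiled libraries are installed for the running interpreter.
        site_packages = sysconfig.get_paths()["platlib"]
    return str(PurePosixPath(site_packages) / "budabackend" / "build" / "lib")


if __name__ == "__main__":
    # Prints a line that can be pasted into the shell.
    print(f"export LD_LIBRARY_PATH={budabackend_lib_dir()}")
```

Run it with the Python interpreter of the environment where the pybuda wheel is installed, so that the reported site-packages directory is the right one.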
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
What's Changed
- Remove bad merge from test by @vmilosevic in #24
Full Changelog: v0.11.0.gs.240415-alpha...v0.12.3
v0.11.0.gs.240415-alpha
Summary
Changelog for Alpha Release v0.11.0.gs.240415-alpha (notable changes since v0.10.9.gs.240401-alpha).
Changelog
Features/Improvements
- Enhanced customer-facing documentation for improved clarity and accessibility
Generality
- Implemented support for a new architecture, HarDNet, with the addition of 4 of its variants:
  - HarDNet 39 Depthwise Separable
  - HarDNet 68 Depthwise Separable
  - HarDNet 68
  - HarDNet 85
Notes
Please be aware that this is an Alpha release, and while we've worked diligently, stability across all models and features cannot be guaranteed. Thank you for your understanding.
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
What's Changed
- Add build dependencies to github actions by @vmilosevic in #21
New Contributors
- @vmilosevic made their first contribution in #21
Full Changelog: v0.10.9.gs.240401-alpha...v0.11.0.gs.240415-alpha
v0.10.9.gs.240401-alpha
Summary
Changelog for Release v0.10.9.wh_b0.240401-alpha (notable changes since v0.10.5).
Changelog
Features/Improvements
- Across-the-board bug fixes and enhancements to compiler stability
- Support for Ubuntu 22.04 (default is still Ubuntu 20.04)
Generality
- Implemented new architecture for PyTorch framework: Perceiver IO.
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 22.04 | 3.10.12 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
Full Changelog: v0.10.5.gs.240315...v0.10.9.gs.240401-alpha
v0.10.5.gs.240315
Summary
Changelog for Release v0.10.5 (notable changes since v0.9.80).
To pull this release: docker pull ghcr.io/tenstorrent/tt-buda/ubuntu-20-04-amd64/gs:v0.10.5.gs.240315
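Release names in this series pack several fields into one string. A minimal sketch of splitting them, assuming the layout `v<major>.<minor>.<patch>[.<arch>.<6 digits>][-alpha]` inferred from the names on this page; it is not a documented scheme. Reading `gs` as Grayskull and the six digits as a date stamp is likewise an assumption, and `ReleaseTag` and `parse_release_tag` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Layout inferred from the release names on this page; not an official scheme.
_TAG_RE = re.compile(
    r"^v(\d+)\.(\d+)\.(\d+)"         # numeric version
    r"(?:\.([a-z0-9_]+)\.(\d{6}))?"  # optional architecture code and 6-digit stamp
    r"(-alpha)?$"                    # optional alpha suffix
)


@dataclass(frozen=True)
class ReleaseTag:
    version: tuple
    arch: Optional[str]
    stamp: Optional[str]
    alpha: bool


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a release name into its fields; raise ValueError if it does not fit."""
    m = _TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    major, minor, patch, arch, stamp, alpha = m.groups()
    return ReleaseTag((int(major), int(minor), int(patch)), arch, stamp, alpha is not None)
```

Comparing the `version` tuples orders v0.9.80 before v0.10.5, which a plain string comparison of the names would get wrong.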
Changelog
Features/Improvements
- Across-the-board bug fixes and enhancements to compiler stability
- Added support for ONNX quantized models
Models
- Implemented support for 9 additional model variants, including 2 new architectures:
  - Wide ResNet
  - Xception
Performance
- Achieved broad performance improvements averaging 18.5%
- Significantly enhanced performance observed in UNet (93%), Inception v4 (57%), and Yolo v5 (44%)
- Notable performance regressions include MobileNet v2 (-23%), Flan T5 (-6%), BERT (-5%)
Libs
- Upgraded our TVM fork to version 01.14
- Introduced standalone build for TorchVision v0.16.0 with cxx11.abi support (refer to documentation for details)
General
- Enhanced customer-facing documentation for improved clarity and accessibility
- Support of Ubuntu 22.04 as a secondary OS option
Compatibility matrix
OS | Python | PyTorch | Driver | Firmware |
---|---|---|---|---|
Ubuntu 20.04 | 3.8.10 | 2.1.0+cpu.cxx11.abi | ttkmd_1.27.1 | fw_v80.8.0.0 |
What's Changed
- [doc] Fix BudaLin -> BudaLinear in pybuda example by @artem-erofeev in #9
- [doc] Insert newline in compile PyBUDA from source block by @zhoujingya in #7
New Contributors
- @artem-erofeev made their first contribution in #9
- @zhoujingya made their first contribution in #7
Full Changelog: v0.9.80.gs.231222...v0.10.5.gs.240315
v0.9.80.gs.231222
To pull this release: docker pull ghcr.io/tenstorrent/tt-buda/gs:v0.9.80.gs.231222
Notable changes since v0.9.79
- Improved customer facing documentation
- Added BBE fix for t5 accuracy
Notable changes since v0.9.78
- Bugfix for updated transformers package
Notable changes since v0.9.76
- New models supported: ghostnet, vilt, wideresnet and xception
- Across-the-board performance improvements (60% BERT, 50% MobileNetV2, 30% ViT and DeiT, 70% YOLOv5, 90% HRNet) and a regression on Inception-v4 (30%)
Tested on Ubuntu 20.04 with python 3.8.10 and PyTorch 2.1.0
What's Changed
- Update Communication section README.md by @Shubhamsaboo in #2
- 2312 TT-BUDA release alignment by @milank94 in #3
New Contributors
- @Shubhamsaboo made their first contribution in #2
Full Changelog: v0.9.76.gs.231201...v0.9.80.gs.231222
v0.9.76.gs.231201
To pull this release: docker pull ghcr.io/tenstorrent/tt-buda/gs:v0.9.76.gs.231201
Notable changes since v1.0.75
- Include correct flatbuffer package so install doesn't show errors
- Add base image to release tests
Notable changes since v1.0.74
- Include correct docker file
Notable changes since v1.0.73
- Docker image now includes libgl
Notable changes since v1.0.71
- Buffering ops naming fix
Notable changes since v1.0.69
- Performance improvements: Bert 70%, OpenPose 17%, MobileNetV1 and V3 25%, regression on T5, ViT and DeiT: 15%
Tested on Ubuntu 20.04 with python 3.8.10 and PyTorch 1.10.0
Full Changelog: https://github.com/tenstorrent/tt-buda/commits/v0.9.76.gs.231201