v0.54.0-rc23

Pre-release
github-actions released this 14 Jan 02:06 · 331 commits to main since this release

Note

If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not the versions on the main branch, as the latest main may differ from this release.

The changelog follows, showing the changes since the last release.

This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12759327887

📦 Uncategorized

  • Isolate Tracy
  • #5605: Only force-stall ethernet programs on earlier ethernet programs
  • #14976/#15039: Add support for ceil_mode=True (see the pooling sketch after this list)
  • Add missing cache invalidates and a loads-before-stores NOC optimization for BH
  • Initial CCL Rewrite Push (Unblocks Parallelization of Efforts and Some TG Llama integration)
  • New FD Init Flow
  • Add support for output sharded embeddings
  • Revert "#5605: Only force-stall ethernet programs on earlier ethernet programs"
  • #0: Enforce tile layout when using bf4/bf8 data types
  • MeshDevice: Support Quanta Galaxy system file
  • Move Device members from public to private
  • Add unary sharded sweeps
  • #0: Added core_grid offset for sharded layernorm
  • Fix absolute-path bug in sweep tests code
  • #0: Publish TT-Distributed doc under tech_reports
  • #15061: Extended {to,from}_vector to support tilized layout, bf4/8 formats
  • #16265: Remove creation op
  • Fix unsigned arithmetic bugs in reshape ops
  • Fix compile issue for earlier c++ versions
  • #0: Typo fix in TT distributed tech report
  • [Llama3-text vLLM integration] Modify Llama3 text model (new and old codebase) forward apis for vLLM compatibility
  • LLM tech report sections 3.1, 3.4, 3.5
  • LLM Tech report section 4.4
  • Move some Device methods to private section
  • #0: [skip_ci] Update Distributed Tech Report with Discord Server link
  • #15857: Binary Forge Sweep Tests Set1
  • #0: Fix get_dispatch_core_config in conftest.py so it does not modify device_params and affect subsequent tests
  • #0: Remove hardcoded grid width in all_gather and skip test_sharded_matmul test when the device grid size is too small
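
For the ceil_mode=True item above (#14976/#15039), these notes do not show the ttnn-side API, so the following is a minimal PyTorch sketch of the ceil_mode pooling semantics that the change adds support for; treat the framework choice and the shapes as illustrative assumptions, not the tt-metal implementation.

```python
# Minimal sketch of ceil_mode pooling semantics, using PyTorch as the
# reference behavior (assumption: the ttnn change mirrors these semantics;
# the actual ttnn API is not shown in these release notes).
import torch

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)

# floor mode: partial windows at the right/bottom edges are dropped
floor_pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=False)
# ceil mode: partial windows are kept, adding an extra output row/column
ceil_pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)

print(floor_pool(x).shape)  # torch.Size([1, 1, 2, 2])
print(ceil_pool(x).shape)   # torch.Size([1, 1, 3, 3])
```

In other words, ceil_mode=True computes the output extent with ceil((H - k) / s) + 1 instead of floor, so a 5x5 input with a 2x2, stride-2 pool gains a third output row and column.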