v0.54.0-rc23

Pre-release
github-actions released this 14 Jan 02:06 · 331 commits to main since this release

Note

If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not the versions on the main branch, as the latest main may differ from this release.

The changelog follows, showing the changes since the last release.

This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12759327887

📦 Uncategorized

  • Isolate Tracy
  • #5605: Only force-stall ethernet programs on earlier ethernet programs
  • #14976/#15039: Add support for ceil_mode=True (see the pooling sketch after this list)
  • Add missing cache invalidates and a loads-before-stores NOC optimization for BH
  • Initial CCL Rewrite Push (Unblocks Parallelization of Efforts and Some TG Llama integration)
  • New FD Init Flow
  • Add support for output sharded embeddings
  • Revert "#5605: Only force-stall ethernet programs on earlier ethernet programs"
  • #0: Enforce tile layout when using bf4/bf8 data types
  • MeshDevice: Support Quanta Galaxy system file
  • Move Device members from public to private
  • Add unary sharded sweeps
  • #0: Added core_grid offset for sharded layernorm
  • Fix absolute-path bug in sweep tests code
  • #0: Publish TT-Distributed doc under tech_reports
  • #15061: Extended {to,from}_vector to support tilized layout, bf4/8 formats
  • #16265: Remove creation op
  • Fix unsigned arithmetic bugs in reshape ops
  • Fix compile issue for earlier c++ versions
  • #0: Typo fix in TT distributed tech report
  • [Llama3-text vLLM integration] Modify Llama3 text model (new and old codebase) forward apis for vLLM compatibility
  • LLM tech report sections 3.1, 3.4, 3.5
  • LLM Tech report section 4.4
  • Move some Device methods to private section
  • #0: [skip_ci] Update Distributed Tech Report with Discord Server link
  • #15857: Binary Forge Sweep Tests Set1
  • #0: Fix get_dispatch_core_config in conftest.py so it does not modify device_params and affect subsequent tests
  • #0: Remove hardcoded grid width in all_gather and skip test_sharded_matmul test when the device grid size is too small
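
For the ceil_mode=True item above (#14976/#15039), these notes do not show the ttnn-side API, so the following is a minimal PyTorch sketch of the ceil_mode pooling semantics that the change adds support for; treat the framework choice and the shapes as illustrative assumptions, not the tt-metal implementation.

```python
# Minimal sketch of ceil_mode pooling semantics, using PyTorch as the
# reference behavior (assumption: the ttnn change mirrors these semantics;
# the actual ttnn API is not shown in these release notes).
import torch

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)

# floor mode: partial windows at the right/bottom edges are dropped
floor_pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=False)
# ceil mode: partial windows are kept, adding an extra output row/column
ceil_pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)

print(floor_pool(x).shape)  # torch.Size([1, 1, 2, 2])
print(ceil_pool(x).shape)   # torch.Size([1, 1, 3, 3])
```

In other words, ceil_mode=True computes the output extent with ceil((H - k) / s) + 1 instead of floor, so a 5x5 input with a 2x2, stride-2 pool gains a third output row and column.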