v0.51.0-rc15
Pre-release
github-actions released this on 29 Jul 02:09
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not the versions on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/10136578200
📦 Uncategorized
- #10082: Migrate unary bw ops to TTNN and remove std::function
- PR: #10239
- #9715: Use build artifacts for profiler tests
- PR: #10218
- #9021: adding resnet api into ci.
- PR: #10008
- Update README.md
- PR: #10247
- Move pad_on_host/unpad_on_host to host function in TTNN
- PR: #10178
- #9874: Move polygamma_bw to TTNN
- PR: #10146
- #5337: increase t3k frequent test timeout
- PR: #10202
- Update falcon40b readme
- PR: #10261
- #0: add layernorm rmsnorm pybind, move to ttnn
- PR: #10012
- #0: Re-enable read cache in llama_model_optimized.
- PR: #10208
- Update Mistral/Mixtral README files
- PR: #10259
- #0: Update LLama2/3 readme with demo details
- PR: #10263
- #0: resnet perf fix
- PR: #10273
- Update Mamba README.md
- PR: #10262
- OPT convs in RN50 to get better device perf
- PR: #10279
- Increase timeout for N300 WH-only model pipeline
- PR: #10287
- Prefill+Decode Demo Functional Implementation
- PR: #10281
- [Falcon7b] Add wormhole demo perf mode and output verification tests
- PR: #10269
- Update Falcon7/40b READMEs with details on model functionality and perf-mode
- PR: #10290
- bump python 3.8 venv package version
- PR: #10315
- Git bisect workflow on CI runners
- PR: #10316
- #9613: scaffolding for weekly scheduled t3k perplexity tests
- PR: #10142
- fix syntax issue with bisect script
- PR: #10328
- #10231: Clean up t3k runs-on tags to minimum
- PR: #10232
- #9490: Remove tt_eager unary ops and bindings
- PR: #10194
- only build for arch that a dispatched workflow is running for
- PR: #10318
- Allow overloading of job name with user-defined name for new dispatch workflows
- PR: #10331
- #10242: Migrate unary bw ops with a generalized structure to TTNN
- PR: #10243
- #10322: commented out failing t3k tests
- PR: #10327
- #9491: Add structure for ternary ops in ttnn
- PR: #10240
- Move downsample from tt_eager to ttnn
- PR: #9951
- #10250: Migrate unary backward ops with a generalized structure to TTNN
- PR: #10253
- #10280: Mistral README update
- PR: #10309
- #9911: Add structure and migrate 20 composite unary ops
- PR: #9913
- #0: fix rn50 block padding
- PR: #10329
- #10300: get the correct operation id on subsequent run
- PR: #10303
- #0: Move host tensor construction for halo into create_program to only happen on uncached runs
- PR: #10221
- Flash decode v2
- PR: #10313
- #9751: Restructure ttnn transformers to new folder structure
- PR: #10353
- #10181: Disable test_reduce_h due to sporadic failures in slow dispatch
- PR: #10359
- #10181: Disable test_reduce_h
- PR: #10362
- Update README.md
- PR: #10369
- move groupnorm from ttlib to ttnn
- PR: #10363
- Update README.md
- PR: #10370
- Update README.md - missing footnote
- PR: #10372
- #0: Update ttnn resnet 2cq bound due to variability
- PR: #10368
- #7528: add new ethernet microbenchmark, cleanup and re-enable others
- PR: #9966
- #10238: migrate 7 unary ops into ttnn
- PR: #10264
- #10333: Migrate prod_bw to TTNN
- PR: #10345
- #10320: Enable falcon40b tests again
- PR: #10387
- Add fused layernorm to falcon40b
- PR: #9502
- #8342: Add info to matmul that tensors need to be on device
- PR: #10260
- #10254: Enable preserve_fp32_precision flag in moreh_sum op
- PR: #10265
- #10305: Add INSTALLING.md to release assets and create new custom release notes with an installation and pipeline ID
- PR: #10377
- #9901: Refactoring moreh norm
- PR: #10255
- Ngrujic/profiling
- PR: #10268
- #9747: Implement ttnn.tilize(_with_val_padding) Python bindings
- PR: #10289
- Add fixture for checking if in CI env and invoke Falcon7b demo tests with only filename
- PR: #10371
- Add native caching for Mamba convolution/hidden states
- PR: #10398
- Add skip-first option to op perf results script
- PR: #10390
- #10083: added unit tests for JSON serialization
- PR: #10358
- #10323: Reenable Llama perf test in CI
- PR: #10410
- #9747: Delete tilize ops from tt_eager
- PR: #10406
- #8764: More docs changes for WH readiness, Part 5
- PR: #9403
- Move Mamba embeddings onto device
- PR: #10414
- #10257: Add ttnn binding for UnaryWithParam, UnaryOpType
- PR: #10258
- #10224: Update offsets for GO signal commands to use sizeof prefetch/dispatch cmd rather than pcie aligned size
- PR: #10357
- #10166: add device mesh apis to query by row and col
- PR: #10417
- #10380: Migrate set 1,2 complex ops to TTNN
- PR: #10383
- #9527: continue removing bcast
- PR: #10207
- #10334: Migrate 7 Type 2 unary complex bw ops to TTNN
- PR: #10336
- #9806: Migrate Complex binary backward ops to TTNN
- PR: #10003
- Relu max sweep migration - TTLIB to TTNN
- PR: #10338
- Refactor: common RMSNorm for Mixtral and Mistral
- PR: #10354
- #9874: Update clamp_bw to match PyTorch API
- PR: #10393
- [Falcon7b] Add perplexity tests to new pipeline and restructure pytests to invoke with filenames
- PR: #10355
- #10322: Re-enable Mixtral CI tests due to corrupted cache in CI machine
- PR: #10433
- Migration of relu_min from tt_eager to ttnn
- PR: #10431
- #10147: Migrate addcmul_bw to ttnn
- PR: #10436
- #10403: Use is_ci_env fixture instead of env variable
- PR: #10442
- Optimized MLP with W/H fracturing, sharding and ReduceScatter
- PR: #10043
- Minor refactoring
- PR: #10447
- Revert "Migration of relu_min from tt_eager to ttnn"
- PR: #10451
- Add falcon40b demo test with token matching
- PR: #10400
- #10130: Delete scan op in favor of ssm_prefix_scan
- PR: #10418
- Support any sequence length in Mamba prefill
- PR: #10174
- #10200: Update umd for mmio flush array overrun bugfix
- PR: #10466
- #10467: Move tt_eager folder content into ttnn/experimental
- PR: #10424
- Mixtral prefill 128-32k
- PR: #9907
- #8865: Update reference times for dispatch time measuring
- PR: #10430
- Migrate unpad sweep to TTNN
- PR: #10434
- Update demo token matching reference for falcon40b
- PR: #10478
- TTNN fmod sweeps added
- PR: #10339
- #10147: Migrated eltwise_relu_min to ttnn
- PR: #10481
- #0: Upgrade WH and T3000 WH KMD and FW versions to v1.27.1 and v80.10.0.0 respectively
- PR: #7999
- #0: Move concatenate heads into ttnn experimental
- PR: #10409
- #9628: Update Test files with golden function
- PR: #10480
- #9490: Remove eltwise_unary in tt_eager
- PR: #10438
- #10137: Add structure for composite binary ops in ttnn
- PR: #10138
- Fix/re-enable a few watcher tests
- PR: #10491
- #10471: Fixed GCC13 compile time issue
- PR: #10473
- #7887: remove deprecated device_pool
- PR: #10456
- Add llama galaxy mlp to TG frequent tests
- PR: #10274
- #0: Update CODEOWNERS
- PR: #10490
- Move Mamba demo to models/demos/wormhole
- PR: #10461
- #10180: Use last column for FD on BH
- PR: #10465
- #10052: [Blackhole bringup] Add pack untilize
- PR: #10422
- #9486: Merge CCL line_all_gather to TTNN
- PR: #9909
- #9628: Remove std::function for BW Binary ops
- PR: #10492
- #9628: Update Unary bw test files with golden function
- PR: #10489
- #10382: Migrate set 3,4 complex ops to TTNN
- PR: #10423
- Move conv to ttnn
- PR: #10148
- Move pool to TTNN
- PR: #9855
- TTNN scale mask softmax and softmax in-place sweep migration
- PR: #10435
- #9628: Remove std::function for BW Binary ops
- PR: #10502
- Replace all TT Lib Permute Uses with TTNN and remove old bindings
- PR: #10312