Support longer sequence lengths in `ssm_prefix_scan` #9776

esmalTT · 2024-06-27T12:23:45Z

This PR adds support for L > 32 in `ssm_eltwise_mul`. Logically, the op can handle any value of L, but values of L > 128 will run out of L1 memory in `bfloat8` format.

This change also addresses issue #9831.

TT-BrianLiu

Minor comments

TT-BrianLiu · 2024-07-02T15:47:14Z

tt_eager/tt_dnn/op_library/transformer_tms/kernels/dataflow/reader_ssm_prefix_scan.cpp

@@ -4,10 +4,28 @@

 #include "dataflow_api.h"

+void fill_zeros(uint32_t cb_id) {
+    constexpr uint32_t num_zeros_reads = 2048 / MEM_ZEROS_SIZE;


Should probably make 2048 a constexpr variable and comment that this is for bfloat16 only. Zeros CB for other data types will be different.
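A minimal sketch of what this suggestion could look like, assuming the literal 2048 is the byte size of one 32x32 `bfloat16` tile (32 * 32 * 2 bytes). The constant names and the `MEM_ZEROS_SIZE` stand-in value below are hypothetical and are not taken from the PR:

```cpp
#include <cstdint>

// Assumption: stand-in for the platform-provided MEM_ZEROS_SIZE so that this
// sketch compiles on its own. The real value comes from the device headers.
constexpr uint32_t kMemZerosSizeSketch = 512;

// Hypothetical named constants replacing the bare 2048 literal.
constexpr uint32_t kTileHeight = 32;
constexpr uint32_t kTileWidth = 32;
constexpr uint32_t kBfloat16Bytes = 2;

// bfloat16 only: a zeros CB for any other data format needs a different size.
constexpr uint32_t kBfloat16TileSizeBytes = kTileHeight * kTileWidth * kBfloat16Bytes;

static_assert(kBfloat16TileSizeBytes == 2048, "one bfloat16 tile is expected to be 2048 bytes");
static_assert(kBfloat16TileSizeBytes % kMemZerosSizeSketch == 0,
              "tile size must be a whole number of zero-region reads");

// Number of reads from the zeros region needed to fill one bfloat16 tile.
constexpr uint32_t num_zeros_reads() {
    return kBfloat16TileSizeBytes / kMemZerosSizeSketch;
}
```

Deriving the size from named tile dimensions keeps the `bfloat16` assumption visible at the point of use, and the `static_assert` lines would flag a mismatch at compile time if the zero-region size changed.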

TT-BrianLiu · 2024-07-02T15:48:23Z

.../tt_dnn/op_library/transformer_tms/multi_core_ssm_prefix_scan/multi_core_ssm_prefix_scan.cpp

@@ -88,7 +88,12 @@ operation::ProgramWithCallbacks multi_core_ssm_prefix_scan(
    const uint32_t cb_zeros_id = tt::CB::c_intermed6;
    const auto cb_zeros = create_circular_buffer(cb_zeros_id, 1, intermediary_tile_size, intermediary_format);

-    std::vector<uint32_t> reader_compile_time_args = {cb_a_in_id, cb_bx_in_id};
+    const uint32_t cb_h_acc_id = tt::CB::c_intermed7;
+    const uint32_t num_chunks_per_row = ceil(float(total_tiles_per_row) / 32.0f);


`div_up` might be cleaner here and would avoid these casts. Also, use a named constant for the 32.
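As a rough illustration of the comment above, integer ceiling division gives the same result as `ceil(float(x) / 32.0f)` without any floating-point casts. The helper `div_up_sketch` and the constant name `kTilesPerChunk` are hypothetical stand-ins written for this sketch; the codebase's own `div_up` may have a different signature:

```cpp
#include <cstdint>

// Hypothetical stand-in for a div_up helper: integer ceiling division.
// Assumes b > 0 and that a + b - 1 does not overflow uint32_t.
constexpr uint32_t div_up_sketch(uint32_t a, uint32_t b) {
    return (a + b - 1) / b;
}

// Hypothetical name for the literal 32 used in the diff.
constexpr uint32_t kTilesPerChunk = 32;

// Number of chunks needed to cover one row of tiles.
constexpr uint32_t num_chunks_per_row(uint32_t total_tiles_per_row) {
    return div_up_sketch(total_tiles_per_row, kTilesPerChunk);
}
```

For example, a row of 33 tiles needs two chunks under this scheme, and a row of exactly 32 tiles needs one.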

esmalTT self-assigned this Jun 27, 2024

esmalTT changed the title ~~Support longer sequence legnths in ssm_prefix_scan~~ Support longer sequence lengths in ssm_prefix_scan Jun 27, 2024

esmalTT added the mamba label Jun 27, 2024

esmalTT force-pushed the esmal/prefix-scan-support-larger-seq branch 4 times, most recently from 6d4551f to 1f6f645 Compare June 30, 2024 12:24

esmalTT linked an issue Jun 30, 2024 that may be closed by this pull request

pytest tests/tt_eager/python_api_testing/unit_testing/misc/test_ssm_prefix_scan.py::test_ssm_reduce[32-32-32-1-dtype0] fails without L1 clear #9831

Closed

esmalTT requested review from kpaigwar and TT-BrianLiu June 30, 2024 12:54

esmalTT marked this pull request as ready for review June 30, 2024 12:54

esmalTT temporarily deployed to dev June 30, 2024 12:56 — with GitHub Actions Inactive

esmalTT temporarily deployed to dev June 30, 2024 13:03 — with GitHub Actions Inactive

esmalTT force-pushed the esmal/prefix-scan-support-larger-seq branch 3 times, most recently from fc80603 to d62da2c Compare June 30, 2024 14:32

kpaigwar approved these changes Jul 1, 2024

TT-BrianLiu approved these changes Jul 2, 2024

#0: Support longer sequence legnths in ssm_prefix_scan

d5da023

This change adds support for L > 32 in `ssm_eltwise_mul`. Logically this op can now handle any value of L but values of L > 128 will run out of L1 in `bfloat8` format. This change also fixes #9831.

esmalTT force-pushed the esmal/prefix-scan-support-larger-seq branch from d62da2c to d5da023 Compare July 2, 2024 17:30

esmalTT merged commit 4a8f2da into main Jul 2, 2024
5 checks passed

esmalTT deleted the esmal/prefix-scan-support-larger-seq branch July 2, 2024 17:33
