test(stream): Test no_shuffle_backfill vs arrangement_backfill performance #14593

Merged
merged 1 commit into from
Jan 17, 2024

Conversation

kwannoel
Contributor

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

As per title. Just compare their e2e test runtimes.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@github-actions github-actions bot added the component/test Test related issue. label Jan 16, 2024
@kwannoel kwannoel force-pushed the kwannoel/backfill-runtime branch from 3f77d35 to 6e67b52 Compare January 16, 2024 07:59
@kwannoel
Contributor Author

[Image: Screenshot 2024-01-16 at 5.06.26 PM]

Performance of arrangement backfill seems acceptable.

Contributor

@chenzl25 chenzl25 left a comment

The performance of arrangement backfill is superior to the no-shuffle one. What specific factors contribute to this improvement? Could it be the per-vnode Snapshot-read?

@kwannoel
Contributor Author

kwannoel commented Jan 17, 2024

> The performance of arrangement backfill is superior to the no-shuffle one. What specific factors contribute to this improvement? Could it be the per-vnode Snapshot-read?

Not sure yet, have to see the flamegraph of the tests. Will do it later. But that's certainly one of the major differences.

@kwannoel kwannoel added this pull request to the merge queue Jan 17, 2024
Merged via the queue into main with commit 217ff0b Jan 17, 2024
27 of 28 checks passed
@kwannoel kwannoel deleted the kwannoel/backfill-runtime branch January 17, 2024 02:46
@fuyufjh
Member

fuyufjh commented Jan 17, 2024

> The performance of arrangement backfill is superior to the no-shuffle one.

+1. Curious why it can happen. Please share the analysis result ❤️

@kwannoel
Contributor Author

kwannoel commented Jan 17, 2024

It seems arrangement backfill runs better only in debug mode (the PR pipeline runs in debug mode, sadly) and significantly worse in release mode. Investigating why.

Recent tests on a c5.4xlarge instance (debug):

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 47 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 102708 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 53432 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 15075 ms
e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 47 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 102973 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 162020 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 15482 ms

The no-shuffle MV takes about 3x as long in debug mode (162020 ms vs. 53432 ms).
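
A minimal sketch of how that ratio can be reproduced from the runtime lines above. The regular expression and the helper name `parse_runtimes` are illustrative assumptions and not part of the PR; only the two `create_*_mv.slt` lines are copied from the debug run.

```python
import re

# Matches runtime lines such as:
#   e2e_test/backfill/runtime/insert.slt   .. [OK] in 102708 ms
LINE_RE = re.compile(r"^\s*(\S+\.slt)\s+\.\.\s+\[OK\]\s+in\s+(\d+)\s+ms\s*$")


def parse_runtimes(log: str) -> dict:
    """Return {test file name: runtime in ms}; later lines overwrite earlier ones."""
    runtimes = {}
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            path, ms = m.groups()
            runtimes[path.rsplit("/", 1)[-1]] = int(ms)
    return runtimes


# The two MV-creation lines from the debug run.
debug_log = """
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 53432 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 162020 ms
"""

runtimes = parse_runtimes(debug_log)
ratio = (
    runtimes["create_no_shuffle_mv.slt"]
    / runtimes["create_arrangement_backfill_mv.slt"]
)
print(f"no-shuffle / arrangement = {ratio:.2f}x")
```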

In release mode, however, the results are very different:

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 16 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 4059 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 2270 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 758 ms

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 15 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 3961 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 2288 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 796 ms

With twice the number of records (2M), there is a large gap, and no-shuffle backfill is faster:

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 18 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 8790 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 14300 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 1943 ms

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 15 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 8817 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 4219 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 1888 ms

With 2M records and 3 compute nodes (3cn) rather than 1 (1cn):

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 16 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 8583 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 10733 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 1658 ms

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 15 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 8681 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 3898 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 1888 ms

@kwannoel
Contributor Author

With 1cn, 8M records:

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 16 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 41118 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 71616 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 7569 ms

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 17 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 40555 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 16009 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 9837 ms
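
To make the trend across the release-mode runs easier to see, here is a small sketch that computes how many times longer the arrangement backfill MV creation took than the no-shuffle one. The runtimes are copied from the logs above. The labels on the first run (1cn, 1M records) are inferred from the later remarks "2x the number of records (2M)" and "3cn rather than 1cn", so treat them as an assumption.

```python
# (label, arrangement backfill ms, no-shuffle backfill ms), release mode.
# The "1cn, 1M records" label is inferred from the thread, not stated outright.
runs = [
    ("1cn, 1M records", 2288, 2270),
    ("1cn, 2M records", 14300, 4219),
    ("3cn, 2M records", 10733, 3898),
    ("1cn, 8M records", 71616, 16009),
]


def slowdown(arrangement_ms: int, no_shuffle_ms: int) -> float:
    """How many times longer arrangement backfill took than no-shuffle."""
    return arrangement_ms / no_shuffle_ms


ratios = {label: slowdown(arr, ns) for label, arr, ns in runs}
for label, r in ratios.items():
    print(f"{label}: arrangement takes {r:.2f}x the no-shuffle runtime")
```

The gap is negligible at 1M records and grows with the record count, while adding compute nodes narrows it only slightly.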

@fuyufjh
Member

fuyufjh commented Jan 18, 2024

Sad...

BTW,

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 18 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 8790 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 14300 ms
e2e_test/backfill/runtime/validate_rows.slt                  .. [OK] in 1943 ms

As the log shows, the create MV is executed after the insert, so this test mostly reveals the performance of backfilling rather than of passing upstream events, right?

If yes, then the next question is: why does the backfilling performance differ so much?
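
If the `create_*_mv.slt` runtime is taken to be almost entirely backfill time, an average backfill throughput can be estimated from the 2M-record, 1cn release runs. This is a rough sketch under that assumption: the helper name `avg_rows_per_sec` is hypothetical, and the figures are averages over the whole statement, not peak rates.

```python
def avg_rows_per_sec(rows: int, runtime_ms: int) -> float:
    """Average throughput over the whole statement, in rows per second."""
    return rows / (runtime_ms / 1000)


ROWS = 2_000_000  # record count stated for these runs

# Runtimes of the two MV-creation statements in the 2M-record, 1cn release runs.
arrangement = avg_rows_per_sec(ROWS, 14300)
no_shuffle = avg_rows_per_sec(ROWS, 4219)

print(f"arrangement backfill: {arrangement / 1000:.0f}K rows/s")
print(f"no-shuffle backfill:  {no_shuffle / 1000:.0f}K rows/s")
```

Under these assumptions the averages come out at roughly 140K rows/s for arrangement backfill and 474K rows/s for no-shuffle backfill.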

@kwannoel
Contributor Author

Just re-ran with snapshot read:
[Image: Screenshot 2024-01-18 at 12.01.27 PM]

It looks like the peak for no-shuffle is much higher, at 400K r/s; for arrangement backfill it is a mere 150K r/s.

The main difference is that arrangement backfill does the snapshot read sequentially, one vnode at a time, rather than in parallel.
The snapshot throughput metrics in Grafana seem to confirm this.

Trying a separate implementation which does snapshot read in parallel per vnode.
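
The difference between the two read strategies can be illustrated with a toy sketch. This is not RisingWave code: `read_vnode`, the `snapshot` dictionary, and the sleep that stands in for storage I/O are all assumptions made for illustration.

```python
import asyncio


async def read_vnode(vnode, rows, delay=0.001):
    """Hypothetical per-vnode snapshot read; the sleep stands in for storage I/O."""
    out = []
    for row in rows:
        await asyncio.sleep(delay)
        out.append((vnode, row))
    return out


async def read_sequential(snapshot):
    # One vnode at a time: total latency is the sum over all vnodes.
    result = []
    for vnode, rows in snapshot.items():
        result.extend(await read_vnode(vnode, rows))
    return result


async def read_parallel(snapshot):
    # All vnodes concurrently: total latency is roughly that of the slowest vnode.
    per_vnode = await asyncio.gather(
        *(read_vnode(vnode, rows) for vnode, rows in snapshot.items())
    )
    return [row for rows in per_vnode for row in rows]


snapshot = {0: ["a", "b"], 1: ["c"], 2: ["d", "e", "f"]}
seq = asyncio.run(read_sequential(snapshot))
par = asyncio.run(read_parallel(snapshot))
print(seq == par)
```

Both strategies return the same rows; only the waiting overlaps in the parallel version, which is where any speedup would come from when the read is I/O-bound.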

@kwannoel
Contributor Author

Slight improvement. Before the change (first pair) and after (second pair):

e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 66726 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 152346 ms

e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 58413 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 117353 ms

@kwannoel
Contributor Author

kwannoel commented Jan 18, 2024

After adding prefetch as well:

e2e_test/backfill/runtime/create_table.slt                   .. [OK] in 16 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 111406 ms
e2e_test/backfill/runtime/insert.slt                         .. [OK] in 126383 ms
e2e_test/backfill/runtime/create_arrangement_backfill_mv.slt .. [OK] in 39760 ms
e2e_test/backfill/runtime/create_no_shuffle_mv.slt           .. [OK] in 63402 ms
e2e_test/backfill/runtime/validate_rows.slt                  ..

Opening a PR with the optimizations.

I suppose create_arrangement_backfill_mv.slt is now faster because we run iter_chunks in parallel, per vnode, and we don't incur the cost of merging rows into chunks across vnodes.
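
A toy sketch of the merging cost mentioned above, not taken from the RisingWave code base. It assumes each vnode yields rows in sorted order and that the cross-vnode path has to interleave them into one ordered stream before cutting chunks; the function names and the integer "rows" are hypothetical.

```python
import heapq
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` rows from an iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def merged_chunks(per_vnode_rows, size):
    """Merge sorted per-vnode rows into one ordered stream, then cut chunks."""
    merged = heapq.merge(*per_vnode_rows.values())  # one comparison step per row
    return list(chunked(merged, size))


def per_vnode_chunks(per_vnode_rows, size):
    """Cut chunks inside each vnode; no cross-vnode comparison is needed."""
    return {v: list(chunked(rows, size)) for v, rows in per_vnode_rows.items()}


per_vnode_rows = {0: [1, 4, 7], 1: [2, 5], 2: [3, 6, 8, 9]}
merged = merged_chunks(per_vnode_rows, 4)
separate = per_vnode_chunks(per_vnode_rows, 4)
print(merged)
print(separate)
```

The per-vnode variant does less work per row and each vnode's chunks can be produced independently, at the price of chunks that may be smaller when a vnode holds few rows.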
