chore(bench): add bench for json parser #9195

tabVersion · 2023-04-14T10:19:50Z

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Checklist For Contributors

as title, the pr focuses on comparing cases that column names that do not match json keys

~~- [ ] I have written necessary rustdoc comments~~
~~- [ ] I have added necessary unit tests and integration tests~~
~~- [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features #7934).~~
~~- [ ] I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)~~

All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

My PR DOES NOT contain user-facing changes.

Click here for Documentation

Types of user-facing changes

Please keep the types that apply to your changes, and remove the others.

Installation and deployment
Connector (sources & sinks)
SQL commands, functions, and operators
RisingWave cluster configuration changes
Other (please specify in the release note below)

Release note

Signed-off-by: tabVersion <[email protected]>

codecov · 2023-04-14T10:44:12Z

Codecov Report

Merging #9195 (6476bd2) into main (a2fe5aa) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #9195   +/-   ##
=======================================
  Coverage   70.81%   70.82%           
=======================================
  Files        1200     1200           
  Lines      200245   200245           
=======================================
+ Hits       141813   141822    +9     
+ Misses      58432    58423    -9

Flag	Coverage Δ
rust	`70.82% <ø> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

see 3 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

src/connector/benches/parser.rs

kwannoel

LGTM as a starting point for benchmarking json parser. Thanks for this!

I'm curious about source::benches::json_parser.rs though. Is that benchmark still relevant then?

Signed-off-by: tabVersion <[email protected]>

tabVersion · 2023-04-14T11:30:48Z

LGTM as a starting point for benchmarking json parser. Thanks for this!

I'm curious about source::benches::json_parser.rs though. Is that benchmark still relevant then?

Yes, it is still ok to benchmark json parser perf for general purposes. In this pr, because I suspect the regression is introduced in #9086, it focuses on comparing cases that column names that do not match json keys.

tabVersion · 2023-04-14T11:40:23Z

run on local machine: Apple M1 silicon, 32G RAM

ParseMatchCase/32       time:   [673.13 ms 675.36 ms 677.62 ms]

ParseMatchCase/128      time:   [638.88 ms 644.43 ms 653.78 ms]

ParseMatchCase/512      time:   [646.35 ms 652.50 ms 660.39 ms]

ParseMatchCase/1024     time:   [638.15 ms 640.67 ms 643.56 ms]

ParseMatchCase/2048     time:   [632.13 ms 633.57 ms 635.16 ms]

ParseMatchCase/4096     time:   [626.47 ms 627.26 ms 628.06 ms]

ParseMisMatchCase/32    time:   [712.50 ms 715.36 ms 718.23 ms]

ParseMisMatchCase/128   time:   [681.59 ms 685.19 ms 689.48 ms]

ParseMisMatchCase/512   time:   [669.22 ms 670.75 ms 672.17 ms]

ParseMisMatchCase/1024  time:   [669.12 ms 670.78 ms 672.34 ms]

ParseMisMatchCase/2048  time:   [671.74 ms 679.59 ms 690.52 ms]

ParseMisMatchCase/4096  time:   [675.99 ms 682.49 ms 692.14 ms]

seems there gonna be an extra 40ms+(5%) if column names do not match json keys. maybe it can explain the observation in #9113

kwannoel · 2023-04-15T02:48:20Z

run on local machine: Apple M1 silicon, 32G RAM

ParseMatchCase/32       time:   [673.13 ms 675.36 ms 677.62 ms]

ParseMatchCase/128      time:   [638.88 ms 644.43 ms 653.78 ms]

ParseMatchCase/512      time:   [646.35 ms 652.50 ms 660.39 ms]

ParseMatchCase/1024     time:   [638.15 ms 640.67 ms 643.56 ms]

ParseMatchCase/2048     time:   [632.13 ms 633.57 ms 635.16 ms]

ParseMatchCase/4096     time:   [626.47 ms 627.26 ms 628.06 ms]

ParseMisMatchCase/32    time:   [712.50 ms 715.36 ms 718.23 ms]

ParseMisMatchCase/128   time:   [681.59 ms 685.19 ms 689.48 ms]

ParseMisMatchCase/512   time:   [669.22 ms 670.75 ms 672.17 ms]

ParseMisMatchCase/1024  time:   [669.12 ms 670.78 ms 672.34 ms]

ParseMisMatchCase/2048  time:   [671.74 ms 679.59 ms 690.52 ms]

ParseMisMatchCase/4096  time:   [675.99 ms 682.49 ms 692.14 ms]

seems there gonna be an extra 40ms+(5%) if column names do not match json keys. maybe it can explain the observation in #9113

Hmm not entirely, but can be a contributing factor. E2e was 6% but just json parser alone regression is like 60% (11%/18%=0.6), so 5% doesnt seem to explain it fully.

Flamegraph percentages probably not too reliable though, only rough indicators. We can probably can checkout the old main branch where 11% is observed and run the json parser micro benchmarks, get a more precise figure.

add bench for json parser

9f9ccfe

Signed-off-by: tabVersion <[email protected]>

github-actions bot added the type/chore label Apr 14, 2023

tabVersion mentioned this pull request Apr 14, 2023

perf: optimize json parser #9113

Closed

fix

ae35518

Signed-off-by: tabVersion <[email protected]>

kwannoel reviewed Apr 14, 2023

View reviewed changes

src/connector/benches/parser.rs Outdated Show resolved Hide resolved

kwannoel reviewed Apr 14, 2023

View reviewed changes

src/connector/benches/parser.rs Outdated Show resolved Hide resolved

kwannoel reviewed Apr 14, 2023

View reviewed changes

src/connector/benches/parser.rs Outdated Show resolved Hide resolved

kwannoel reviewed Apr 14, 2023

View reviewed changes

src/connector/benches/parser.rs Outdated Show resolved Hide resolved

kwannoel approved these changes Apr 14, 2023

View reviewed changes

fix

6476bd2

Signed-off-by: tabVersion <[email protected]>

tabVersion closed this Apr 14, 2023

tabVersion reopened this Apr 14, 2023

tabVersion enabled auto-merge April 14, 2023 11:42

tabVersion added this pull request to the merge queue Apr 14, 2023

Merged via the queue into main with commit de7b5d4 Apr 14, 2023

tabVersion deleted the tab/parser-bench branch April 14, 2023 12:13

BugenZhao mentioned this pull request Oct 26, 2023

perf(connector): add integration benchmark for nexmark parsing #13073

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(bench): add bench for json parser #9195

chore(bench): add bench for json parser #9195

tabVersion commented Apr 14, 2023 •

edited

Loading

codecov bot commented Apr 14, 2023 •

edited

Loading

kwannoel left a comment

tabVersion commented Apr 14, 2023 •

edited

Loading

tabVersion commented Apr 14, 2023

kwannoel commented Apr 15, 2023

chore(bench): add bench for json parser #9195

chore(bench): add bench for json parser #9195

Conversation

tabVersion commented Apr 14, 2023 • edited Loading

What's changed and what's your intention?

Checklist For Contributors

Checklist For Reviewers

Documentation

Types of user-facing changes

Release note

codecov bot commented Apr 14, 2023 • edited Loading

Codecov Report

kwannoel left a comment

Choose a reason for hiding this comment

tabVersion commented Apr 14, 2023 • edited Loading

tabVersion commented Apr 14, 2023

kwannoel commented Apr 15, 2023

tabVersion commented Apr 14, 2023 •

edited

Loading

codecov bot commented Apr 14, 2023 •

edited

Loading

tabVersion commented Apr 14, 2023 •

edited

Loading