
perf: nexmark q0 #8712

Open
Tracked by #7289
kwannoel opened this issue Mar 22, 2023 · 22 comments

Comments

@kwannoel
Contributor

kwannoel commented Mar 22, 2023

Background

In a recent benchmark, Flink had an average throughput of 1M rows/s, while RW averaged 850K rows/s. RW needs roughly a 17% improvement to match Flink. Thanks to @huangjw806 for spotting this.

The flamegraph can be found here, under Artifacts.

query

    CREATE SINK nexmark_q0
    AS
    SELECT auction, bidder, price, date_time
    FROM bid
    WITH ( connector = 'blackhole', type = 'append-only');

plan

   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 StreamSink { type: append-only, columns: [auction, bidder, price, date_time] }
 └─StreamProject { exprs: [$expr1, $expr2, $expr3, $expr4] }
   └─StreamProject { exprs: [Field(bid, 0:Int32) as $expr1, Field(bid, 1:Int32) as $expr2, Field(bid, 2:Int32) as $expr3, Field(bid, 5:Int32) as $expr4, _row_id] }
     └─StreamFilter { predicate: (event_type = 2:Int32) }
       └─StreamRowIdGen { row_id_index: 5 }
         └─StreamSource { source: "nexmark", columns: ["event_type", "person", "auction", "bid", "_rw_kafka_timestamp", "_row_id"] }

Here are screenshots of the flamegraph, highlighting cost centers.

[Flamegraph screenshots]

@github-actions github-actions bot added this to the release-0.19 milestone Mar 22, 2023
@kwannoel
Contributor Author

Please add to the screenshots if you find something interesting in the flamegraph.

@lmatz
Contributor

lmatz commented Mar 22, 2023

https://github.com/risingwavelabs/risingwave/blob/main/src/common/src/array/mod.rs#L500-L514

[Flamegraph screenshot]

append_datum_n itself takes a non-negligible amount of time; does that make sense?
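
For context, here is a minimal, self-contained sketch of the kind of per-datum append path that can show up like this; the types and method names below are simplified stand-ins for illustration, not the actual risingwave ArrayBuilder API:

    // Simplified stand-in types; not the real risingwave ArrayBuilder API.
    enum Datum {
        Null,
        Int64(i64),
    }

    #[derive(Default)]
    struct Int64ArrayBuilder {
        values: Vec<i64>,
        validity: Vec<bool>,
    }

    impl Int64ArrayBuilder {
        // Per-datum path: an enum match plus a validity-bitmap update for every
        // single value, so the cost is paid once per row per column.
        fn append_datum(&mut self, datum: &Datum) {
            match datum {
                Datum::Null => {
                    self.values.push(0);
                    self.validity.push(false);
                }
                Datum::Int64(v) => {
                    self.values.push(*v);
                    self.validity.push(true);
                }
            }
        }

        // Appending a datum n times still goes through the per-datum path,
        // so the dispatch cost scales with the number of values.
        fn append_datum_n(&mut self, n: usize, datum: &Datum) {
            for _ in 0..n {
                self.append_datum(datum);
            }
        }
    }

    fn main() {
        let mut builder = Int64ArrayBuilder::default();
        builder.append_datum_n(3, &Datum::Int64(42));
        builder.append_datum(&Datum::Null);
        assert_eq!(builder.values.len(), 4);
    }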

@lmatz
Contributor

lmatz commented Mar 22, 2023

https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/parser/json_parser.rs#L122

Do to_ascii_lowercase() once in advance, outside the closure/loop?

Edit:
Done in #8718
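
Roughly the shape of that change, as a hedged sketch (the column names, map-based rows, and function names here are illustrative, not the actual json_parser.rs code):

    use std::collections::HashMap;

    // Before: lowercasing inside the per-row lookup allocates a new String
    // for every column of every row.
    fn lookup_per_row(columns: &[String], row: &HashMap<String, i64>) -> Vec<Option<i64>> {
        columns
            .iter()
            .map(|name| row.get(&name.to_ascii_lowercase()).copied())
            .collect()
    }

    // After: lowercase each column name once, in advance, and reuse it for all rows.
    fn lookup_all_rows(columns: &[String], rows: &[HashMap<String, i64>]) -> Vec<Vec<Option<i64>>> {
        let lowered: Vec<String> = columns.iter().map(|c| c.to_ascii_lowercase()).collect();
        rows.iter()
            .map(|row| lowered.iter().map(|name| row.get(name).copied()).collect())
            .collect()
    }

    fn main() {
        let columns = vec!["Price".to_string()];
        let rows = vec![HashMap::from([("price".to_string(), 10_i64)])];
        assert_eq!(lookup_per_row(&columns, &rows[0]), vec![Some(10)]);
        assert_eq!(lookup_all_rows(&columns, &rows), vec![vec![Some(10)]]);
    }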

@lmatz
Contributor

lmatz commented Mar 22, 2023

https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/parser/mod.rs#L200

Vec::with_capacity() instead of vec![]?

Edit:
Done in #8718
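
As a tiny sketch (illustrative names, not the actual parser code): when the number of elements is known up front, pre-allocating avoids the repeated reallocation that vec![] plus push incurs.

    fn collect_offsets(num_rows: usize) -> Vec<usize> {
        // Before: `let mut offsets = vec![];` grows and reallocates as we push.
        // After: reserve once, since the final length is known in advance.
        let mut offsets = Vec::with_capacity(num_rows);
        for i in 0..num_rows {
            offsets.push(i);
        }
        offsets
    }

    fn main() {
        assert_eq!(collect_offsets(3), vec![0, 1, 2]);
    }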

@lmatz
Contributor

lmatz commented Mar 23, 2023

I feel we need to find a good strategy for deciding when to compact in the ProjectExecutor:
https://github.com/risingwavelabs/risingwave/blob/main/src/stream/src/executor/project.rs#L113

In this case, since the expression is very cheap to compute, compact itself introduces 11% overhead.

Edit:
Probably not; after all, we only need to output the visible rows to an external sink, so we have to do some compaction somewhere before the final stage.
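
To make the trade-off concrete, here is a simplified sketch of what compact means for a chunk in this discussion (stand-in types, not the actual risingwave DataChunk API): rows filtered out upstream are only hidden behind a visibility bitmap, and compact physically rebuilds every column without them, which is the cost being measured.

    // Stand-in chunk type for illustration; not the real risingwave DataChunk.
    struct DataChunk {
        columns: Vec<Vec<i64>>,        // column-oriented payload
        visibility: Option<Vec<bool>>, // None means every row is visible
    }

    impl DataChunk {
        // Rebuild all columns, keeping only visible rows. This allocation and
        // copy is the overhead discussed above when the projection itself is cheap.
        fn compact(self) -> DataChunk {
            let DataChunk { columns, visibility } = self;
            let Some(vis) = visibility else {
                return DataChunk { columns, visibility: None };
            };
            let columns = columns
                .into_iter()
                .map(|col| {
                    col.into_iter()
                        .zip(vis.iter())
                        .filter_map(|(v, keep)| keep.then_some(v))
                        .collect()
                })
                .collect();
            DataChunk { columns, visibility: None }
        }
    }

    fn main() {
        let chunk = DataChunk {
            columns: vec![vec![1, 2, 3]],
            visibility: Some(vec![true, false, true]),
        };
        let compacted = chunk.compact();
        assert_eq!(compacted.columns, vec![vec![1_i64, 3]]);
    }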

@kwannoel
Contributor Author

kwannoel commented Mar 23, 2023

Probably not; after all, we only need to output the visible rows to an external sink, so we have to do some compaction somewhere before the final stage.

Not sure I understand why compaction is required in this case. Why can't we just output the visible rows to the external sink?

@lmatz
Contributor

lmatz commented Mar 23, 2023

https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/sink/remote.rs#L276-L294

It kind of depends on the output format requirement:

If it asks for JSON, then we iterate through each row of the chunk, so we can choose not to compact.
But if it asks for the stream chunk format, then we have to compact.

Since this is a blackhole sink, we can save the compaction; you are right!

So whether to compact becomes something to determine at the optimization stage?

@kwannoel
Contributor Author

kwannoel commented Mar 23, 2023

https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/sink/remote.rs#L276-L294

It kind of depends on the output format requirement:

If it asks for JSON, then we iterate through each row of the chunk, so we can choose not to compact. But if it asks for the stream chunk format, then we have to compact.

Since this is a blackhole sink, we can save the compaction; you are right!

Hmm, even for StreamChunk we could defer it to to_protobuf?

    SinkPayloadFormat::StreamChunk => {
        let prost_stream_chunk = chunk.to_protobuf();
        let binary_data = Message::encode_to_vec(&prost_stream_chunk);
        Payload::StreamChunkPayload(StreamChunkPayload { binary_data })
    }

@lmatz
Contributor

lmatz commented Mar 23, 2023

I think so, but this prost_stream_chunk does not seem to be aware of visibility.

@kwannoel
Contributor Author

But typically, will any system consume StreamChunk at all? 👀
I thought it was only used internally.

@lmatz
Contributor

lmatz commented Mar 23, 2023

But typically, will any system consume StreamChunk at all? 👀

I guess not, so we can save the final compact if it is connected to the sink/MV executor.

@lmatz
Contributor

lmatz commented Mar 23, 2023

For this particular query, we need to be cautious because of this two-Project issue.
It's a bug that has not been fixed.

The Project below actually compacts in this case, so

save the final compact

is a wrong statement here; it would be correct if we fixed this particular bug first.

@kwannoel
Contributor Author

So whether to compact becomes something to determine at the optimization stage?

Makes sense to me.

@kwannoel
Contributor Author

For this particular query, we need to be cautious because of this two-Project issue. It's a bug that has not been fixed.

The Project below actually compacts in this case, so

save the final compact

is a wrong statement here; it would be correct if we fixed this particular bug first.

Linking it: #8577

@lmatz
Contributor

lmatz commented Mar 23, 2023

#8577 (comment)

Which means we actually do not need to have StreamRowIdGen { row_id_index: 5 }, right? @st1page

OK, there is not much overhead from it in the flamegraph, though.

@lmatz
Contributor

lmatz commented Mar 23, 2023

I guess SourceStreamChunkRowWriter is not efficient enough for insert-only sources 🤔

@st1page
Contributor

st1page commented Mar 24, 2023

I feel we need to find a good strategy for deciding when to compact in the ProjectExecutor:
https://github.com/risingwavelabs/risingwave/blob/main/src/stream/src/executor/project.rs#L113
In this case, since the expression is very cheap to compute, compact itself introduces 11% overhead.

Done in #8758

But typically, will any system consume StreamChunk at all? 👀

I am not sure whether we can sink the chunk into a system using the Arrow format 🤔

For this particular query, we need to be cautious because of this two-Project issue.
It's a bug that has not been fixed.
The Project below actually compacts in this case, so

save the final compact

is a wrong statement here; it would be correct if we fixed this particular bug first.

I think it cannot help with the compact performance issue, because the chunk will be compacted in any Project anyway. If we have a plan ProjA->ProjB, the chunk will be compacted in ProjA, and then ProjB::compact() will not be costly.

@kwannoel
Contributor Author

kwannoel commented Mar 24, 2023

I think the idea I have is slightly different: it's more about avoiding compact even when there's a filter, as in the case of q0.

When sinking, we don't need to build a new chunk; instead we build a protobuf- or JSON-encoded chunk. We can delay the top-most compact call until this point, saving the cost of building a new chunk and relying on the protobuf/JSON building step to remove invisible rows.

@st1page's approach is still needed as a general optimization for deciding when to compact. This approach is complementary: it always eliminates the top-most compact when sinking, regardless of selectivity, to avoid unnecessarily building a new chunk.

q0's compact call can be optimized via this complementary approach.

Implementing this as an optimization requires a bit of refactoring to add should_compact as a plan-level attribute (or other suggestions?). A simple thing to do for now is to just disable compact for the top-most Project when sinking, since that's usually the most common case.
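
A minimal sketch of this idea, assuming simplified stand-in types rather than the actual SinkExecutor / remote sink code (names like visible_rows and encode_for_sink are hypothetical): keep the visibility bitmap on the chunk and let the sink's row-wise encoding skip invisible rows, so the top-most compact never has to materialize a new chunk.

    // Illustrative types only; not the real risingwave StreamChunk API.
    struct StreamChunk {
        rows: Vec<Vec<i64>>,           // row-oriented for simplicity
        visibility: Option<Vec<bool>>, // None means every row is visible
    }

    impl StreamChunk {
        // Iterate visible rows only; no compacted chunk is ever allocated.
        fn visible_rows(&self) -> impl Iterator<Item = &Vec<i64>> + '_ {
            self.rows.iter().enumerate().filter_map(move |(i, row)| {
                match &self.visibility {
                    Some(vis) if !vis[i] => None,
                    _ => Some(row),
                }
            })
        }
    }

    // The sink encodes visible rows directly (a debug string stands in for the
    // JSON/protobuf encoding), so the upstream compact() call becomes unnecessary.
    fn encode_for_sink(chunk: &StreamChunk) -> Vec<String> {
        chunk.visible_rows().map(|row| format!("{:?}", row)).collect()
    }

    fn main() {
        let chunk = StreamChunk {
            rows: vec![vec![1, 10], vec![2, 20], vec![3, 30]],
            visibility: Some(vec![true, false, true]),
        };
        assert_eq!(encode_for_sink(&chunk), vec!["[1, 10]", "[3, 30]"]);
    }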

@st1page
Contributor

st1page commented Mar 24, 2023

When sinking, we don't need to build a new chunk; instead we build a protobuf- or JSON-encoded chunk. We can delay the top-most compact call until this point, saving the cost of building a new chunk and relying on the protobuf/JSON building step to remove invisible rows.

Strongly +1.

Implementing this as an optimization requires a bit of refactoring to add should_compact as a plan-level attribute (or other suggestions?). A simple thing to do for now is to just disable compact for the top-most Project when sinking, since that's usually the most common case.

I think it is not a plan-level attribute. Currently, we compact the input chunk just to simplify the executors' implementation, but in fact every executor should handle visibility properly, e.g.:

  • For Project, the executor should trade off between constructing a new chunk and computing over the redundant rows (because our expression framework cannot accept visibility).
  • For Agg, the executor can simply ignore the invisible rows.
  • For SinkExecutor, it should make sure the visibility is properly handled by the SinkImpl, as you say.

The issue here is that we pass the chunk's visibility to the SinkExecutor, which means it has the chance to do this optimization, but it does not. So we need to do the optimization in the SinkExecutor.

@lmatz
Contributor

lmatz commented Mar 28, 2023

[Throughput screenshot]

The peak throughput reaches 1M rows/s.
Luckily, the imbalanced source throughput problem didn't happen today.

I guess the two remaining things are:

  1. Avoid the unnecessary compact.
  2. Optimize SourceStreamChunkRowWriter, or rather add a customized code path for insert-only sources 🤔 At least it could avoid the data type match for every Datum, I suppose (see the sketch below).
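
For point 2, a rough sketch of the direction under stated assumptions (hypothetical, simplified types; not the real SourceStreamChunkRowWriter): resolve the column's type once and then append through a typed path, instead of matching on the data type for every Datum of every row.

    // Hypothetical, simplified types; not the actual SourceStreamChunkRowWriter.
    enum Datum {
        Int64(i64),
        Utf8(String),
    }

    enum ColumnBuilder {
        Int64(Vec<i64>),
        Utf8(Vec<String>),
    }

    impl ColumnBuilder {
        // Generic path: a type match for every Datum of every row.
        fn append_datum(&mut self, datum: Datum) {
            match (self, datum) {
                (ColumnBuilder::Int64(col), Datum::Int64(v)) => col.push(v),
                (ColumnBuilder::Utf8(col), Datum::Utf8(s)) => col.push(s),
                _ => panic!("type mismatch"),
            }
        }
    }

    // Specialized insert-only path: dispatch on the column type once, then
    // append in a tight, monomorphic loop with no per-datum match.
    fn append_i64_column(col: &mut Vec<i64>, values: impl IntoIterator<Item = i64>) {
        col.extend(values);
    }

    fn main() {
        let mut generic = ColumnBuilder::Int64(Vec::new());
        for v in 0..3 {
            generic.append_datum(Datum::Int64(v));
        }
        let mut strings = ColumnBuilder::Utf8(Vec::new());
        strings.append_datum(Datum::Utf8("bid".to_string()));

        let mut specialized = Vec::new();
        append_i64_column(&mut specialized, 0..3);
        assert_eq!(specialized, vec![0, 1, 2]);
    }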

@lmatz
Contributor

lmatz commented May 28, 2024

link #14815
