Tracking: setup micro benchmark for stream executors with in-memory state #5678

Open · 18 tasks
lmatz opened this issue Oct 1, 2022 · 6 comments
Labels: component/streaming, type/feature, type/perf, type/tracking

lmatz (Contributor) commented Oct 1, 2022:

As mentioned in #5227, the purpose is to measure pure computing performance, and a prerequisite for that is an efficient in-memory store.

As mentioned in #5227 (comment) by @BugenZhao, we don't need to wait for a new in-memory store for now, unless the current in-memory store implementation turns out to be a performance bottleneck.

  • stream: Setup benchmark for HashAggExecutor #5683
  • stream: Setup benchmark for HashJoinExecutor
  • stream: Setup benchmark for LocalSimpleAggExecutor
  • stream: Setup benchmark for GlobalSimpleAggExecutor
  • stream: Setup benchmark for DynamicFilterExecutor
  • stream: Setup benchmark for DispatchExecutor
  • stream: Setup benchmark for FilterExecutor
  • stream: Setup benchmark for ProjectSetExecutor
  • stream: Setup benchmark for ExpandExecutor
  • stream: Setup benchmark for BatchQueryExecutor
  • stream: Setup benchmark for ChainExecutor
  • stream: Setup benchmark for RearrangedChainExecutor
  • stream: Setup benchmark for HopWindowExecutor
  • stream: Setup benchmark for LookupExecutor
  • stream: Setup benchmark for LookupUnionExecutor
  • stream: Setup benchmark for UnionExecutor
  • stream: Setup benchmark for SourceExecutor
  • stream: Setup benchmark for MaterializeExecutor

Depends on #6285
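
For a sense of what each task above entails, here is a minimal sketch of such a micro benchmark using criterion and a current-thread tokio runtime. `run_executor_once` and the commented-out `build_hash_agg` / `MemoryStateStore` wiring are placeholders for whatever each task actually sets up, not the real harness:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

// Placeholder for the real setup: build the executor under test
// (HashAggExecutor, FilterExecutor, ...) on an in-memory state store
// and drain its output stream.
async fn run_executor_once() {
    // let store = MemoryStateStore::new();               // fresh in-memory store
    // let mut stream = build_hash_agg(store).execute();  // executor under test
    // while let Some(msg) = stream.next().await { criterion::black_box(msg); }
}

fn bench_hash_agg(c: &mut Criterion) {
    // A current-thread runtime keeps the measurement single-threaded,
    // so the store's lock is never contended.
    let rt = tokio::runtime::Builder::new_current_thread()
        .build()
        .unwrap();
    c.bench_function("hash_agg_in_memory", |b| {
        b.iter(|| rt.block_on(run_executor_once()));
    });
}

criterion_group!(benches, bench_hash_agg);
criterion_main!(benches);
```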

lmatz added the type/feature, type/tracking, and component/streaming labels on Oct 1, 2022
The github-actions bot added this issue to the release-0.1.14 milestone on Oct 1, 2022
lmatz added the type/perf label on Oct 1, 2022
jon-chuang (Contributor) commented Oct 10, 2022:

To my understanding, the current impl is a per-task store:

```rust
inner: Arc<RwLock<BTreeMap<KeyWithEpoch, Option<Bytes>>>>,
```

which is probably as good as one can do. We won't be able to scale up/down or test recovery with this impl, but that's not our objective.

Or are we using the shared version of MemoryStateStore? That would be bad, as it would then be a single lock.

So, to confirm: we are using a per-task store for benchmarking purposes, correct?

BugenZhao (Member) commented:

The shared one is a singleton, which can be used to simulate shared storage when running multiple compute nodes in a single process with risedev p.

For other cases, we're using the one constructed here, whose lifetime is the same as compute_node_serve and which is shared by all executors in this compute node:

```rust
let state_store = StateStoreImpl::new(
```
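
To make the per-task vs. shared distinction concrete, here is a toy sketch with illustrative types only, not RisingWave's actual MemoryStateStore API: the per-task constructor hands out an independent map, while the shared one is a process-wide singleton, so every clone points at the same map behind the same RwLock:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, OnceLock, RwLock};

// Toy stand-ins for the real key/value types.
type KeyWithEpoch = Vec<u8>;
type Bytes = Vec<u8>;

#[derive(Clone, Default)]
struct MemoryStateStore {
    inner: Arc<RwLock<BTreeMap<KeyWithEpoch, Option<Bytes>>>>,
}

impl MemoryStateStore {
    // Per-task: each call gets an independent map, so benchmarks on
    // different tasks never contend on the same lock.
    fn new() -> Self {
        Self::default()
    }

    // Shared: a process-wide singleton; every clone shares one map (and
    // one RwLock), which is useful for simulating shared storage across
    // compute nodes, but bad for contention-free benchmarking.
    fn shared() -> Self {
        static SHARED: OnceLock<MemoryStateStore> = OnceLock::new();
        SHARED.get_or_init(MemoryStateStore::new).clone()
    }
}
```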

jon-chuang (Contributor) commented:

😥 I see. It would be nice if we could create one that is spawned on a per-thread or per-task basis, so we don't have to worry about contention at all... To my understanding, that is the objective of these in-memory benchmarks, i.e. to test purely the compute performance, up to data serialization/deserialization?

BugenZhao (Member) commented:

If we bench a single operator in the style of an integration test, in a single thread, there will also be no contention. IIRC, the storage team is working on a refactoring of the local state store; we can check whether this can be improved then.
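
A toy illustration of the no-contention point, assuming the benchmark is the only thread touching the store: an uncontended RwLock acquisition never blocks, so a single-threaded bench pays only its small constant cost:

```rust
use std::sync::{Arc, RwLock};

fn main() {
    // Stand-in for a shared store: one lock, but only one thread using it.
    let store = Arc::new(RwLock::new(0u64));
    for _ in 0..1_000_000 {
        // Sole accessor: every write() succeeds immediately, never blocks.
        *store.write().unwrap() += 1;
    }
    assert_eq!(*store.read().unwrap(), 1_000_000);
}
```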

fuyufjh removed this issue from the release-0.1.16 milestone on Jan 30, 2023
fuyufjh (Member) commented Jan 30, 2023:

Removed from the milestone. Do it later.

kwannoel (Contributor) commented Feb 8, 2023:

Will this be a priority again, since we are looking at the performance of the stream engine?
The performance dashboard runs daily, whereas these benchmarks can easily be run ad hoc to see whether certain optimizations work or not.
Additionally, we can generate flamegraphs and inspect memory and CPU cost centres.
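
On the flamegraph point: one common way to get flamegraphs out of criterion benches is the pprof crate's criterion integration. A sketch, assuming pprof (with its criterion and flamegraph features) is a dev-dependency and the bench is run with --profile-time:

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use pprof::criterion::{Output, PProfProfiler};

fn bench_noop(c: &mut Criterion) {
    c.bench_function("noop", |b| b.iter(|| criterion::black_box(1 + 1)));
}

criterion_group! {
    name = benches;
    // Sample at 100 Hz; when run with `--profile-time <secs>`, each bench
    // emits a flamegraph.svg under target/criterion/<name>/profile/.
    config = Criterion::default()
        .with_profiler(PProfProfiler::new(100, Output::Flamegraph(None)));
    targets = bench_noop
}
criterion_main!(benches);
```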
