V2 Multi-Stage Query Engine

Overview

The new multi-stage query engine (a.k.a. the V2 query engine) is designed to support more complex SQL semantics such as JOIN, OVER window, and MATCH_RECOGNIZE, and eventually to bring Pinot closer to full ANSI SQL semantics.

Figure: Scatter-Gather Query Engine

Figure: Multi-Stage Query Engine

The multi-stage engine also removes the bottleneck in the broker reduce stage, where a single machine does all the heavy lifting, such as merging high-cardinality GROUP BY results and sorting for ORDER BY.
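The bottleneck can be illustrated with a toy model. The sketch below is not Pinot code: `broker_reduce` and `multi_stage_reduce` are hypothetical names, and hash-partitioning the keys across workers is an assumed way of spreading the merge. It only compares how many partial GROUP BY rows the busiest node has to merge in each style.

```python
from collections import Counter


def broker_reduce(partials):
    """Scatter-gather style: one node merges every partial GROUP BY result."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds the counts key by key
    rows_on_busiest_node = sum(len(p) for p in partials)
    return dict(merged), rows_on_busiest_node


def multi_stage_reduce(partials, num_workers):
    """Multi-stage style (assumed): rows are hash-partitioned by key, so
    each worker merges only its own share of the keys."""
    inboxes = [[] for _ in range(num_workers)]
    for partial in partials:
        for key, value in partial.items():
            inboxes[hash(key) % num_workers].append((key, value))

    merged = {}
    for inbox in inboxes:
        local = Counter()
        for key, value in inbox:
            local[key] += value
        merged.update(local)  # key sets are disjoint across workers
    rows_on_busiest_node = max(len(inbox) for inbox in inboxes)
    return merged, rows_on_busiest_node


if __name__ == "__main__":
    # Four servers, each returning a partial count for 1000 distinct keys.
    partials = [{key: 1 for key in range(1000)} for _ in range(4)]
    _, single = broker_reduce(partials)
    _, spread = multi_stage_reduce(partials, num_workers=4)
    print("rows merged by the single reducer:", single)
    print("rows merged by the busiest worker:", spread)
```

Both functions return the same merged result. With these integer keys, the single reducer handles all 4000 partial rows while each of the four workers handles 1000.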

How to use the V2 query engine

To enable the V2 engine, do one of the following:
- Build Apache Pinot from the latest master commit.
- Download the latest Apache Pinot Docker image by following the official guide.

Please add the following configurations to your cluster config:

"pinot.multistage.engine.enabled": "true",
"pinot.server.instance.currentDataTableVersion": "4",
"pinot.query.server.port": "8421",
"pinot.query.runner.port": "8442"

Start the cluster as usual. You should see the following window on the controller query page:

Figure: Sample Query Screenshot
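Before starting the cluster, it can help to check that all four settings listed above are present. The following is a minimal sketch that assumes the cluster config is available as a flat dictionary of string keys and values. The helper names `with_v2_engine` and `missing_v2_settings` are hypothetical and not part of Pinot. How the resulting config is applied to the cluster is outside the scope of this sketch.

```python
import json

# The four settings listed in this section, copied verbatim.
V2_ENGINE_SETTINGS = {
    "pinot.multistage.engine.enabled": "true",
    "pinot.server.instance.currentDataTableVersion": "4",
    "pinot.query.server.port": "8421",
    "pinot.query.runner.port": "8442",
}


def with_v2_engine(cluster_config):
    """Return a copy of cluster_config with the V2 engine settings applied."""
    merged = dict(cluster_config)
    merged.update(V2_ENGINE_SETTINGS)
    return merged


def missing_v2_settings(cluster_config):
    """Return the V2 settings that are absent or have a different value."""
    return {
        key: value
        for key, value in V2_ENGINE_SETTINGS.items()
        if cluster_config.get(key) != value
    }


if __name__ == "__main__":
    # Hypothetical existing config with only one of the settings in place.
    existing = {"pinot.query.server.port": "8421"}
    print("still needed:", json.dumps(missing_v2_settings(existing), indent=2))
    print("merged:", json.dumps(with_v2_engine(existing), indent=2))
```

`with_v2_engine` returns a new dictionary and leaves the input untouched, so the original config can be kept for comparison.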

Design Details

The overall PEP design doc and discussion can be found at the following links:

- PEP discussion GitHub issue
- PEP design doc
