Operators Development Plan
Paul Rogers edited this page Oct 5, 2022 · 5 revisions
- Basics
  - Operator framework: excludes actual query code
  - Scan query
- Timeseries query
- Other queries
- Insert physical plan between query runner and operators
- Convert query runners to a planner: `f(query, metadata) -> physical plan`
- Optimize away the ad-hoc bits: move toward a more typical row format
- Abstract out the HTTP protocol. Add Gian's new network protocol.
- Introduce multiple tiers. Requires new server type.
- Push rework below the `Cursor` level: revisit storage adapters
  - Adapters assume that Druid will do the work
  - Segment adapter negotiates push-down of operations it can handle
  - CSV, etc. are simple layers; they don't try to simulate segments
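
To make the first steps of the plan more concrete, here is a minimal sketch of what an operator protocol and a planner of the form f(query, metadata) -> physical plan might look like. Everything in it is hypothetical: `Operator`, `ListScanOperator`, `LimitOperator`, `ScanQuerySpec`, and `plan` are illustrative names, not classes from Druid or from the operator branch, and the "metadata" is reduced to an in-memory list of rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: none of these names come from Druid or the operator branch.
final class OperatorSketch {

  /** Minimal operator protocol: open() returns a row iterator, close() releases resources. */
  interface Operator<T> {
    Iterator<T> open();
    void close();
  }

  /** Leaf operator standing in for a scan; it reads from an in-memory list of rows. */
  static final class ListScanOperator<T> implements Operator<T> {
    private final List<T> rows;
    ListScanOperator(List<T> rows) { this.rows = rows; }
    @Override public Iterator<T> open() { return rows.iterator(); }
    @Override public void close() { /* nothing to release in this sketch */ }
  }

  /** Passes through at most `limit` rows from its child operator. */
  static final class LimitOperator<T> implements Operator<T> {
    private final Operator<T> child;
    private final int limit;
    LimitOperator(Operator<T> child, int limit) {
      this.child = child;
      this.limit = limit;
    }
    @Override public Iterator<T> open() {
      final Iterator<T> in = child.open();
      return new Iterator<T>() {
        private int returned = 0;
        @Override public boolean hasNext() { return returned < limit && in.hasNext(); }
        @Override public T next() {
          if (!hasNext()) throw new NoSuchElementException();
          returned++;
          return in.next();
        }
      };
    }
    @Override public void close() { child.close(); }
  }

  /** Hypothetical stand-in for a query: only a row limit (negative means no limit). */
  static final class ScanQuerySpec {
    final int limit;
    ScanQuerySpec(int limit) { this.limit = limit; }
  }

  /** Planner in the sense of f(query, metadata) -> physical plan (an operator tree). */
  static <T> Operator<T> plan(ScanQuerySpec query, List<T> segmentRows) {
    Operator<T> root = new ListScanOperator<>(segmentRows);
    if (query.limit >= 0) {
      root = new LimitOperator<>(root, query.limit);
    }
    return root;
  }

  /** Runs a plan to completion, collecting its rows and always closing the root. */
  static <T> List<T> run(Operator<T> root) {
    List<T> out = new ArrayList<>();
    try {
      Iterator<T> it = root.open();
      while (it.hasNext()) out.add(it.next());
    } finally {
      root.close();
    }
    return out;
  }
}
```

The point of the sketch is the separation of concerns: the planner decides which operators exist, and the operators only move rows. A call such as `OperatorSketch.run(OperatorSketch.plan(new OperatorSketch.ScanQuerySpec(2), rows))` plans and executes in two distinct steps.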
Goal:
- Broaden discussion. (There will likely be much resistance this first go-round.)
- Establish a toehold.
- Make the concept more concrete to other devs.
Tasks:
- Rebase op-step3 on master.
- Fix merge issues.
- Run a test run with operators enabled.
- `DruidUnionRel`
  - As designed, runs independent queries, then concatenates the results
  - Change to have a root segment with a union operator, with child fragments (each fragment has a separate query)
- Test on a Broker/Historical pair.
  - Ingest data
  - Run queries in old mode & verify.
  - Run queries in op mode & verify.
  - Perhaps find the benchmarks and try those.
- Prototype for group-by query
- Producer/consumer queue for the scatter/gather operator
- Unified row format, perhaps based on frames
- Early prototypes: scan query, timeseries query
- Internal discussions around merit
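
The DruidUnionRel task above (a root with a union operator over child fragments, each running its own query) could be sketched roughly as follows. This is an assumption-laden illustration, not the design in the branch: `Operator`, `FragmentOperator`, and `UnionOperator` are hypothetical names, and each child fragment is simulated by a fixed list of results rather than a real query.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: names are illustrative, not taken from Druid's DruidUnionRel.
final class UnionSketch {

  /** Minimal operator protocol assumed for this sketch. */
  interface Operator<T> {
    Iterator<T> open();
    void close();
  }

  /** Stand-in for a child fragment: one independent query with precomputed results. */
  static final class FragmentOperator<T> implements Operator<T> {
    private final List<T> results;
    boolean closed = false;
    FragmentOperator(List<T> results) { this.results = results; }
    @Override public Iterator<T> open() { return results.iterator(); }
    @Override public void close() { closed = true; }
  }

  /** Root operator: opens children lazily, in order, and concatenates their rows. */
  static final class UnionOperator<T> implements Operator<T> {
    private final List<Operator<T>> children;
    UnionOperator(List<Operator<T>> children) { this.children = children; }

    @Override public Iterator<T> open() {
      final Iterator<Operator<T>> childIter = children.iterator();
      return new Iterator<T>() {
        private Iterator<T> current = null;
        @Override public boolean hasNext() {
          // Move to the next child until one has rows or no children remain.
          while (current == null || !current.hasNext()) {
            if (!childIter.hasNext()) return false;
            current = childIter.next().open();
          }
          return true;
        }
        @Override public T next() {
          if (!hasNext()) throw new NoSuchElementException();
          return current.next();
        }
      };
    }

    /** Closes every child, including any that were never opened (close is a no-op flag here). */
    @Override public void close() {
      for (Operator<T> child : children) child.close();
    }
  }

  /** Drains the root operator into a list and always closes it. */
  static <T> List<T> run(Operator<T> root) {
    List<T> out = new ArrayList<>();
    try {
      Iterator<T> it = root.open();
      while (it.hasNext()) out.add(it.next());
    } finally {
      root.close();
    }
    return out;
  }
}
```

Compared with running independent queries and concatenating afterwards, a single root operator gives one place to handle ordering, early termination, and cleanup. A real version would presumably run the fragments concurrently, which is where the producer/consumer queue for the scatter/gather operator would come in.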