
Operators Early Investigation


Possible Components

  • Physical operators (just called "operators") that do actual execution.
  • Operator definitions that describe what is needed, can be optimized, and can be sent across nodes. (Called "physical operators" in Drill, but that is a bit of a misnomer.)
  • Logical operators (in Calcite.)
  • Planner: takes a Druid query (for now) and system description and produces an execution plan in the form of operator definitions.
  • Executor which takes an operator definition subtree, converts it to operators, and executes the result. (A rough sketch of these pieces follows this list.)
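
A minimal sketch of how these pieces might relate, using hypothetical names (none of these are existing Druid classes):

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Druid today.

/** Per-fragment execution state shared by the operators in one fragment. */
interface FragmentContext
{
  String queryId();
}

/** Physical operator: does the actual work at run time. */
interface Operator<T>
{
  /** Start execution and return an iterator over the results. */
  Iterator<T> open(FragmentContext context);

  /** Release resources once the caller is done with the results. */
  void close(boolean cascade);
}

/**
 * Operator definition: a serializable description of the work to do.
 * This is what the planner produces, what can be optimized, and what
 * can be sent across nodes.
 */
interface OperatorDefn
{
  List<OperatorDefn> children();

  /** The executor converts a definition subtree into runnable operators. */
  <T> Operator<T> createOperator(FragmentContext context);
}
```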

Challenges

  • Incremental approach: replace existing stuff step by step.
  • Parallel approach: over time, the new approach parallels the existing one.
  • Rejigger current code to fit the revised model. Hard given the complexity of that code.

Approach

Inside-out approach: find a "loose thread" to pull on. Create a parallel track for that thread. (For example, replace query runners.) Over time, connect the "islands" of functionality to create a larger parallel structure. At that point, cut out the earlier temporary bits, leaving the original structure unchanged. Work toward the top. At some point, there will be a top-level switch: go down the new path or the old one.

The result is a new "engine", selected by a context option. The old engine remains and is (at first) the default; it is also needed to support queries from extensions. A context flag decides which path to take.

Query runners are split into two parts. The planning portion goes into the planner. (Keep modular structure somehow?) The execution part goes into operators.
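
As a sketch of the execution half of that split: a thin shim can wrap an operator so existing callers still see a query runner. This assumes the hypothetical Operator and FragmentContext types sketched above (redeclared minimally here) plus Druid's QueryRunner/Sequence types; closing the operator on completion is glossed over.

```java
import java.util.Iterator;

import org.apache.druid.java.util.common.guava.Sequence;
import org.apache.druid.java.util.common.guava.Sequences;
import org.apache.druid.query.QueryPlus;
import org.apache.druid.query.QueryRunner;
import org.apache.druid.query.context.ResponseContext;

// Minimal stand-ins for the hypothetical types sketched earlier.
interface FragmentContext {}

interface Operator<T>
{
  Iterator<T> open(FragmentContext context);

  void close(boolean cascade);
}

/**
 * Shim that lets an operator stand in for the execution part of a query
 * runner during the transition. The planning work the runner used to do
 * moves into the planner; this class only adapts the operator's iterator
 * to the Sequence interface that existing callers expect.
 */
class OperatorBackedRunner<T> implements QueryRunner<T>
{
  private final Operator<T> operator;
  private final FragmentContext fragmentContext;

  OperatorBackedRunner(Operator<T> operator, FragmentContext fragmentContext)
  {
    this.operator = operator;
    this.fragmentContext = fragmentContext;
  }

  @Override
  public Sequence<T> run(QueryPlus<T> queryPlus, ResponseContext responseContext)
  {
    // Open the operator and expose its rows as a (single-use) Sequence.
    // Closing the operator when the sequence is fully consumed is left
    // out of this sketch.
    Iterator<T> rows = operator.open(fragmentContext);
    return Sequences.simple(() -> rows);
  }
}
```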

Architecture

  • Broker: the planner takes a query, a segment walker, and a cluster definition, and emits an exec plan. (See the sketch after this list.)
  • Broker: engine takes an exec plan, executes, and returns results.
  • Historical: new endpoint which takes a fragment, executes, returns results.
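
A minimal sketch of the Broker-side interfaces implied by this split; all names and shapes here are hypothetical:

```java
import org.apache.druid.java.util.common.guava.Sequence;
import org.apache.druid.query.Query;

// Hypothetical placeholder types; not existing Druid classes.
interface ExecPlan {}
interface SegmentWalkerView {}
interface ClusterView {}

/** Broker planner: native query plus cluster state in, exec plan out. */
interface BrokerPlanner
{
  ExecPlan plan(Query<?> query, SegmentWalkerView segments, ClusterView cluster);
}

/** Broker engine: runs an exec plan and returns the results. */
interface BrokerEngine
{
  <T> Sequence<T> execute(ExecPlan plan);
}
```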

Stages

  • Identify existing tests that exercise each query runner code path. If missing, add them.
    • May require a native query test framework like the one for SQL queries.
    • Refactor distribution support to make test path a bit less of a hack?
  • Most query runners create operators in place of sequences. (In progress.) The exceptions are the distribution items.
  • Design for the distribution portion of a query. Local for now, part of plan later.
  • Replace the rest of the Broker query runners with operators.
  • Replace a portion of the query runner stack with a planner (as in the earlier prototype).
  • Create a physical plan: the planner above translates a native query into a physical plan, and the physical plan generates operators.
  • Replace the top-level query handling in the Historical with the planner. Keep the existing API.
  • Introduce a revised Historical API which accepts a fragment physical plan and returns results.
    • May require single-process for testing.
  • Replace the Broker component that does scatter/gather with an operator that calls the new API.
  • Replace Broker "planner" for single queries.
  • Understand sub-query, join, union support. Evolve to use planner and broker.
  • Broker produces complete query physical plan, divided into fragments. Run as today.
  • Revise fragment plan structure to allow intermediate stages.
    • Requires that Brokers and/or Historicals become execution nodes.

Steps

The steps below acknowledge the prototypes already done.

PR with Parallel Operator Subset

Goal: for native queries in tests (and possibly in the Broker and Historical), have query runners use operators for execution. Only the operator execution piece, not the planning part. The path is selected by a context flag.

  • Create operator package from op-step1. Ensure tests work.
  • Create config. Use a system property and config project to globally enable the feature.
    • Connect to a Guice-driven config to allow setting config via properties.
    • How to pass that config down into the execution layer?
  • Per-query check whether the path is enabled. (See the sketch after this list.)
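
A sketch of what the per-query check could look like; the context key name is an assumption, and the global flag would come from the config class described under Integration Points below:

```java
import java.util.Map;

/** Sketch of the per-query gate for the NG execution path. */
class QueryNGGate
{
  // Hypothetical context key; not an existing Druid query context parameter.
  static final String ENABLE_KEY = "queryng";

  static boolean isEnabled(boolean globallyEnabled, Map<String, Object> queryContext)
  {
    if (!globallyEnabled) {
      return false;
    }
    if (queryContext == null) {
      return true;
    }
    // A query can opt out (or, depending on the chosen default, opt in)
    // via the context flag.
    Object flag = queryContext.getOrDefault(ENABLE_KEY, true);
    return Boolean.parseBoolean(String.valueOf(flag));
  }
}
```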

Code structure in the processing module:

  • queryng
    • operators
    • defns
    • planner

Integration Points

ResponseContext is the only state passed down through the query runners. Attach a FragmentContext to that object. (But what happens when the query splits off to multiple threads?)

Configuration is via a super-simple QueryNGConfig class with one setting: to enable the engine. The config is set up via Guice in the usual way.
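
A minimal sketch of that config and its Guice wiring, assuming the usual JsonConfigProvider pattern; the property prefix druid.queryng and the class names are assumptions:

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.google.inject.Binder;
import com.google.inject.Module;

import org.apache.druid.guice.JsonConfigProvider;

/** One-flag config: does nothing except enable the NG engine. */
class QueryNGConfig
{
  @JsonProperty("enabled")
  private boolean enabled;

  public boolean isEnabled()
  {
    return enabled;
  }
}

/** Guice module that binds the config from runtime properties. */
class QueryNGModule implements Module
{
  @Override
  public void configure(Binder binder)
  {
    // Maps druid.queryng.enabled=true onto QueryNGConfig.enabled.
    JsonConfigProvider.bind(binder, "druid.queryng", QueryNGConfig.class);
  }
}
```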

Operators need a fragment context. The current code creates one each time, which is awkward. To work around this:

  • The code which creates the response context will attach a fragment context if the NG engine is enabled.
  • Checking for enablement is then a simple matter of asking the response context.
  • (Check if response contexts are created elsewhere. Figure out a solution if so.)
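
One possible shape for the attach-and-check mechanism is a side table keyed by the response context instance. This is a sketch of the idea only, not necessarily how the prototype does it; the prototype might instead add a field or key to ResponseContext directly.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

import org.apache.druid.query.context.ResponseContext;

// Placeholder for the hypothetical per-fragment state sketched earlier.
interface FragmentContext {}

/** Side table that associates a FragmentContext with a ResponseContext. */
class FragmentContexts
{
  // Thread-safe, but the question of what happens when a query fans out
  // to multiple threads (and whether they share one fragment context)
  // remains open.
  private static final Map<ResponseContext, FragmentContext> CONTEXTS =
      Collections.synchronizedMap(new WeakHashMap<>());

  /** Called where the response context is created, when the NG engine is enabled. */
  static void attach(ResponseContext responseContext, FragmentContext fragmentContext)
  {
    CONTEXTS.put(responseContext, fragmentContext);
  }

  /** Enablement check: the NG path is active iff a fragment context was attached. */
  static boolean isNGEnabled(ResponseContext responseContext)
  {
    return CONTEXTS.containsKey(responseContext);
  }

  static FragmentContext fragmentOf(ResponseContext responseContext)
  {
    return CONTEXTS.get(responseContext);
  }
}
```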

Longer term, provide a "root operator" which manages the fragment context and provides results. Add this as an interface instead of, or in addition to, the Sequence-based one.
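
A sketch of what such a root operator interface might look like (hypothetical and purely illustrative):

```java
import java.util.Iterator;

// Placeholder for the hypothetical per-fragment state sketched earlier.
interface FragmentContext {}

/**
 * Root operator: owns the fragment context, runs the operator tree, and
 * hands back results, as an alternative entry point to the Sequence-based one.
 */
interface RootOperator<T> extends AutoCloseable
{
  /** The fragment context this root created and manages. */
  FragmentContext fragmentContext();

  /** Open the operator tree and return the result rows. */
  Iterator<T> run();

  /** Close the whole tree and release fragment-level resources. */
  @Override
  void close();
}
```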

To Do

  • ScanQueryRunnerFactory.mergeRunners() converted to ScanPlanner.runMerge(): handle other cases. Seems to only handle "normal strategy" at the moment. How to test the others?