Spark Plan

SparkPlan is an abstract QueryPlan for physical operators, e.g. InMemoryTableScanExec.

Note
The names of physical operators end with the Exec suffix.

It has the following attributes:

  • metadata

  • metrics

  • outputPartitioning

  • outputOrdering

SparkPlan can be executed (using the final execute method) to compute RDD[InternalRow].

SparkPlan has the following final methods that prepare the environment and then delegate to the corresponding methods of the SparkPlan Contract:

  • execute calls doExecute

  • prepare calls doPrepare

  • executeBroadcast calls doExecuteBroadcast
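The delegation above follows the template-method pattern. The following is a minimal sketch of that pattern only, not Spark's actual code: MiniPlan and TracingScan are hypothetical names, Seq[String] stands in for RDD[InternalRow], and the assumption that execute and executeBroadcast trigger a one-time prepare first is part of the model.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified model of the final-method / do-method split.
// Seq[String] stands in for RDD[InternalRow]; nothing here is Spark API.
abstract class MiniPlan {
  private var prepared = false

  // Final entry point: runs the doPrepare hook at most once.
  final def prepare(): Unit = synchronized {
    if (!prepared) {
      doPrepare()
      prepared = true
    }
  }

  // Final entry point: prepares (assumption of this model), then delegates.
  final def execute(): Seq[String] = {
    prepare()
    doExecute()
  }

  // Final entry point: prepares (assumption of this model), then delegates.
  final def executeBroadcast(): Seq[String] = {
    prepare()
    doExecuteBroadcast()
  }

  // Hooks supplied by concrete operators.
  protected def doPrepare(): Unit
  protected def doExecute(): Seq[String]
  protected def doExecuteBroadcast(): Seq[String]
}

// Hypothetical operator that records which hooks were invoked, in order.
final class TracingScan extends MiniPlan {
  val calls: ArrayBuffer[String] = ArrayBuffer.empty[String]

  protected def doPrepare(): Unit = calls += "doPrepare"

  protected def doExecute(): Seq[String] = {
    calls += "doExecute"
    Seq("row1", "row2")
  }

  protected def doExecuteBroadcast(): Seq[String] = {
    calls += "doExecuteBroadcast"
    Seq("row1", "row2")
  }
}
```

Because the entry points are final, callers always go through the same preparation step, and an operator only decides what the hooks do.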

SQLMetric

SQLMetric is an accumulator that accumulates and produces long values.

There are three known kinds of SQLMetric:

  • sum

  • size

  • timing

metrics Lookup Table

metrics: Map[String, SQLMetric] = Map.empty

metrics is a private[sql] lookup table of supported SQLMetrics by their names.
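As a rough illustration of how an accumulator of long values and a by-name lookup table fit together, here is a self-contained sketch. MiniMetric, MiniOperator and MiniFilter are hypothetical stand-ins rather than Spark classes, and the metric name numOutputRows is used purely as an example key.

```scala
// Hypothetical stand-in for SQLMetric: accumulates and produces a Long.
// metricType is one of the kinds listed above ("sum", "size", "timing").
final class MiniMetric(val metricType: String) {
  private var total = 0L
  def add(v: Long): Unit = total += v
  def value: Long = total
}

// Hypothetical operator base with an empty lookup table by default.
abstract class MiniOperator {
  def metrics: Map[String, MiniMetric] = Map.empty

  // Look a metric up by name; throws NoSuchElementException if it is missing.
  def longMetric(name: String): MiniMetric = metrics(name)
}

// Hypothetical operator that registers one "sum" metric and updates it.
final class MiniFilter(rows: Seq[Int]) extends MiniOperator {
  // lazy val so the same metric instances are returned on every lookup.
  override lazy val metrics: Map[String, MiniMetric] =
    Map("numOutputRows" -> new MiniMetric("sum"))

  // Keep even numbers and record how many rows were produced.
  def run(): Seq[Int] = {
    val out = rows.filter(_ % 2 == 0)
    longMetric("numOutputRows").add(out.size.toLong)
    out
  }
}
```

The default table is empty, so an operator that tracks nothing needs no extra code, while one that does overrides metrics once and looks entries up by name.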

SparkPlan Contract

The contract of SparkPlan requires that concrete implementations define the following method:

  • doExecute(): RDD[InternalRow]

They may also override the following methods:

  • doPrepare

  • doExecuteBroadcast
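The split between the required method and the optional overrides might look like the sketch below. It is a hypothetical model (SketchPlan, ScanSketch and BroadcastSketch are invented names, and Seq[String] stands in for RDD[InternalRow]); the default behaviours chosen here, a no-op doPrepare and a doExecuteBroadcast that reports itself as unsupported, are assumptions of the sketch.

```scala
// Hypothetical model of the contract: one required hook, two optional ones.
abstract class SketchPlan {
  def nodeName: String = getClass.getName

  // Required: every concrete operator must define this.
  protected def doExecute(): Seq[String]

  // Optional: nothing to prepare by default (assumed default).
  protected def doPrepare(): Unit = ()

  // Optional: unsupported unless overridden (assumed default).
  protected def doExecuteBroadcast(): Seq[String] =
    throw new UnsupportedOperationException(
      s"$nodeName does not implement doExecuteBroadcast")

  final def execute(): Seq[String] = { doPrepare(); doExecute() }
  final def executeBroadcast(): Seq[String] = { doPrepare(); doExecuteBroadcast() }
}

// Defines only the required method.
final class ScanSketch extends SketchPlan {
  protected def doExecute(): Seq[String] = Seq("a", "b")
}

// Additionally opts in to broadcast execution.
final class BroadcastSketch extends SketchPlan {
  protected def doExecute(): Seq[String] = Seq("a", "b")
  override protected def doExecuteBroadcast(): Seq[String] = doExecute()
}
```

With these defaults, an operator that defines only doExecute still works through execute, and calling executeBroadcast on it fails loudly instead of silently returning nothing.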

Caution
FIXME Why are there two executes?