Skip to content

Latest commit

 

History

History
110 lines (71 loc) · 4.56 KB

spark-sql-streaming-StreamingAggregationStateManagerImplV2.adoc

File metadata and controls

110 lines (71 loc) · 4.56 KB

StreamingAggregationStateManagerImplV2 — Default State Manager for Streaming Aggregation

StreamingAggregationStateManagerImplV2 is the default state manager for streaming aggregations.

Note
The version of a state manager is controlled using spark.sql.streaming.aggregation.stateFormatVersion internal configuration property.

StreamingAggregationStateManagerImplV2 is created exclusively when StreamingAggregationStateManager is requested for a new StreamingAggregationStateManager.

StreamingAggregationStateManagerImplV2 (like the parent StreamingAggregationStateManagerBaseImpl) takes the following to be created:

  • Catalyst expressions for the keys (Seq[Attribute])

  • Catalyst expressions for the input rows (Seq[Attribute])

Storing Row in State Store — put Method

put(store: StateStore, row: UnsafeRow): Unit
Note
put is part of the StreamingAggregationStateManager Contract to store a row in a state store.

put…​FIXME

Getting Saved State for Non-Null Key from State Store — get Method

get(store: StateStore, key: UnsafeRow): UnsafeRow
Note
get is part of the StreamingAggregationStateManager Contract to get the saved state for a given non-null key from a given state store.

get requests the given StateStore for the current state value for the given key.

get returns null if the key could not be found in the state store. Otherwise, get restoreOriginalRow (for the key and the saved state).

restoreOriginalRow Internal Method

restoreOriginalRow(key: UnsafeRow, value: UnsafeRow): UnsafeRow
restoreOriginalRow(rowPair: UnsafeRowPair): UnsafeRow

restoreOriginalRow…​FIXME

Note
restoreOriginalRow is used when StreamingAggregationStateManagerImplV2 is requested to get the saved state for a given non-null key from a state store, iterator and values.

getStateValueSchema Method

getStateValueSchema: StructType
Note
getStateValueSchema is part of the StreamingAggregationStateManager Contract to…​FIXME.

getStateValueSchema simply requests the valueExpressions for the schema.

iterator Method

iterator: iterator(store: StateStore): Iterator[UnsafeRowPair]
Note
iterator is part of the StreamingAggregationStateManager Contract to…​FIXME.

iterator simply requests the input state store for the iterator that is mapped to an iterator of UnsafeRowPairs with the key (of the input UnsafeRowPair) and the value as a restored original row.

Note
scala.collection.Iterator is a data structure that allows to iterate over a sequence of elements that are usually fetched lazily (i.e. no elements are fetched from the underlying store until processed).

values Method

values(store: StateStore): Iterator[UnsafeRow]
Note
values is part of the StreamingAggregationStateManager Contract to…​FIXME.

values…​FIXME

Internal Properties

Name Description

joiner

keyValueJoinedExpressions

needToProjectToRestoreValue

restoreValueProjector

valueExpressions

valueProjector