HDFSBackedStateStoreProvider is the default StateStoreProvider (as specified by the spark.sql.streaming.stateStore.providerClass internal configuration property).
HDFSBackedStateStoreProvider is created and immediately requested to initialize when the StateStoreProvider helper object is requested to create and initialize a StateStoreProvider.
HDFSBackedStateStoreProvider takes no arguments to be created.
HDFSBackedStateStoreProvider uses the state checkpoint base directory (that is, the storeCheckpointLocation of the StateStoreId) for delta and snapshot state files. The checkpoint directory is created when HDFSBackedStateStoreProvider is requested to initialize.
Tip
|
Enable Add the following line to
Refer to Logging. |
As a StateStoreProvider, HDFSBackedStateStoreProvider is associated with a StateStoreId (which is a unique identifier of a state store for an operator and a partition).
HDFSBackedStateStoreProvider is given the StateStoreId at initialization (as requested by the StateStoreProvider contract).
The StateStoreId is then used for the following:
- HDFSBackedStateStore is requested for the id
- HDFSBackedStateStoreProvider is requested for the textual representation and the state checkpoint base directory
toString(): String
Note
|
toString is part of the java.lang.Object contract for the string representation of the object.
|
HDFSBackedStateStoreProvider uses the StateStoreId and the state checkpoint base directory for the textual representation:
HDFSStateStoreProvider[id = (op=[operatorId],part=[partitionId]),dir = [baseDir]]
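The format above can be reproduced with plain string interpolation. The following is a minimal sketch only: SketchStateStoreId and ToStringSketch are hypothetical stand-ins introduced for illustration, and only the operatorId, partitionId and baseDir placeholders come from the format string itself.

```scala
// Hypothetical stand-in for StateStoreId (illustration only); it carries just
// the two fields that appear in the textual representation.
final case class SketchStateStoreId(operatorId: Long, partitionId: Int)

object ToStringSketch {
  // Renders the textual representation in the format shown above.
  // baseDir is passed as a plain String here (an assumption made to keep
  // the sketch free of Hadoop dependencies).
  def render(id: SketchStateStoreId, baseDir: String): String =
    s"HDFSStateStoreProvider[id = (op=${id.operatorId},part=${id.partitionId}),dir = $baseDir]"
}
```

For example, an operator ID of 0, a partition ID of 3 and a base directory of /tmp/state/0/3 would give HDFSStateStoreProvider[id = (op=0,part=3),dir = /tmp/state/0/3].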
getStore(version: Long): StateStore
Note
|
getStore is part of the StateStoreProvider Contract to get the StateStore for a given version.
|
getStore…FIXME
deltaFile(version: Long): Path
Note
|
deltaFile is used when…FIXME
|
fetchFiles(): Seq[StoreFile]
fetchFiles…FIXME
Note
|
fetchFiles is used when HDFSBackedStateStoreProvider is requested to latestIterator, doSnapshot and cleanup.
|
init(
stateStoreId: StateStoreId,
keySchema: StructType,
valueSchema: StructType,
indexOrdinal: Option[Int],
storeConf: StateStoreConf,
hadoopConf: Configuration): Unit
Note
|
init is part of the StateStoreProvider Contract to initialize itself.
|
init assigns the values of the input arguments to stateStoreId, keySchema, valueSchema, storeConf, and hadoopConf.
init uses the StateStoreConf to request the spark.sql.streaming.maxBatchesToRetainInMemory configuration property (which then becomes the numberOfVersionsToRetainInMemory).
In the end, init requests the CheckpointFileManager to create the baseDir directory (with subdirectories).
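The initialization steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the provider's code: SketchProvider is a hypothetical class, java.nio.file.Files.createDirectories stands in for the CheckpointFileManager request, and the schema and Hadoop configuration arguments are left out.

```scala
import java.nio.file.{Files, Path}

// Hypothetical, simplified provider (illustration only).
// java.nio replaces the CheckpointFileManager and Hadoop file system APIs.
final class SketchProvider {
  private var baseDir: Path = null
  private var numberOfVersionsToRetainInMemory: Int = 0

  // Mirrors the three steps described above: keep the input arguments,
  // read the retention setting, create the checkpoint directory.
  def init(checkpointLocation: Path, maxBatchesToRetainInMemory: Int): Unit = {
    baseDir = checkpointLocation
    numberOfVersionsToRetainInMemory = maxBatchesToRetainInMemory
    // Creates the directory together with any missing parent directories;
    // does nothing if the directory already exists.
    Files.createDirectories(baseDir)
  }

  def checkpointDir: Path = baseDir
  def versionsToRetain: Int = numberOfVersionsToRetainInMemory
}
```

Creating the directory at initialization time means later writes of delta and snapshot files can assume the base directory exists.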
latestIterator(): Iterator[UnsafeRowPair]
latestIterator…FIXME
Note
|
latestIterator seems to be used exclusively in tests.
|
doMaintenance(): Unit
Note
|
doMaintenance is part of the StateStoreProvider Contract to do maintenance if needed.
|
doMaintenance…FIXME
close(): Unit
Note
|
close is part of the StateStoreProvider Contract to close the state store provider.
|
close…FIXME
putStateIntoStateCacheMap(
newVersion: Long,
map: ConcurrentHashMap[UnsafeRow, UnsafeRow]): Unit
putStateIntoStateCacheMap…FIXME
Note
|
putStateIntoStateCacheMap is used when HDFSBackedStateStoreProvider is requested to commitUpdates and loadMap.
|
commitUpdates(
newVersion: Long,
map: ConcurrentHashMap[UnsafeRow, UnsafeRow],
output: DataOutputStream): Unit
commitUpdates…FIXME
Note
|
commitUpdates is used exclusively when HDFSBackedStateStore is requested to commit state changes.
|
loadMap(version: Long): ConcurrentHashMap[UnsafeRow, UnsafeRow]
loadMap…FIXME
Note
|
loadMap is used when HDFSBackedStateStoreProvider is requested to retrieve the state store for a specified version and latestIterator.
|
writeSnapshotFile(
version: Long,
map: MapType): Unit
writeSnapshotFile…FIXME
Note
|
writeSnapshotFile is used when…FIXME
|
updateFromDeltaFile(
version: Long,
map: MapType): Unit
updateFromDeltaFile…FIXME
Note
|
updateFromDeltaFile is used exclusively when HDFSBackedStateStoreProvider is requested to loadMap.
|
Name | Description |
---|---|
loadedMaps | loadedMaps: TreeMap[Long, ConcurrentHashMap[UnsafeRow, UnsafeRow]]. java.util.TreeMap of FIXME sorted according to the reversed natural ordering of the keys. The current size estimation of… A new entry (a version and the associated map) is added when… Used when…FIXME |
numberOfVersionsToRetainInMemory | numberOfVersionsToRetainInMemory: Int |
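To illustrate how a java.util.TreeMap with reversed key ordering can work together with a retention limit, here is a minimal sketch. VersionCacheSketch and its eviction rule (drop the oldest version once the limit is exceeded) are assumptions made for illustration; the exact logic of putStateIntoStateCacheMap is not described above and may differ.

```scala
import java.util.{TreeMap => JTreeMap}
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache of per-version state maps (illustration only).
final class VersionCacheSketch[K, V](numberOfVersionsToRetainInMemory: Int) {
  // Reversed natural ordering of the keys: firstKey is the newest version,
  // lastKey is the oldest one.
  private val loadedMaps =
    new JTreeMap[Long, ConcurrentHashMap[K, V]](Ordering[Long].reverse)

  // Adds a version and its map, then evicts the oldest versions while the
  // cache holds more entries than the limit (assumed eviction policy).
  def put(version: Long, map: ConcurrentHashMap[K, V]): Unit = synchronized {
    if (numberOfVersionsToRetainInMemory > 0) {
      loadedMaps.put(version, map)
      while (loadedMaps.size() > numberOfVersionsToRetainInMemory) {
        loadedMaps.remove(loadedMaps.lastKey())
      }
    }
  }

  // Cached versions, newest first (the iteration order of the reversed map).
  def versions: List[Long] = synchronized {
    val builder = List.newBuilder[Long]
    val it = loadedMaps.keySet().iterator()
    while (it.hasNext) builder += it.next()
    builder.result()
  }
}
```

With a limit of 2, putting versions 1, 2 and 3 in that order leaves versions 3 and 2 in the cache. The reversed ordering keeps the newest version at the head of the map, so the oldest entry is always the last key.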