Streaming Listeners

Streaming listeners are notified of streaming events such as a batch being submitted, started, or completed.

Streaming listeners implement the org.apache.spark.streaming.scheduler.StreamingListener interface and process StreamingListenerEvent events.
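As an illustration, a custom listener could look like the minimal sketch below. It assumes the spark-streaming artifact is on the classpath; the BatchCountingListener class and its counters are hypothetical names made up for this example, and only the StreamingListener trait and its event classes come from Spark.

```scala
import java.util.concurrent.atomic.AtomicLong

import org.apache.spark.streaming.scheduler.{
  StreamingListener,
  StreamingListenerBatchCompleted,
  StreamingListenerBatchStarted,
  StreamingListenerBatchSubmitted
}

// Hypothetical listener (not part of Spark): counts batch lifecycle events.
class BatchCountingListener extends StreamingListener {
  val submitted = new AtomicLong(0)
  val started = new AtomicLong(0)
  val completed = new AtomicLong(0)

  // Called when a batch of jobs has been submitted for processing.
  override def onBatchSubmitted(batchSubmitted: StreamingListenerBatchSubmitted): Unit = {
    submitted.incrementAndGet()
    println(s"Batch ${batchSubmitted.batchInfo.batchTime} submitted")
  }

  // Called when processing of a batch has started.
  override def onBatchStarted(batchStarted: StreamingListenerBatchStarted): Unit = {
    started.incrementAndGet()
  }

  // Called when processing of a batch has completed.
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    completed.incrementAndGet()
  }
}
```

Such a listener would typically be registered with ssc.addStreamingListener(new BatchCountingListener) on an existing StreamingContext (here called ssc) before the context is started.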

The following streaming listeners are available in Spark Streaming:

StreamingListenerEvent Events

StreamingJobProgressListener

StreamingJobProgressListener is a streaming listener that collects information for StreamingSource and the Streaming page in the web UI.

Note
It is created when StreamingContext is created and is later registered as both a StreamingListener and a SparkListener when the Streaming tab is created.

onBatchSubmitted

For StreamingListenerBatchSubmitted(batchInfo: BatchInfo) events, StreamingJobProgressListener stores the batch information (batchInfo) in the internal waitingBatchUIData registry, keyed by batch time.

The number of entries in the waitingBatchUIData registry contributes to numUnprocessedBatches (together with runningBatchUIData), waitingBatches, and retainedBatches. The registry is also used to look up the batch data for a given batch time (in getBatchUIData).

numUnprocessedBatches and waitingBatches are used in StreamingSource.
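The bookkeeping described above can be modelled with a small, dependency-free sketch. This is not Spark's actual code: BatchData and WaitingBatchRegistry are hypothetical stand-ins, batch times are plain milliseconds, and the running registry is left empty because how batches become running is not covered here.

```scala
import scala.collection.mutable

// Simplified stand-in for the per-batch UI data; the real class carries more fields.
final case class BatchData(batchTimeMs: Long, numRecords: Long)

// Hypothetical model of the registries, keyed by batch time in milliseconds.
class WaitingBatchRegistry {
  private val waitingBatchUIData = mutable.HashMap.empty[Long, BatchData]
  // Never populated in this sketch; it is only here to show how the count is formed.
  private val runningBatchUIData = mutable.HashMap.empty[Long, BatchData]

  // Mirrors the onBatchSubmitted step: remember the batch under its batch time.
  def onBatchSubmitted(batch: BatchData): Unit = synchronized {
    waitingBatchUIData(batch.batchTimeMs) = batch
  }

  // Waiting batches, oldest first.
  def waitingBatches: Seq[BatchData] = synchronized {
    waitingBatchUIData.values.toSeq.sortBy(_.batchTimeMs)
  }

  // Waiting and running batches together count as unprocessed.
  def numUnprocessedBatches: Long = synchronized {
    (waitingBatchUIData.size + runningBatchUIData.size).toLong
  }

  // Look up the data recorded for a batch time, if any.
  def getBatchUIData(batchTimeMs: Long): Option[BatchData] = synchronized {
    waitingBatchUIData.get(batchTimeMs).orElse(runningBatchUIData.get(batchTimeMs))
  }
}
```

Keying the registry by batch time is what makes the getBatchUIData lookup a plain map access, while the counters are derived from the registry sizes rather than tracked separately.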

Note
waitingBatches and runningBatches are displayed together under Active Batches in the Streaming tab of the web UI.

onBatchStarted

Caution
FIXME

onBatchCompleted

Caution
FIXME

Retained Batches

retainedBatches are the waiting, running, and completed batches that the web UI uses to display streaming statistics.

The number of retained batches is controlled by the spark.streaming.ui.retainedBatches configuration property.
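A minimal sketch of setting the property programmatically follows. It assumes spark-core is on the classpath; the object name, application name, master URL, and the value 200 are arbitrary choices for illustration (Spark's configuration reference lists 1000 as the default).

```scala
import org.apache.spark.SparkConf

// Hypothetical holder object; only SparkConf and the property name come from Spark.
object RetainedBatchesConf {
  val conf: SparkConf = new SparkConf()
    .setMaster("local[2]")                              // assumption: local run
    .setAppName("retained-batches-demo")                // arbitrary name
    .set("spark.streaming.ui.retainedBatches", "200")   // arbitrary example value
}
```

The resulting SparkConf would then be passed to the StreamingContext constructor. The same property can also be given on the command line, for example with --conf spark.streaming.ui.retainedBatches=200 when using spark-submit.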