== Settings

The following settings are used to configure Spark Streaming applications.

CAUTION: FIXME Describe how to set them in streaming applications.
* `spark.streaming.kafka.maxRetries` (default: `1`) sets the number of connection attempts to Kafka brokers.

* `spark.streaming.receiver.writeAheadLog.enable` (default: `false`) controls which `ReceivedBlockHandler` to use: `WriteAheadLogBasedBlockHandler` or `BlockManagerBasedBlockHandler`.

* `spark.streaming.receiver.blockStoreTimeout` (default: `30`) is the time (in seconds) to wait until both writes to a write-ahead log and `BlockManager` complete successfully.

* `spark.streaming.clock` (default: `org.apache.spark.util.SystemClock`) specifies a fully-qualified class name that extends `org.apache.spark.util.Clock` to represent time. It is used in `JobGenerator`.

* `spark.streaming.ui.retainedBatches` (default: `1000`) controls the number of `BatchUIData` elements about completed batches that are kept in a first-in-first-out (FIFO) queue and used to display statistics on the Streaming page in web UI.

* `spark.streaming.receiverRestartDelay` (default: `2000`) is the time interval between stopping a receiver and starting it again.

* `spark.streaming.concurrentJobs` (default: `1`) is the number of concurrent jobs, i.e. threads in the streaming-job-executor thread pool.

* `spark.streaming.stopSparkContextByDefault` (default: `true`) controls whether (`true`) or not (`false`) to stop the underlying `SparkContext` (regardless of whether this `StreamingContext` has been started).

* `spark.streaming.kafka.maxRatePerPartition` (default: `0`), when non-zero, sets the maximum number of messages per partition.

* `spark.streaming.manualClock.jump` (default: `0`) offsets (aka jumps) the system time, i.e. adds its value to the checkpoint time, when the clock in use is a subclass of `org.apache.spark.util.ManualClock`. It is used when `JobGenerator` is restarted from a checkpoint.

* `spark.streaming.unpersist` (default: `true`) controls whether output streams should unpersist old RDDs.

* `spark.streaming.gracefulStopTimeout` (default: 10 * batch interval)

* `spark.streaming.stopGracefullyOnShutdown` (default: `false`) controls whether to stop `StreamingContext` gracefully and is used by the `stopOnShutdown` shutdown hook.
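As a minimal sketch of how such settings could be applied (the property values below are arbitrary examples, not recommendations): the settings above are ordinary Spark properties, so one way to set them is on a `SparkConf` before the `StreamingContext` is created.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// A minimal sketch: Spark Streaming settings are regular Spark properties,
// so they can be set on SparkConf before the StreamingContext is created.
// The values below are arbitrary examples, not recommendations.
val conf = new SparkConf()
  .setAppName("streaming-settings-demo")
  .setMaster("local[2]")
  .set("spark.streaming.ui.retainedBatches", "100")
  .set("spark.streaming.concurrentJobs", "2")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

// Batch interval of 5 seconds; the settings take effect for this context.
val ssc = new StreamingContext(conf, Seconds(5))
```

The same properties can also be passed on the command line with `spark-submit --conf`, e.g. `--conf spark.streaming.concurrentJobs=2`.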

== Checkpointing

== Back Pressure