Spark, Akka and Netty

Spark uses Akka actors for RPC and messaging, and Akka in turn uses Netty.

Netty is also used for moving bulk data:

  • Shuffle data can optionally be transferred using Netty. By default, NIO is used directly to transfer shuffle data.

  • For broadcast data (driver-to-all-worker data transfer), Jetty is used by default.

Tip
Review org.apache.spark.util.AkkaUtils to learn about the various utilities using Akka.
  • sparkMaster is the name of Actor System for the master in Spark Standalone, i.e. akka://sparkMaster is the Akka URL.

  • Akka is configured for remote actors (via akka.actor.provider = "akka.remote.RemoteActorRefProvider").

  • Enable INFO logging for the org.apache.spark.util.Utils class to see messages from Akka-related functions.

  • Enable DEBUG logging for org.apache.spark.rpc.akka.AkkaRpcEnv to see RPC messages.

  • spark.akka.threads (default: 4)

  • spark.akka.batchSize (default: 15)

  • spark.akka.frameSize (default: 128) configures the maximum frame size for Akka messages in MB

  • spark.akka.logLifecycleEvents (default: false)

  • spark.akka.logAkkaConfig (default: false)

  • spark.akka.heartbeat.pauses (default: 6000s)

  • spark.akka.heartbeat.interval (default: 1000s)

  • Settings in the properties file whose names start with akka. are also supported.
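
The settings and logger names above could be turned into command-line and logging configuration roughly as follows. This is a minimal sketch in plain Python that only formats strings and does not call Spark. The override values (8 threads, lifecycle logging on) are hypothetical and chosen for illustration, the helper names to_conf_args and to_log4j_lines are made up for this example, and the logger lines assume the log4j 1.x properties syntax.

```python
import shlex

# Hypothetical override values, chosen only for illustration.
overrides = {
    "spark.akka.threads": 8,
    "spark.akka.logLifecycleEvents": True,
}


def to_conf_args(settings):
    """Render settings as spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(settings.items()):
        # Write booleans as lowercase true/false, the form they
        # usually take in Spark configuration.
        if isinstance(value, bool):
            value = str(value).lower()
        args += ["--conf", f"{key}={value}"]
    return args


def to_log4j_lines(levels):
    """Render logger levels in log4j 1.x properties syntax (assumed format)."""
    return [f"log4j.logger.{name}={level}" for name, level in sorted(levels.items())]


conf_args = to_conf_args(overrides)
log4j_lines = to_log4j_lines({
    "org.apache.spark.rpc.akka.AkkaRpcEnv": "DEBUG",
    "org.apache.spark.util.Utils": "INFO",
})

# Shell-quoted arguments that could be appended to a spark-submit command.
print(" ".join(shlex.quote(arg) for arg in conf_args))
# Lines that could be appended to a log4j.properties file.
print("\n".join(log4j_lines))
```

The same key=value pairs could instead be placed in the properties file, one per line, which is also where settings starting with akka. would go.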