CoarseGrainedExecutorBackend
manages a single executor object. The internal executor
object is created after a connection to the driver is established (i.e. after RegisteredExecutor has arrived).
CoarseGrainedExecutorBackend
is an executor backend for coarse-grained executors that live until the executor backend terminates.
CoarseGrainedExecutorBackend
registers itself as a RPC Endpoint under the name Executor.
When started it connects to driverUrl
(given as an option on command line), i.e. CoarseGrainedSchedulerBackend, for tasks to run.
Caution
|
What are RegisterExecutor and RegisterExecutorResponse ? Why does CoarseGrainedExecutorBackend send it in onStart ?
|
When it cannot connect to driverUrl
, it terminates (with the exit code 1
).
Caution
|
What are SPARK_LOG_URL_ env vars? Who sets them?
|
When the driver terminates, CoarseGrainedExecutorBackend
exits (with exit code 1
).
ERROR Driver [remoteAddress] disassociated! Shutting down.
All task status updates are sent along to driverRef
as StatusUpdate
messages.
Tip
|
Enable Add the following line to
|
Note
|
onStart is a RpcEndpoint callback method that is executed before a RPC endpoint starts to handle messages.
|
When onStart
is executed, it prints out the following INFO message to the logs:
INFO CoarseGrainedExecutorBackend: Connecting to driver: [driverUrl]
It then accesses the RpcEndpointRef for the driver (using the constructor’s driverUrl) and eventually initializes the internal driver that it will send a blocking RegisterExecutor
message to.
If there is an issue while registering the executor, you should see the following ERROR message in the logs and process exits (with the exit code 1
).
ERROR Cannot register with driver: [driverUrl]
Note
|
The RegisterExecutor message contains executorId , the RpcEndpointRef to itself, cores , and log URLs.
|
driver
is an optional RpcEndpointRef for the driver.
Tip
|
See Starting RpcEndpoint (onStart method) to learn how it is initialized. |
The driver’s URL is of the format spark://[RpcEndpoint name]@[hostname]:[port]
, e.g. spark://[email protected]:64859
.
CoarseGrainedExecutorBackend is a command-line application (it comes with main
method).
It accepts the following options:
-
--driver-url
(required) - the driver’s URL. See driver’s URL.
-
--executor-id
(required) - the executor’s id -
--hostname
(required) - the name of the host -
--cores
(required) - the number of cores (must be more than0
) -
--app-id
(required) - the id of the application -
--worker-url
- the worker’s URL, e.g.spark://[email protected]:64557
-
--user-class-path
- a URL/path to a resource to be added to CLASSPATH; can be specified multiple times.
Unrecognized options or required options missing cause displaying usage help and exit.
$ ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend
Usage: CoarseGrainedExecutorBackend [options]
Options are:
--driver-url <driverUrl>
--executor-id <executorId>
--hostname <hostname>
--cores <cores>
--app-id <appid>
--worker-url <workerUrl>
--user-class-path <url>
It first fetches Spark properties from CoarseGrainedSchedulerBackend (using the driverPropsFetcher
RPC Environment and the endpoint reference given in driver’s URL).
For this, it creates SparkConf
, reads spark.executor.port
setting (defaults to 0
) and creates the driverPropsFetcher
RPC Environment in client mode. The RPC environment is used to resolve the driver’s endpoint to post RetrieveSparkProps
message.
It sends a (blocking) RetrieveSparkProps
message to the driver (using the value for driverUrl
command-line option). When the response (the driver’s SparkConf
) arrives it adds spark.app.id
(using the value for appid
command-line option) and creates a brand new SparkConf
.
If spark.yarn.credentials.file
is set, …FIXME
A SparkEnv is created using SparkEnv.createExecutorEnv (with isLocal
being false
).
Caution
|
FIXME |
Caution
|
FIXME Where is org.apache.spark.executor.CoarseGrainedExecutorBackend used?
|
It is used in:
-
SparkDeploySchedulerBackend
-
CoarseMesosSchedulerBackend
-
SparkClassCommandBuilder
- ???
RegisteredExecutor(hostname)
When a RegisteredExecutor
message arrives, you should see the following INFO in the logs:
INFO CoarseGrainedExecutorBackend: Successfully registered with driver
The internal executor is created using executorId
constructor parameter, with hostname
that has arrived and others.
Note
|
The message is sent after CoarseGrainedSchedulerBackend handles a RegisterExecutor message.
|
RegisterExecutorFailed(message)
When a RegisterExecutorFailed
message arrives, the following ERROR is printed out to the logs:
ERROR CoarseGrainedExecutorBackend: Slave registration failed: [message]
CoarseGrainedExecutorBackend
then exits with the exit code 1
.
LaunchTask(data: SerializableBuffer)
The LaunchTask
handler deserializes TaskDescription
from data
(using the global closure Serializer).
Note
|
LaunchTask message is sent by CoarseGrainedSchedulerBackend.launchTasks.
|
INFO CoarseGrainedExecutorBackend: Got assigned task [taskId]
It then launches the task on the executor (using Executor.launchTask method).
If however the internal executor
field has not been created yet, it prints out the following ERROR to the logs:
ERROR CoarseGrainedExecutorBackend: Received LaunchTask command but executor was null
And it then exits.
KillTask(taskId, _, interruptThread)
message kills a task (calls Executor.killTask
).
If an executor has not been initialized yet (FIXME: why?), the following ERROR message is printed out to the logs and CoarseGrainedExecutorBackend exits:
ERROR Received KillTask command but executor was null
StopExecutor
message handler is receive-reply and blocking. When received, the handler prints the following INFO message to the logs:
INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
It then sends a Shutdown
message to itself.