Pool is a Schedulable entity that represents a tree of TaskSetManagers, i.e. it contains a collection of TaskSetManagers or the Pools thereof.
A Pool has a mandatory name, a scheduling mode, and an initial minShare and weight, all of which are defined when it is created.
Note: An instance of Pool is created when TaskSchedulerImpl is initialized.
Note: The TaskScheduler Contract and Schedulable Contract both require that their entities have rootPool of type Pool.
Using the scheduling mode (given when a Pool object is created), Pool selects a SchedulingAlgorithm and sets taskSetSchedulingAlgorithm:
- FIFOSchedulingAlgorithm for FIFO scheduling mode.
- FairSchedulingAlgorithm for FAIR scheduling mode.
It throws an IllegalArgumentException when an unsupported scheduling mode is passed in:
Unsupported spark.scheduler.mode: [schedulingMode]
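The selection described above can be sketched as a simple pattern match. This is a minimal illustration, not Spark's source: the Mode values are hypothetical stand-ins for SchedulingMode, NONE is an assumed third value used only to exercise the error path, and the algorithms are represented by their names.

```scala
object SchedulingModeSketch {
  // Hypothetical stand-ins for the scheduling modes (not Spark's own type).
  sealed trait Mode
  case object FIFO extends Mode
  case object FAIR extends Mode
  case object NONE extends Mode // assumed extra value, used to hit the error path

  // Returns the name of the algorithm that would be selected for a mode,
  // or throws with the message quoted in the text for anything else.
  def algorithmFor(mode: Mode): String = mode match {
    case FIFO => "FIFOSchedulingAlgorithm"
    case FAIR => "FairSchedulingAlgorithm"
    case other =>
      throw new IllegalArgumentException(s"Unsupported spark.scheduler.mode: $other")
  }
}
```

Calling SchedulingModeSketch.algorithmFor(SchedulingModeSketch.FAIR) yields "FairSchedulingAlgorithm", while passing NONE raises the exception.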
Tip: Read about the scheduling modes in SchedulingMode.
Note: taskSetSchedulingAlgorithm is used in getSortedTaskSetQueue.
Note: addSchedulable is part of the Schedulable Contract.
addSchedulable adds a Schedulable to the schedulableQueue and schedulableNameToSchedulable.
More importantly, it sets the Schedulable entity's parent to itself.
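The three effects of addSchedulable can be sketched as follows. Node and SimplePool are hypothetical, stripped-down stand-ins for Schedulable and Pool, and the use of ConcurrentLinkedQueue for schedulableQueue is an assumption made for this illustration.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}

object AddSchedulableSketch {
  // Hypothetical minimal Schedulable: a name plus a mutable parent reference.
  class Node(val name: String) {
    var parent: SimplePool = null
  }

  // Hypothetical minimal Pool holding the two collections named in the text.
  class SimplePool(poolName: String) extends Node(poolName) {
    val schedulableQueue = new ConcurrentLinkedQueue[Node]
    val schedulableNameToSchedulable = new ConcurrentHashMap[String, Node]

    def addSchedulable(s: Node): Unit = {
      schedulableQueue.add(s)                     // 1. enqueue
      schedulableNameToSchedulable.put(s.name, s) // 2. register by name
      s.parent = this                             // 3. become the parent
    }
  }
}
```

After pool.addSchedulable(child), the child is reachable both through the queue and by name, and child.parent points back at the pool.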
Note: getSortedTaskSetQueue is part of the Schedulable Contract.
getSortedTaskSetQueue sorts all the Schedulables in schedulableQueue using a SchedulingAlgorithm (the internal taskSetSchedulingAlgorithm).
Note: getSortedTaskSetQueue is called when TaskSchedulerImpl processes executor resource offers.
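As a rough sketch of the sorting step, the following sorts a group's direct children with a comparator and then flattens nested groups in that order. Leaf and Group are hypothetical stand-ins for TaskSetManager and Pool, and the recursive flattening is an assumption that follows from the tree structure described earlier rather than something stated here.

```scala
object SortedQueueSketch {
  // Hypothetical tree: a Leaf plays the role of a TaskSetManager,
  // a Group plays the role of a Pool with its own comparator.
  sealed trait Entry { def priority: Int }
  final case class Leaf(name: String, priority: Int) extends Entry
  final case class Group(priority: Int, children: Seq[Entry],
                         lessThan: (Entry, Entry) => Boolean) extends Entry

  // Sort the direct children with the group's comparator, then flatten
  // each child in that order so that only leaves remain.
  def sortedLeaves(e: Entry): Seq[Leaf] = e match {
    case leaf: Leaf => Seq(leaf)
    case Group(_, children, lessThan) =>
      children.sortWith(lessThan).flatMap(sortedLeaves)
  }
}
```

The comparator passed to Group stands in for taskSetSchedulingAlgorithm; swapping it changes the order without touching the traversal.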
schedulableNameToSchedulable = new ConcurrentHashMap[String, Schedulable]
schedulableNameToSchedulable is a lookup table of Schedulable objects by their names.
Besides the obvious usage in housekeeping methods such as addSchedulable, removeSchedulable, and getSchedulableByName from the Schedulable Contract, its only other use is in SparkContext.getPoolForName.
SchedulingAlgorithm is the interface for a sorting algorithm to sort Schedulables.
There are currently two SchedulingAlgorithms:
- FIFOSchedulingAlgorithm for FIFO scheduling mode.
- FairSchedulingAlgorithm for FAIR scheduling mode.
FIFOSchedulingAlgorithm is a scheduling algorithm that compares Schedulables by their priority first and, when the priorities are equal, by their stageId.
Note: priority and stageId are part of the Schedulable Contract.
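The two-level FIFO comparison can be sketched as below. Item is a hypothetical carrier for the two fields involved, and the direction of the ordering (lower values are scheduled first) is an assumption of this sketch.

```scala
object FifoSketch {
  // Hypothetical carrier for the two fields the comparison needs.
  final case class Item(priority: Int, stageId: Int)

  // true when a should be scheduled before b:
  // compare priority first, fall back to stageId only on a tie.
  def fifoLessThan(a: Item, b: Item): Boolean =
    if (a.priority != b.priority) a.priority < b.priority
    else a.stageId < b.stageId
}
```

For example, sorting Item(1, 2), Item(0, 5), Item(1, 0) with fifoLessThan puts Item(0, 5) first because of its lower priority, then orders the two remaining items by stageId.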
Caution: FIXME A picture is worth a thousand words. How to picture the algorithm?
FairSchedulingAlgorithm is a scheduling algorithm that compares Schedulables by their minShare, runningTasks, and weight.
Note: minShare, runningTasks, and weight are part of the Schedulable Contract.
For each input Schedulable, minShareRatio is computed as runningTasks divided by minShare (with minShare treated as at least 1), while taskToWeightRatio is runningTasks divided by weight.
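The two ratios can be written out as a small sketch. Stats is a hypothetical carrier for the three fields, and reading "at least 1" as a lower bound on the minShare divisor is an interpretation made for this example; how the ratios are then combined into an ordering is not covered here.

```scala
object FairRatiosSketch {
  // Hypothetical carrier for the three fields the ratios need.
  final case class Stats(runningTasks: Int, minShare: Int, weight: Int)

  // runningTasks / minShare, with the divisor clamped to at least 1
  // so that a minShare of 0 does not divide by zero.
  def minShareRatio(s: Stats): Double =
    s.runningTasks.toDouble / math.max(s.minShare, 1).toDouble

  // runningTasks / weight
  def taskToWeightRatio(s: Stats): Double =
    s.runningTasks.toDouble / s.weight.toDouble
}
```

With Stats(runningTasks = 3, minShare = 6, weight = 1), minShareRatio is 0.5 and taskToWeightRatio is 3.0; with minShare = 0 the divisor falls back to 1.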