YarnScheduler - TaskScheduler for Client Deploy Mode

YarnScheduler is the TaskScheduler for Spark on YARN in client deploy mode.

It is a custom TaskSchedulerImpl that can compute the rack of a host, i.e. it comes with a specialized getRackForHost.

It also sets the level of the org.apache.hadoop.yarn.util.RackResolver logger to WARN unless a level has already been set.
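
The "set the level only if it is not set already" pattern can be sketched as follows. This is a minimal illustration, not Spark's code: it assumes `java.util.logging` as a stand-in for whatever logging backend Spark uses, so `Level.WARNING` plays the role of WARN, and the names `RackResolverLogging` and `quietUnlessConfigured` are hypothetical.

```scala
import java.util.logging.{Level, Logger}

object RackResolverLogging {
  // Name of the Hadoop class whose logger the text refers to.
  val LoggerName = "org.apache.hadoop.yarn.util.RackResolver"

  // Hypothetical helper: raise the level to WARNING only when no explicit
  // level was configured. In java.util.logging, getLevel returns null when
  // the logger inherits its level from a parent logger.
  def quietUnlessConfigured(name: String): Logger = {
    val logger = Logger.getLogger(name)
    if (logger.getLevel == null) {
      logger.setLevel(Level.WARNING)
    }
    logger
  }
}
```

Calling `RackResolverLogging.quietUnlessConfigured(RackResolverLogging.LoggerName)` leaves a level chosen by the user untouched, which is the point of the check: a default should not override explicit configuration.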

Tracking Racks per Hosts and Ports (getRackForHost method)

getRackForHost attempts to compute the rack for a host.

Note
getRackForHost overrides the parent TaskSchedulerImpl’s getRackForHost.

It simply uses Hadoop’s org.apache.hadoop.yarn.util.RackResolver to resolve a hostname to its network location, i.e. a rack.
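
The override described above can be modelled without Spark or Hadoop on the classpath. The sketch below makes several assumptions: `BaseScheduler` and `RackAwareScheduler` are hypothetical stand-ins for TaskSchedulerImpl and YarnScheduler, the `resolve` function stands in for the RackResolver lookup, the parent is assumed to report no rack by default, and the input is assumed to be either `host` or `host:port`.

```scala
object RackLookupSketch {
  // Stand-in for the parent scheduler (assumption: the rack is unknown by default).
  class BaseScheduler {
    def getRackForHost(hostPort: String): Option[String] = None
  }

  // Stand-in for a rack-aware scheduler. `resolve` is a hypothetical function
  // in the role of the rack resolver: hostname => network location, or null
  // when nothing is known about the host.
  class RackAwareScheduler(resolve: String => String) extends BaseScheduler {
    override def getRackForHost(hostPort: String): Option[String] = {
      // Drop an optional ":port" suffix; IPv6 literals are not handled here.
      val host = hostPort.split(":")(0)
      // Option(null) is None, so an unresolved host yields no rack.
      Option(resolve(host))
    }
  }
}
```

For example, with a lookup table `Map("node1" -> "/rack-a")` and `resolve` defined as `h => table.getOrElse(h, null)`, `getRackForHost("node1:7337")` returns `Some("/rack-a")`, while a host missing from the table returns `None`. Returning `Option[String]` rather than a possibly-null string keeps the "rack unknown" case explicit for callers.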