
Figures 5.5 and 5.6 illustrate the job execution (request/response) flow in YARN. The numbered steps that follow match the numbering used in the two figures.

FIGURE 5.5 Job execution flow in YARN, part 1.

FIGURE 5.6 Job execution flow in YARN, part 2.

1. The Node Managers register with the Resource Manager (each with an individual Node ID so that the Resource Manager can identify them) upon restart or when they join the cluster. Through heartbeat signals, they send the Resource Manager information about the containers (cluster resources) they hold, whether in use or free. Based on this information, the Resource Manager builds the inventory of cluster resources.
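The registration and heartbeat bookkeeping described here can be sketched as a small simulation. This is only an illustration, not YARN code: the names `Heartbeat` and `ResourceInventory` are hypothetical, and resources are counted as whole containers for simplicity, whereas a real cluster tracks quantities such as memory and CPU.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; real YARN heartbeats carry more fields."""
    node_id: str
    total_containers: int
    used_containers: int


class ResourceInventory:
    """Toy stand-in for the Resource Manager's view of cluster resources."""

    def __init__(self):
        self.nodes = {}  # node_id -> latest Heartbeat

    def register(self, node_id, total_containers):
        # A Node Manager registers on restart or when it joins the cluster.
        self.nodes[node_id] = Heartbeat(node_id, total_containers, 0)

    def on_heartbeat(self, hb):
        # Only registered nodes may update the inventory.
        if hb.node_id not in self.nodes:
            raise KeyError(f"unregistered node: {hb.node_id}")
        self.nodes[hb.node_id] = hb

    def free_containers(self):
        # Cluster-wide free capacity, derived from the latest heartbeats.
        return sum(hb.total_containers - hb.used_containers
                   for hb in self.nodes.values())


inventory = ResourceInventory()
inventory.register("node-1", total_containers=8)
inventory.register("node-2", total_containers=4)
inventory.on_heartbeat(Heartbeat("node-1", total_containers=8, used_containers=3))
print(inventory.free_containers())  # 5 free on node-1 plus 4 free on node-2
```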

2. To submit a job in YARN, the job client reaches out to the Application Manager component of the Resource Manager (2A) with the necessary specifications. The Resource Manager requests that the Scheduler component allocate the first container so that it can initiate an instance of the application-specific Application Master (2B). At this time, the Application Manager informs the job client of the job's status (2C). Once the Application Master is initiated, the job client can talk to it directly for job status (2D). For MapReduce applications, the Application Master is MRAppMaster.

Note: Communication Between the Job Client and the Application Master

The instantiated Application Master registers itself with the Resource Manager so that the Resource Manager can identify it and provide information about it to the job client. Once the job client receives this information from the Resource Manager, it can communicate directly with its own Application Master.
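The submission handshake in step 2 and the registration described in this note can be modeled roughly as follows. It is a sketch under stated assumptions: the classes, the application ID format, and the `am-host-...:8888` address are all invented for illustration and do not correspond to the actual YARN client API.

```python
class ResourceManager:
    """Toy model: tracks which Application Master serves which application."""

    def __init__(self):
        self.am_addresses = {}   # app_id -> Application Master address
        self.next_id = 1

    def submit(self, job_spec):
        # 2A: the client submits the job; 2B: the first container is
        # allocated and the Application Master is launched in it.
        app_id = f"application_{self.next_id:04d}"   # hypothetical ID format
        self.next_id += 1
        master = ApplicationMaster(app_id, address=f"am-host-{app_id}:8888")
        master.register_with(self)
        return app_id            # 2C: status goes back to the client

    def register_master(self, app_id, address):
        self.am_addresses[app_id] = address

    def lookup_master(self, app_id):
        return self.am_addresses[app_id]


class ApplicationMaster:
    def __init__(self, app_id, address):
        self.app_id = app_id
        self.address = address

    def register_with(self, rm):
        # Registration lets the Resource Manager tell clients where we are.
        rm.register_master(self.app_id, self.address)


rm = ResourceManager()
app_id = rm.submit({"name": "wordcount"})
am_address = rm.lookup_master(app_id)   # 2D: the client talks to this address next
print(app_id, am_address)
```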

3. The Application Master requests cluster resources, defined in terms of containers, from the Scheduler component of the Resource Manager (3A). The Scheduler responds with allocated resources needed for job execution (3B). This does not necessarily have to happen at the beginning of job execution: During execution, the Application Master can request additional containers if there is a need or can release a container when its use is complete.

While allocating computing resources or containers, the Scheduler takes into account data locality information (hosts and racks) that was computed by InputFormat and stored inside InputSplits on HDFS. It tries to allocate a container on the same data node (on which the data exists); if it does not find any, the Scheduler looks for another data node in the same rack.
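The locality preference just described (same node first, then same rack) can be expressed as a short selection function. This is a simplified sketch, not the Scheduler's actual algorithm: `pick_node` and its arguments are hypothetical, and the final fallback to any free node is an assumption added so that the function always returns an answer.

```python
def pick_node(split_hosts, free_nodes, rack_of):
    """Return (node, locality) for one input split.

    split_hosts: hosts holding the split's data
    free_nodes:  nodes with spare capacity, in preference order
    rack_of:     mapping of node name -> rack name
    """
    # 1. Node-local: a free node that already stores the data.
    for node in free_nodes:
        if node in split_hosts:
            return node, "node-local"
    # 2. Rack-local: a free node in the same rack as one of the data hosts.
    data_racks = {rack_of[h] for h in split_hosts}
    for node in free_nodes:
        if rack_of[node] in data_racks:
            return node, "rack-local"
    # 3. Fallback (an assumption here): any free node, or nothing at all.
    return (free_nodes[0], "off-rack") if free_nodes else (None, "none")


rack_of = {"n1": "rackA", "n2": "rackA", "n3": "rackB"}
print(pick_node(["n1"], ["n1", "n3"], rack_of))  # ('n1', 'node-local')
print(pick_node(["n1"], ["n2", "n3"], rack_of))  # ('n2', 'rack-local')
print(pick_node(["n1"], ["n3"], rack_of))        # ('n3', 'off-rack')
```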

4. The Application Master reaches out to the Node Manager(s) to execute the tasks in the allocated containers. For a MapReduce job, the MRAppMaster application master retrieves the InputSplits (created by the job client and copied to HDFS), creates one map task per input split, and creates reduce tasks according to the mapreduce.job.reduces property. Based on the mapreduce.job.ubertask.enable property and the size of the MapReduce job, the Application Master (MRAppMaster) either executes the MapReduce job in its own JVM and container or reaches out to the Node Manager(s) for job execution.
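A rough sketch of these two decisions follows. The helper names are hypothetical. The size thresholds are read from the `mapreduce.job.ubertask.maxmaps`, `mapreduce.job.ubertask.maxreduces`, and `mapreduce.job.ubertask.maxbytes` properties, which the text above does not mention. The 128 MB fallback is an assumption standing in for the HDFS block size, and the real check also takes memory and CPU requirements into account.

```python
def plan_tasks(num_splits, conf):
    """Return (map_tasks, reduce_tasks) for a job."""
    # One map task per input split; the reduce count comes from configuration.
    return num_splits, int(conf.get("mapreduce.job.reduces", 1))


def runs_as_uber(num_maps, num_reduces, input_bytes, conf):
    """Simplified uber-mode check; the real one also considers memory and CPU."""
    if str(conf.get("mapreduce.job.ubertask.enable", "false")).lower() != "true":
        return False
    return (num_maps <= int(conf.get("mapreduce.job.ubertask.maxmaps", 9))
            and num_reduces <= int(conf.get("mapreduce.job.ubertask.maxreduces", 1))
            and input_bytes <= int(conf.get("mapreduce.job.ubertask.maxbytes",
                                            128 * 1024 * 1024)))


conf = {"mapreduce.job.ubertask.enable": "true", "mapreduce.job.reduces": "1"}
maps, reduces = plan_tasks(3, conf)
small_input = 10 * 1024 * 1024
print(maps, reduces, runs_as_uber(maps, reduces, small_input, conf))  # 3 1 True
print(runs_as_uber(20, 1, small_input, conf))                         # False
```

A job that fails the check is not rejected; it simply runs the normal way, with the Application Master asking Node Managers to launch task containers.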

5. The Node Manager executes the containers. This involves launching a Java process that makes the required resources (such as configuration or JAR files) available locally and then executes the map or reduce tasks.

The Node Manager kicks off the tasks and reports the resource utilization back to the Resource Manager (step 1 in Figure 5.5). When a task completes, the allocated container is released and the Resource Manager is informed so that the container can be reused for some other purpose.

6. The Application Master tracks the execution of the tasks and containers, monitors progress, and reruns the tasks if any failures occur. Tasks executed in the containers report their status to the parent Application Master, where the Application Master accumulates and aggregates status information and execution metrics.
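The tracking, rerun, and aggregation duties in step 6 can be illustrated with a small bookkeeping class. This is a toy model: `JobProgress` is a hypothetical name, the retry limit of 4 is an assumption chosen for the sketch, and overall progress is taken as a plain average of per-task progress.

```python
MAX_ATTEMPTS = 4  # assumed retry limit for this sketch


class JobProgress:
    """Toy Application Master bookkeeping for task status and retries."""

    def __init__(self, task_ids):
        self.progress = {t: 0.0 for t in task_ids}   # task -> 0.0..1.0
        self.attempts = {t: 1 for t in task_ids}     # task -> attempt number
        self.failed_job = False

    def report(self, task_id, fraction_done):
        # Tasks report their status to the parent Application Master.
        self.progress[task_id] = fraction_done

    def task_failed(self, task_id):
        # Rerun the task from scratch unless it has used up its attempts.
        if self.attempts[task_id] >= MAX_ATTEMPTS:
            self.failed_job = True
            return False
        self.attempts[task_id] += 1
        self.progress[task_id] = 0.0
        return True

    def overall(self):
        # Aggregate view handed to the job client.
        return sum(self.progress.values()) / len(self.progress)


job = JobProgress(["map_0", "map_1"])
job.report("map_0", 1.0)
job.report("map_1", 0.5)
print(job.overall())                          # 0.75
job.task_failed("map_1")                      # map_1 restarts as attempt 2
print(job.overall(), job.attempts["map_1"])   # 0.5 2
```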

7. The Application Master keeps updating the job client (directly, not via the Resource Manager) about the overall progress of the job execution. The job client polls the Application Master for a status update every second; the poll interval can be configured with the mapreduce.client.progressmonitor.pollinterval property. In classic MapReduce, each TaskTracker simply passes status information to the JobTracker, which is responsible for accumulating and aggregating it to assemble the current status of the job. In YARN, the Application Master accumulates and aggregates this information, and the client queries it from the Application Master without going through the Resource Manager.
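The client-side polling loop can be sketched as follows. This is only an illustration of the idea, not the real job client: `monitor` and the shape of the status dictionary are hypothetical, the property value is interpreted as milliseconds, and the `sleep` parameter exists solely so the example can run without actually waiting.

```python
import time


def monitor(get_status, conf, sleep=time.sleep):
    """Poll get_status() until the job finishes; return the reports seen."""
    interval_ms = int(conf.get("mapreduce.client.progressmonitor.pollinterval", 1000))
    reports = []
    while True:
        status = get_status()           # direct call to the Application Master
        reports.append(status["progress"])
        if status["done"]:
            return reports
        sleep(interval_ms / 1000.0)     # wait one poll interval


# Fake Application Master that finishes on the third poll.
replies = iter([
    {"progress": 0.2, "done": False},
    {"progress": 0.7, "done": False},
    {"progress": 1.0, "done": True},
])
waits = []
reports = monitor(lambda: next(replies),
                  {"mapreduce.client.progressmonitor.pollinterval": "250"},
                  sleep=waits.append)
print(reports, waits)  # [0.2, 0.7, 1.0] [0.25, 0.25]
```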

Note: Communication Between the Job Client and Application Master after Application Master Failure

When the Resource Manager instantiates an Application Master during job initialization, it passes the Application Master’s address to the job client. The job client then caches the address and reaches out directly to the Application Master for job progress updates. This way, after job initialization, the client bypasses the Resource Manager (reducing the overall load on the Resource Manager) and gets job status updates directly from the Application Master. But what happens when the Application Master fails? How does a job client know about the new Application Master instance and its new address? The job client keeps communicating with the Application Master at the cached address (which it got from the Resource Manager). Upon Application Master failure, the job client encounters a timeout exception while trying to reach the Application Master. It then reaches out to the Resource Manager again to find the address of the new Application Master instance. At this point, it discards the previously cached address and caches the new one; from then on, it communicates directly with the new instance of the Application Master.
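The cache-and-refresh behavior described in this note can be sketched like this. The `JobClient` class, the two callables it takes, and the `am-1:8888` and `am-2:8888` addresses are all hypothetical. Python's built-in `TimeoutError` stands in for whatever timeout the real client encounters, and the sketch retries only once after refreshing the address.

```python
class JobClient:
    """Toy client that caches the Application Master address."""

    def __init__(self, lookup_address, ask_master):
        self.lookup_address = lookup_address  # asks the Resource Manager
        self.ask_master = ask_master          # asks the Application Master
        self.cached = None
        self.lookups = 0

    def _refresh(self):
        self.cached = self.lookup_address()
        self.lookups += 1

    def status(self):
        if self.cached is None:
            self._refresh()
        try:
            return self.ask_master(self.cached)
        except TimeoutError:
            # Old instance is gone: discard the address and ask again once.
            self._refresh()
            return self.ask_master(self.cached)


live = {"address": "am-1:8888"}          # where the Application Master runs now


def ask_master(address):
    # Only the live instance answers; a stale address times out.
    if address != live["address"]:
        raise TimeoutError(address)
    return f"status from {address}"


client = JobClient(lambda: live["address"], ask_master)
print(client.status())                   # status from am-1:8888
live["address"] = "am-2:8888"            # the Application Master is restarted
print(client.status())                   # status from am-2:8888
print(client.lookups)                    # 2
```

Between failures, the Resource Manager is not contacted at all: repeated calls to `status()` reuse the cached address, which is the load reduction the note describes.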