(2) YARN workflow

Writing YARN Applications

The startup process as described in the documentation:

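An application submission client submits an application to the YARN ResourceManager. The ResourceManager (RM), the NodeManagers (NMs), and the ApplicationMaster (AM) then process it as follows.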

First, create a YarnClient object and start it. The client can then set up the application context, prepare the application's first container (the one that contains the ApplicationMaster), and submit the application.
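To make the "prepare the first container" step more concrete, here is a minimal sketch of the kind of launch information a client has to describe: the command to run, optional environment settings, and the local files or jars the process needs. It uses only the Java standard library. ToyLaunchSpec and its fields are hypothetical names chosen for illustration and are not part of the YARN API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the launch information a client hands over at submission.
// NOT the YARN API: the class and field names are hypothetical.
final class ToyLaunchSpec {
    final List<String> command;             // command line that starts the AM process
    final Map<String, String> environment;  // optional OS environment settings
    final List<String> localFiles;          // files/jars that must be available to the process

    ToyLaunchSpec(List<String> command,
                  Map<String, String> environment,
                  List<String> localFiles) {
        if (command.isEmpty()) {
            throw new IllegalArgumentException("a launch command is required");
        }
        // Defensive copies so the spec cannot change after it is "submitted".
        this.command = Collections.unmodifiableList(new ArrayList<>(command));
        this.environment = Collections.unmodifiableMap(new LinkedHashMap<>(environment));
        this.localFiles = Collections.unmodifiableList(new ArrayList<>(localFiles));
    }

    // Joins the command parts with single spaces, roughly as a shell would see them.
    String commandLine() {
        return String.join(" ", command);
    }
}
```

In the real API this role is played by the application submission context and the ContainerLaunchContext; the toy class only shows which pieces of information have to be supplied.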

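The RM launches the ApplicationMaster in the allocated container. The AM communicates with the YARN cluster and handles execution of the application. While the application is launching (during which the AM communicates with the RM asynchronously), the AM's main tasks are: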

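(1) Communicate with the RM to negotiate and allocate resources for future containers (through an AMRMClientAsync object, with the event-handling methods specified in an AMRMClientAsync.CallbackHandler);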

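(2) After containers have been allocated, communicate with the NodeManagers to launch the application's containers on their nodes (by launching a Runnable object that launches containers once they have been allocated; as part of launching a container, the AM must specify a ContainerLaunchContext that carries the launch information).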

While the application is executing, the ApplicationMaster communicates with the NodeManagers through an NMClientAsync object. All container events are handled by NMClientAsync.CallbackHandler.

A callback handler handles client start, stop, status update, and error events.
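The callback style described above can be illustrated with a small toy model: a request returns immediately, and the result is delivered later to a handler that was set on the client explicitly. This is only a sketch using the Java standard library. ToyAllocationHandler and ToyAsyncClient are hypothetical names, not part of the YARN API; the real AMRMClientAsync and NMClientAsync classes have their own, richer interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical callback interface, loosely modelled on the idea of a CallbackHandler.
interface ToyAllocationHandler {
    void onContainersAllocated(List<String> containerIds);
    void onError(Throwable error);
}

// Toy asynchronous client: requests return at once, results arrive via the handler.
final class ToyAsyncClient {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final ToyAllocationHandler handler;

    ToyAsyncClient(ToyAllocationHandler handler) {
        this.handler = handler; // the handler is attached to the client explicitly
    }

    // Asks for `count` containers; the answer is delivered on the worker thread.
    void requestContainers(final int count) {
        worker.execute(() -> {
            try {
                if (count <= 0) {
                    throw new IllegalArgumentException("count must be positive");
                }
                List<String> ids = new ArrayList<>();
                for (int i = 1; i <= count; i++) {
                    ids.add("container_" + i); // made-up identifiers
                }
                handler.onContainersAllocated(ids);
            } catch (RuntimeException e) {
                handler.onError(e);
            }
        });
    }

    // Stops accepting requests and waits briefly for pending callbacks to finish.
    void stop() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller never blocks on the request itself; everything that happens after allocation lives in the handler, which is the same division of labour the text describes for the AM.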

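(1) Create a YarnClient object and start it. The client then sets up the application context and submits the application to the ResourceManager.

(2) The RM instructs an NM to start the application's first container and to launch the ApplicationMaster in it.

(3) The AM registers with the RM.

(4) The AM requests resources from the RM's YARN Scheduler by polling.

(5) Once the AM has been granted resources (that is, it has received information about available nodes), it communicates with the NodeManagers (there may be several) and asks them to start the computation tasks.

(6) The NodeManagers start the tasks in containers according to the amount of resources and the runtime environment required.

(7) Each task reports its status and progress to the AM so that the AM can track how every task is doing.

(8) When the application finishes, the AM unregisters from the RM and shuts itself down.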
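The eight steps above impose an ordering: the AM cannot register before it has been started, and it should not unregister while tasks are still outstanding. The following sketch models that ordering as a tiny state machine. It is a toy written against the Java standard library only; ToyYarnFlow, its states, and its method names are hypothetical and do not correspond to any YARN class.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the eight-step workflow. NOT the YARN API: all names are hypothetical.
final class ToyYarnFlow {
    enum State { NEW, SUBMITTED, AM_STARTED, AM_REGISTERED, FINISHED }

    private State state = State.NEW;
    private int tasksToRun;    // tasks still waiting for a container
    private int tasksRunning;  // tasks launched but not yet reported as done
    private final List<String> events = new ArrayList<>();

    // Performs a transition only if the flow is currently in the expected state.
    private void move(State from, State to, String event) {
        if (state != from) {
            throw new IllegalStateException(event + " not allowed in state " + state);
        }
        state = to;
        events.add(event);
    }

    void submit(int taskCount) {            // step 1: client -> RM
        move(State.NEW, State.SUBMITTED, "submit");
        tasksToRun = taskCount;
    }

    void startApplicationMaster() {         // step 2: RM -> NM
        move(State.SUBMITTED, State.AM_STARTED, "startAM");
    }

    void registerApplicationMaster() {      // step 3: AM -> RM
        move(State.AM_STARTED, State.AM_REGISTERED, "registerAM");
    }

    // Steps 4-6: one polling round in which at most `free` containers are granted
    // and a task is started in each granted container.
    int poll(int free) {
        move(State.AM_REGISTERED, State.AM_REGISTERED, "poll");
        int granted = Math.min(free, tasksToRun);
        tasksToRun -= granted;
        tasksRunning += granted;
        return granted;
    }

    void taskFinished() {                   // step 7: task -> AM
        if (tasksRunning == 0) {
            throw new IllegalStateException("no running task");
        }
        tasksRunning--;
        events.add("taskFinished");
    }

    void unregister() {                     // step 8: AM -> RM
        if (tasksToRun + tasksRunning > 0) {
            throw new IllegalStateException("tasks still outstanding");
        }
        move(State.AM_REGISTERED, State.FINISHED, "unregisterAM");
    }

    State state() { return state; }
    List<String> events() { return events; }
}
```

A typical run calls submit, startApplicationMaster, and registerApplicationMaster once, then repeats poll until every task has a container, and finally calls unregister after all tasks have reported completion. Calling a step out of order raises an IllegalStateException.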

Original text:

The general concept is that an application submission client submits an application to the YARN ResourceManager (RM). This can be done through setting up a YarnClient object. After YarnClient is started, the client can then set up application context, prepare the very first container of the application that contains the ApplicationMaster (AM), and then submit the application. You need to provide information such as the details about the local files/jars that need to be available for your application to run, the actual command that needs to be executed (with the necessary command line arguments), any OS environment settings (optional), etc. Effectively, you need to describe the Unix process(es) that needs to be launched for your ApplicationMaster.

The YARN ResourceManager will then launch the ApplicationMaster (as specified) on an allocated container. The ApplicationMaster communicates with YARN cluster, and handles application execution. It performs operations in an asynchronous fashion. During application launch time, the main tasks of the ApplicationMaster are: a) communicating with the ResourceManager to negotiate and allocate resources for future containers, and b) after container allocation, communicating YARN NodeManagers (NMs) to launch application containers on them. Task a) can be performed asynchronously through an AMRMClientAsync object, with event handling methods specified in a AMRMClientAsync.CallbackHandler type of event handler. The event handler needs to be set to the client explicitly. Task b) can be performed by launching a runnable object that then launches containers when there are containers allocated. As part of launching this container, the AM has to specify the ContainerLaunchContext that has the launch information such as command line specification, environment, etc.

References:

(1) Hadoop: The Definitive Guide, 4th Edition

(2) http://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
