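Azkaban is a batch workflow job scheduler open-sourced by LinkedIn. It runs a set of jobs and processes in a specific order within a workflow. Azkaban defines a KV (key-value) file format for declaring dependencies between jobs, and provides an easy-to-use web UI for maintaining and tracking your workflows.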
azkaban-web-server-2.5.0.tar.gz
azkaban-executor-server-2.5.0.tar.gz
azkaban-sql-script-2.5.0.tar.gz
mysql
Here, azkaban-web-server-2.5.0.tar.gz is the web server, azkaban-executor-server-2.5.0.tar.gz is the executor server, and azkaban-sql-script-2.5.0.tar.gz contains the SQL scripts to run.
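After extracting and installing each of them, we also need to create a database in MySQL and then run the SQL script that Azkaban provides to create the tables it needs.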
mysql -uroot -p
mysql> create database azkaban;
mysql> use azkaban;
Database changed
mysql> source /home/fantj/azkaban/azkaban-2.5.0/create-all-sql-2.5.0.sql;
mysql> show tables;
+------------------------+
| Tables_in_azkaban      |
+------------------------+
| active_executing_flows |
| active_sla             |
| execution_flows        |
| execution_jobs         |
| execution_logs         |
| project_events         |
| project_files          |
| project_flows          |
| project_permissions    |
| project_properties     |
| project_versions       |
| projects               |
| properties             |
| schedules              |
| triggers               |
+------------------------+
15 rows in set (0.00 sec)
keytool -keystore keystore -alias jetty -genkey -keyalg RSA
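This generates a keystore certificate file in the current directory. The command prompts you for some information, such as your name and organization; just follow the prompts.
Next, set the server's time zone:
[root@s166 azkaban]# tzselect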
Please identify a location so that time zone rules can be set correctly.
Please select a continent or ocean.
1) Africa
2) Americas
3) Antarctica
4) Arctic Ocean
5) Asia
6) Atlantic Ocean
7) Australia
8) Europe
9) Indian Ocean
10) Pacific Ocean
11) none - I want to specify the time zone using the Posix TZ format.
#? 5
Please select a country.
 1) Afghanistan   18) Israel            35) Palestine
 2) Armenia       19) Japan             36) Philippines
 3) Azerbaijan    20) Jordan            37) Qatar
 4) Bahrain       21) Kazakhstan        38) Russia
 5) Bangladesh    22) Korea (North)     39) Saudi Arabia
 6) Bhutan        23) Korea (South)     40) Singapore
 7) Brunei        24) Kuwait            41) Sri Lanka
 8) Cambodia      25) Kyrgyzstan        42) Syria
 9) China         26) Laos              43) Taiwan
10) Cyprus        27) Lebanon           44) Tajikistan
11) East Timor    28) Macau             45) Thailand
12) Georgia       29) Malaysia          46) Turkmenistan
13) Hong Kong     30) Mongolia          47) United Arab Emirates
14) India         31) Myanmar (Burma)   48) Uzbekistan
15) Indonesia     32) Nepal             49) Vietnam
16) Iran          33) Oman              50) Yemen
17) Iraq          34) Pakistan
#? 9
Please select one of the following time zone regions.
1) Beijing Time
2) Xinjiang Time
#? 1
The following information has been given:
China
Beijing Time
Therefore TZ='Asia/Shanghai' will be used.
Local time is now: Sat Jul 28 18:29:58 CST 2018.
Universal Time is now: Sat Jul 28 10:29:58 UTC 2018.
Is the above information OK?
1) Yes
2) No
#? 1
You can make this change permanent for yourself by appending the line
TZ='Asia/Shanghai'; export TZ
to the file '.profile' in your home directory; then log out and log in again.
Here is that TZ value again, this time on standard output so that you
can use the /usr/bin/tzselect command in shell scripts:
Asia/Shanghai
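This setting has to be applied on every host in the cluster, because job scheduling depends on accurate time. Alternatively, we can copy the zone file straight to the other hosts and overwrite theirs.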
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
[root@s166 azkaban]# scp /usr/share/zoneinfo/Asia/Shanghai root@s168:/etc/localtime
Shanghai 100% 388 500.8KB/s 00:00
[root@s166 azkaban]# scp /usr/share/zoneinfo/Asia/Shanghai root@s169:/etc/localtime
Shanghai
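Copying the zone file host by host gets repetitive as the cluster grows. Below is a minimal Python sketch that builds the same scp command for a list of hosts. The host names s168 and s169 come from the session above; the function names and the dry_run flag are my own illustration and not part of Azkaban.

```python
import subprocess

ZONE_FILE = "/usr/share/zoneinfo/Asia/Shanghai"


def build_scp_commands(hosts, user="root", zone_file=ZONE_FILE):
    """Return one scp argv list per host, mirroring the manual commands."""
    return [
        ["scp", zone_file, f"{user}@{host}:/etc/localtime"]
        for host in hosts
    ]


def sync_timezone(hosts, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_scp_commands(hosts):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: only prints the commands that would be executed.
    sync_timezone(["s168", "s169"])
```

Keeping the dry run as the default makes it easy to review the commands before anything on a remote host is overwritten.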
Edit azkaban.properties in the /webserver/conf directory (I had earlier renamed the extracted web server directory to webserver):
#Azkaban Personalization Settings
azkaban.name=Test
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=web/
default.timezone.id=Asia/Shanghai
#Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=conf/azkaban-users.xml
#Loader for projects
executor.global.properties=conf/global.properties
azkaban.project.dir=projects
database.type=mysql
mysql.port=3306
mysql.host=localhost
mysql.database=azkaban
mysql.user=root
mysql.password=root
mysql.numconnections=100
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
jetty.maxThreads=25
jetty.ssl.port=8443
jetty.port=8081
jetty.keystore=keystore
jetty.password=jiaoroot
jetty.keypassword=jiaoroot
jetty.truststore=keystore
jetty.trustpassword=jiaoroot
# Azkaban Executor settings
executor.port=12321
# mail settings
mail.sender=844072586@qq.com
mail.host=smtp.qq.com
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
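The main changes are the time zone, the MySQL settings, the SSL passwords and keystore path, and the mail settings. I have left out the comments; the settings are self-explanatory.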
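Since a missing or empty setting usually only shows up when the server fails to start, it can help to check the file first. The following is a rough Python sketch, not an Azkaban tool: the list of required keys is my own selection based on the settings mentioned above, and the parser handles only plain key=value lines, not the full Java properties syntax.

```python
# Keys chosen for illustration from the settings discussed above (an assumption,
# not an official list of mandatory Azkaban properties).
REQUIRED_KEYS = [
    "default.timezone.id",
    "mysql.host", "mysql.port", "mysql.database", "mysql.user", "mysql.password",
    "jetty.keystore", "jetty.password", "jetty.keypassword",
    "mail.sender", "mail.host",
]


def parse_properties(text):
    """Parse simple key=value lines; lines starting with # are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not props.get(key)]


if __name__ == "__main__":
    with open("conf/azkaban.properties", encoding="utf-8") as fh:
        print(missing_keys(parse_properties(fh.read())))
```

Only a leading # marks a comment here, so a value such as azkaban.color=#FF3601 is kept intact.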
Then edit azkaban-users.xml in the /conf/ directory:
<azkaban-users>
<user username="azkaban" password="azkaban" roles="admin" groups="azkaban" />
<user username="metrics" password="metrics" roles="metrics"/>
<user username="admin" password="admin" roles="admin" />
<role name="admin" permissions="ADMIN" />
<role name="metrics" permissions="METRICS"/>
</azkaban-users>
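A small mistake in this file, such as an unclosed element or a role that is never defined, is easy to miss by eye. Here is a hedged Python sketch that parses the file and reports users whose roles have no matching role element. It assumes roles are given as a comma-separated list in the roles attribute; the user ops and the role operator in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def undefined_roles(xml_text):
    """Return {username: [roles that have no matching <role> element]}."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    defined = {role.get("name") for role in root.findall("role")}
    problems = {}
    for user in root.findall("user"):
        roles = [r.strip() for r in user.get("roles", "").split(",") if r.strip()]
        missing = [r for r in roles if r not in defined]
        if missing:
            problems[user.get("username")] = missing
    return problems


# Hypothetical sample: "ops" refers to a role that is never defined.
SAMPLE = """<azkaban-users>
  <user username="admin" password="admin" roles="admin" />
  <user username="ops" password="ops" roles="operator" />
  <role name="admin" permissions="ADMIN" />
</azkaban-users>"""

if __name__ == "__main__":
    print(undefined_roles(SAMPLE))  # {'ops': ['operator']}
```

Because ET.fromstring rejects malformed input, a user element that is missing its closing slash is reported as a parse error before any role check runs.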
Edit azkaban.properties in the /executor/conf directory:
#Azkaban
default.timezone.id=Asia/Shanghai
# Azkaban JobTypes Plugins
azkaban.jobtype.plugin.dir=plugins/jobtypes
#Loader for projects
executor.global.properties=conf/global.properties
azkaban.project.dir=projects
database.type=mysql
mysql.port=3306
mysql.host=localhost
mysql.database=azkaban
mysql.user=root
mysql.password=root
mysql.numconnections=100
# Azkaban Executor settings
executor.maxThreads=50
executor.port=12321
executor.flow.threads=30
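The web server and the executor each have their own azkaban.properties, and several values appear in both (the time zone, the MySQL settings, executor.port). As a sketch under the assumption that these shared values are meant to be identical in this single-host setup, the following Python snippet compares them. The key list and function names are illustrative only.

```python
# Keys that appear in both files in this walkthrough (an illustrative selection).
SHARED_KEYS = [
    "default.timezone.id", "executor.port",
    "mysql.host", "mysql.port", "mysql.database", "mysql.user", "mysql.password",
]


def load_props(text):
    """Read plain key=value lines, skipping lines that start with #."""
    pairs = (
        line.split("=", 1)
        for line in text.splitlines()
        if "=" in line and not line.lstrip().startswith("#")
    )
    return {key.strip(): value.strip() for key, value in pairs}


def mismatches(web_text, exec_text, keys=SHARED_KEYS):
    """Return {key: (web_value, executor_value)} for every key that differs."""
    web, exe = load_props(web_text), load_props(exec_text)
    return {k: (web.get(k), exe.get(k)) for k in keys if web.get(k) != exe.get(k)}


if __name__ == "__main__":
    with open("webserver/conf/azkaban.properties", encoding="utf-8") as w, \
         open("executor/conf/azkaban.properties", encoding="utf-8") as e:
        print(mismatches(w.read(), e.read()))
```

An empty result means the two files agree on every key in the list; the paths in the main block follow the directory names used in this post.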
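From the webserver directory, run the start script to launch the web server:
[root@s166 webserver]# nohup bin/azkaban-web-start.sh 1>/tmp/azstd.out 2>/tmp/azerr.out &
Tip: do not rush to use nohup. With nohup, errors are not reported back to you right away, so get the script running in the foreground first and only then switch to nohup.
[root@s166 webserver]# bin/azkaban-web-start.sh
I ran into a few errors at this step.
From the executor directory, start the executor server the same way:
[root@s166 executor]# bin/azkaban-executor-start.sh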
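Then open https://s166:8443/ in a browser.
If the page comes up unstyled, the server was started from the wrong place: the start script was run from inside the bin directory out of habit rather than from the server's root directory, so much of the CSS fails to load.
Log in with the username and password you configured.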
vim command.job
#command.job
type=command
command=echo fantj666
Package the job file into a zip archive:
zip command.zip command.job
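If a project holds several job files, packaging can be scripted. This is a minimal sketch using Python's standard zipfile module; the function name package_jobs and the archive name are my own choices, and it only collects *.job files, so any script a job refers to would have to be added separately.

```python
import zipfile
from pathlib import Path


def package_jobs(job_dir, archive_path):
    """Zip every *.job file in job_dir at the archive root; return their names."""
    job_files = sorted(Path(job_dir).glob("*.job"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for job in job_files:
            # arcname keeps the file at the top level of the archive.
            zf.write(job, arcname=job.name)
    return [job.name for job in job_files]


if __name__ == "__main__":
    print(package_jobs(".", "command.zip"))
```

The resulting archive can then be uploaded through the web console in the same way as one built with the zip command.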
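Through Azkaban's web console, create a project and upload the job archive. Start by creating the project.
A flow can also chain several jobs. The first job is foo.job: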
# foo.job
type=command
command=echo foo
The second job, bar.job, depends on foo.job:
# bar.job
type=command
dependencies=foo
command=echo bar
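To make the effect of the dependencies line concrete, here is an illustrative Python sketch that reads job definitions and works out an order in which every job comes after the jobs it depends on. It is not Azkaban's own scheduler; it assumes dependencies is a comma-separated list of job names, and the helper names are hypothetical.

```python
def parse_job(text):
    """Parse a .job file's key=value lines, ignoring # comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def run_order(jobs):
    """jobs maps job name -> .job file text; returns names, dependencies first."""
    deps = {
        name: [d.strip() for d in parse_job(text).get("dependencies", "").split(",") if d.strip()]
        for name, text in jobs.items()
    }
    order, state = [], {}

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"dependency cycle at {name}")
        if name not in deps:
            raise ValueError(f"unknown job {name}")
        state[name] = "visiting"
        for dep in deps[name]:
            visit(dep)  # depth-first: dependencies are appended before the job
        state[name] = "done"
        order.append(name)

    for name in sorted(deps):
        visit(name)
    return order


if __name__ == "__main__":
    jobs = {
        "foo": "# foo.job\ntype=command\ncommand=echo foo",
        "bar": "# bar.job\ntype=command\ndependencies=foo\ncommand=echo bar",
    }
    print(run_order(jobs))  # ['foo', 'bar']
```

The same check also flags a dependency on a job that does not exist or a cycle between jobs, both of which are easier to fix before uploading.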
vim fs.job
# fs.job
type=command
command=/home/fantj/hadoop/bin/hadoop fs -lsr /
Hive script test.sql:
use default;
drop table aztest;
create table aztest(id int,name string,age int) row format delimited fields terminated by ',' ;
load data inpath '/aztest/hiveinput' into table aztest;
create table azres as select * from aztest;
insert overwrite directory '/aztest/hiveoutput' select count(1) from aztest;
Job file hivef.job:
# hivef.job
type=command
command=/home/fantj/hive/bin/hive -f 'test.sql'
Then package it as a zip, upload it, run it, and check the log.