- Introduction from the official website
Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.
Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.
Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).
Oozie is a scalable, reliable and extensible system.
Developers interested in getting more involved with Oozie may join the mailing lists, report bugs, retrieve code from the version control system, and make contributions.
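1. What it is
In day-to-day use Oozie handles scheduled jobs, and it can run shell scripts, Hive scripts, Sqoop jobs and the like one after another.
2. How to use it
As usual, before using it you need to know how it is configured.
Before carrying out the steps below, make sure HDFS is running, Maven is installed, and MySQL is running.
The configuration splits into pre-build steps and post-build steps. The pre-build steps are done in the directory where you unpacked Oozie; the post-build steps are done in the directory that the build produces.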
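The prerequisites above can be checked before starting. The sketch below is an illustration only: `missing_commands` is a hypothetical helper, not part of Oozie or Hadoop, and it only reports which commands are absent from PATH. It does not prove that HDFS or MySQL is actually running; for that you would still run something like `hdfs dfsadmin -report` or `mysqladmin ping` by hand.

```shell
# Hypothetical helper: print each command that is not found on PATH.
# Returns 0 when every command is present, 1 otherwise.
missing_commands() {
  local status=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}
```

A possible call is `missing_commands hdfs mvn mysql || echo "install the tools listed above first"`.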
Pre-build steps
- Download the package
http://archive.apache.org/dist/oozie/5.2.0/
List the files in the unpacked Oozie directory; mine is at /opt/oozie-5.2.0.
[root@wq1 oozie-5.2.0]# ll
total 340
drwxr-xr-x. 2 502 wheel 4096 Mar 30 20:41 bin
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 client
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 core
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 distro
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 docs
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 examples
drwxr-xr-x. 4 502 wheel 68 Mar 30 20:41 fluent-job
drwxr-xr-x. 2 root root 8192 Mar 30 21:02 libext
-rw-r--r--. 1 502 wheel 37664 Nov 7 2019 LICENSE.txt
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 minitest
-rw-r--r--. 1 502 wheel 450 Nov 7 2019 NOTICE.txt
-rw-r--r--. 1 502 wheel 111994 Nov 7 2019 pom.xml
-rw-r--r--. 1 502 wheel 3048 Nov 7 2019 README.md
-rw-r--r--. 1 502 wheel 160311 Nov 7 2019 release-log.txt
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 server
drwxr-xr-x. 13 502 wheel 165 Mar 30 20:41 sharelib
-rw-r--r--. 1 502 wheel 3520 Nov 7 2019 source-headers.txt
drwxr-xr-x. 3 502 wheel 18 Nov 7 2019 src
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 tools
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 webapp
drwxr-xr-x. 3 502 wheel 32 Mar 30 20:41 zookeeper-security-tests
- The Oozie directory is missing some JARs, so copy the JARs from the Hadoop installation into a newly created libext folder under the Oozie directory.
[root@wq1 hdfs]# ll
total 11668
-rw-r--r--. 1 wuqi2020 ftp 8322603 Jul 19 2018 hadoop-hdfs-2.7.7.jar
-rw-r--r--. 1 wuqi2020 ftp 3491623 Jul 19 2018 hadoop-hdfs-2.7.7-tests.jar
-rw-r--r--. 1 wuqi2020 ftp 126078 Jul 19 2018 hadoop-hdfs-nfs-2.7.7.jar
drwxr-xr-x. 2 wuqi2020 ftp 149 Jul 19 2018 jdiff
drwxr-xr-x. 2 wuqi2020 ftp 4096 Jul 19 2018 lib
drwxr-xr-x. 2 wuqi2020 ftp 85 Jul 19 2018 sources
drwxr-xr-x. 2 wuqi2020 ftp 27 Jul 19 2018 templates
drwxr-xr-x. 8 wuqi2020 ftp 92 Jul 19 2018 webapps
[root@wq1 oozie-5.2.0]# cp $HADOOP_HOME/share/hadoop/hdfs/*.jar libext
[root@wq1 oozie-5.2.0]# cp $HADOOP_HOME/share/hadoop/hdfs/lib/*.jar libext
Two of the copied JARs conflict with Oozie, so move them out of the way (here by renaming them to .bak):
[root@wq1 libext]# mv servlet-api-2.5.jar servlet-api-2.5.jar.bak
[root@wq1 libext]# mv jsp-api-2.1.jar jsp-api-2.1.jar.bak
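The copy and rename steps above can be wrapped in two small functions so they are easy to repeat. This is only a sketch: `stage_jars` and `sideline_jars` are hypothetical names introduced here, not commands shipped with Oozie or Hadoop.

```shell
# Copy every *.jar from the given source directories into a target directory.
# Usage: stage_jars TARGET SRC_DIR...
stage_jars() {
  local target="$1"
  shift
  mkdir -p "$target"
  local src
  local jar
  for src in "$@"; do
    for jar in "$src"/*.jar; do
      # An unmatched glob stays literal, so skip anything that does not exist.
      if [ -e "$jar" ]; then
        cp "$jar" "$target"/
      fi
    done
  done
}

# Rename the named JARs inside TARGET to *.bak so they are no longer loaded.
# Usage: sideline_jars TARGET JAR_NAME...
sideline_jars() {
  local target="$1"
  shift
  local name
  for name in "$@"; do
    if [ -e "$target/$name" ]; then
      mv "$target/$name" "$target/$name.bak"
    fi
  done
}
```

With the paths used in this walkthrough, the calls would look like `stage_jars libext "$HADOOP_HOME/share/hadoop/hdfs" "$HADOOP_HOME/share/hadoop/hdfs/lib"` followed by `sideline_jars libext servlet-api-2.5.jar jsp-api-2.1.jar`.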
ext-2.2.zip (the ExtJS library that the Oozie web console depends on) is also required; download it and put it in libext.
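- After installing Maven, download the following three JARs and put them in the lib directory of the Maven installation (so that the Oozie build succeeds)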
doxia-core-1.0-alpha-9.jar
doxia-module-twiki-1.0-alpha-9.jar
# So far I have not managed to download the one below
pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
vim pom.xml
Before building, set the Hadoop version in pom.xml; my Hadoop version is 2.7.7.
<hadoop.version>2.7.7</hadoop.version>
<hadoop.majorversion>2</hadoop.majorversion>
<hadooplib.version>hadoop-${hadoop.majorversion}-${project.version}</hadooplib.version>
<hbase.version>1.2.3</hbase.version>
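Instead of editing pom.xml by hand, the version property can be set from the command line. The function below is a sketch under assumptions: `set_pom_property` is a hypothetical helper, it expects the element to sit on a single line as in the snippet above, and it does not handle values that contain `|` or `&`.

```shell
# Hypothetical helper: set <NAME>...</NAME> to VALUE in a pom.xml.
# Assumes the whole element is on one line.
# Usage: set_pom_property POM NAME VALUE
set_pom_property() {
  local pom="$1"
  local name="$2"
  local value="$3"
  # Write to a temp file and move it into place, so no sed -i dialect is needed.
  sed "s|<$name>[^<]*</$name>|<$name>$value</$name>|" "$pom" > "$pom.tmp" &&
    mv "$pom.tmp" "$pom"
}
```

For this setup the call would be `set_pom_property pom.xml hadoop.version 2.7.7`, run from /opt/oozie-5.2.0.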
Post-build steps
From the bin directory, run mkdistro.sh -DskipTests -Dhadoop.version=2.7.7 -Puber
[root@wq1 bin]# mkdistro.sh -DskipTests -Dhadoop.version=2.7.7 -Puber
[INFO] Reactor Summary for Apache Oozie Main 5.2.0:
[INFO]
[INFO] Apache Oozie Main .................................. SUCCESS [ 39.971 s]
[INFO] Apache Oozie Fluent Job ............................ SUCCESS [ 0.209 s]
[INFO] Apache Oozie Fluent Job API ........................ SUCCESS [01:15 min]
[INFO] Apache Oozie Client ................................ SUCCESS [ 40.126 s]
[INFO] Apache Oozie Share Lib Oozie ....................... SUCCESS [ 16.992 s]
[INFO] Apache Oozie Share Lib HCatalog .................... SUCCESS [01:38 min]
[INFO] Apache Oozie Share Lib Distcp ...................... SUCCESS [ 3.100 s]
[INFO] Apache Oozie Core .................................. SUCCESS [06:17 min]
[INFO] Apache Oozie Share Lib Streaming ................... SUCCESS [ 10.941 s]
[INFO] Apache Oozie Share Lib Pig ......................... SUCCESS [03:52 min]
[INFO] Apache Oozie Share Lib Git ......................... SUCCESS [01:23 min]
[INFO] Apache Oozie Share Lib Hive ........................ SUCCESS [01:43 min]
[INFO] Apache Oozie Share Lib Hive 2 ...................... SUCCESS [ 9.205 s]
[INFO] Apache Oozie Share Lib Sqoop ....................... SUCCESS [ 10.110 s]
[INFO] Apache Oozie Examples .............................. SUCCESS [02:09 min]
[INFO] Apache Oozie Share Lib Spark ....................... SUCCESS [03:56 min]
[INFO] Apache Oozie Share Lib ............................. SUCCESS [ 22.054 s]
[INFO] Apache Oozie Docs .................................. SUCCESS [ 28.874 s]
[INFO] Apache Oozie WebApp ................................ SUCCESS [ 34.479 s]
[INFO] Apache Oozie Tools ................................. SUCCESS [ 5.971 s]
[INFO] Apache Oozie MiniOozie ............................. SUCCESS [ 5.130 s]
[INFO] Apache Oozie Fluent Job Client ..................... SUCCESS [ 4.530 s]
[INFO] Apache Oozie Server ................................ SUCCESS [ 58.123 s]
[INFO] Apache Oozie Distro ................................ SUCCESS [ 50.183 s]
[INFO] Apache Oozie ZooKeeper Security Tests .............. SUCCESS [01:11 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 29:33 min
[INFO] Finished at: 2021-04-02T14:10:16+08:00
[INFO] ------------------------------------------------------------------------
Oozie distro created, DATE[2021.04.02-05:39:38GMT] VC-REV[unavailable], available at [/opt/oozie-5.2.0/distro/target]
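After a successful build, /opt/oozie-5.2.0/distro/target contains oozie-5.2.0-distro.tar.gz. Unpack that archive and rename the resulting directory to oozieDistro-5.2.0; all the following steps are carried out inside that directory.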
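The unpack and rename can be done in one step. The sketch below assumes the archive contains a single top-level directory (which is then dropped via `--strip-components=1`); `unpack_distro` is a hypothetical helper name.

```shell
# Unpack TARBALL into DEST, dropping the archive's single top-level directory
# so that DEST itself becomes the renamed distro directory.
# Usage: unpack_distro TARBALL DEST
unpack_distro() {
  local tarball="$1"
  local dest="$2"
  mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" --strip-components=1
}
```

With the paths from this walkthrough, that would be `unpack_distro /opt/oozie-5.2.0/distro/target/oozie-5.2.0-distro.tar.gz /opt/oozie-5.2.0/oozieDistro-5.2.0`.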
The directory after the build:
[root@wq1 oozieDistro-5.2.0]# ll
total 349132
drwxr-xr-x. 2 root root 282 Apr 2 17:43 bin
drwxr-xr-x. 4 502 wheel 188 Apr 2 16:47 conf
-rw-r--r--. 1 root root 7101 Apr 2 14:06 docs.zip
drwxr-xr-x. 4 root root 85 Apr 2 17:42 embedded-oozie-server
lrwxrwxrwx. 1 root root 75 Apr 2 15:27 lib -> /opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib
drwxr-xr-x. 2 root root 8192 Mar 31 13:31 libext
drwxr-xr-x. 2 root root 57 Apr 2 14:30 libtools
-rw-r--r--. 1 502 wheel 37664 Nov 7 2019 LICENSE.txt
drwxr-xr-x. 2 root root 4096 Apr 2 17:02 logs
-rw-r--r--. 1 502 wheel 450 Nov 7 2019 NOTICE.txt
-rw-r--r--. 1 root root 30459423 Apr 2 13:43 oozie-client-5.2.0.tar.gz
drwxr-xr-x. 2 root root 68 Apr 2 14:29 oozie-core
-rw-r--r--. 1 root root 738795 Apr 2 14:01 oozie-examples.tar.gz
-rw-r--r--. 1 root root 326066688 Apr 2 14:05 oozie-sharelib-5.2.0.tar.gz
-rw-r--r--. 1 502 wheel 3048 Nov 7 2019 README.md
-rw-r--r--. 1 502 wheel 160311 Nov 7 2019 release-log.txt
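- Edit the Oozie site configuration
vim conf/oozie-site.xml
The settings below cover the database connection, the HDFS connection, and the location of the shared JARs. If any of them does not match your environment, Oozie fails at runtime with an obvious error; reading the generated logs is enough to track it down.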
<configuration>
<property>
<name>oozie.service.JPAService.create.db.schema</name>
<value>true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://wq1:3306/oozie?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>specialwu</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>specialWu7</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/opt/hadoop-2.7.7/etc/hadoop</value>
</property>
<property>
<name>oozie.service.WorkflowAppService.system.libpath</name>
<value>hdfs://wq1:9000/user/root/share/lib</value>
</property>
<property>
<name>oozie.subworkflow.classpath.inheritance</name>
<value>true</value>
</property>
</configuration>
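Since a mismatch in these values only shows up at runtime, it can help to read a setting back and compare it with the real environment. The function below is a sketch: `get_property` is a hypothetical helper, and it assumes each `<name>` and `<value>` element sits on its own line, as in the listing above.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name>
# in a Hadoop-style XML config file.
# Usage: get_property FILE NAME
get_property() {
  awk -v want="$2" '
    /<name>/ {
      line = $0
      gsub(/.*<name>|<\/name>.*/, "", line)   # keep only the property name
      found = (line == want)
    }
    /<value>/ && found {
      line = $0
      gsub(/.*<value>|<\/value>.*/, "", line) # keep only the property value
      print line
      exit
    }
  ' "$1"
}
```

For example, `get_property conf/oozie-site.xml oozie.service.WorkflowAppService.system.libpath` prints the configured share lib path, which can then be checked with `hdfs dfs -ls`.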
- Unpack oozie-examples.tar.gz and oozie-sharelib-5.2.0.tar.gz, then upload the unpacked directories to /user/root on the HDFS cluster
hdfs dfs -put examples /user/root
hdfs dfs -put share /user/root
vim /opt/hadoop-2.7.7/etc/hadoop/core-site.xml
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
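The `root` in these property names is the account that runs the Oozie server; Hadoop needs it listed as a proxy user so that Oozie can submit jobs on behalf of others. If Oozie runs under a different account, the names change accordingly. A small sketch that prints the pair of properties for a given user (`proxyuser_properties` is a hypothetical helper, and the wildcard values simply mirror the snippet above):

```shell
# Print the two proxyuser properties for USER, with "*" for hosts and groups.
# USER should be the account that runs the Oozie server.
# Usage: proxyuser_properties USER
proxyuser_properties() {
  local user="$1"
  local kind
  for kind in hosts groups; do
    printf '<property>\n<name>hadoop.proxyuser.%s.%s</name>\n<value>*</value>\n</property>\n' \
      "$user" "$kind"
  done
}
```

Calling `proxyuser_properties root` reproduces the block shown above; the output still has to be pasted into core-site.xml by hand.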
vim /opt/hadoop-2.7.7/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
- vim /opt/hadoop-2.7.7/etc/hadoop/yarn-site.xml
<property>
<name>yarn.resourcemanager.address</name>
<value>wq1:8032</value>
</property>
- Start the JobHistory server
mr-jobhistory-daemon.sh start historyserver
[root@wq1 hadoop]# jps
2960 NameNode
3108 DataNode
20086 Jps
6807 RunJar
3982 RunJar
20046 JobHistoryServer
3295 SecondaryNameNode
- Start Oozie
[root@wq1 oozieDistro-5.2.0]# oozie-start.sh
DONE
Creating composite indexes
DONE
Create OOZIE_SYS table
DONE
Oozie DB has been created for Oozie version '5.2.0'
The SQL commands have been written to: /tmp/ooziedb-7098635515234390519.sql
[root@wq1 bin]# oozie admin --oozie http://localhost:11000/oozie -status
System mode: NORMAL
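The server can take a moment to come up after oozie-start.sh, so a script may want to wait for the `System mode: NORMAL` line before submitting anything. A minimal sketch, assuming the `oozie` client is on PATH; `wait_for_oozie` and its default of 10 attempts with a 3 second pause are choices made here, not Oozie settings.

```shell
# Poll the Oozie server until it reports "System mode: NORMAL".
# Returns 0 once the server is ready, 1 if it never gets there.
# Usage: wait_for_oozie URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_oozie() {
  local url="$1"
  local attempts="${2:-10}"
  local delay="${3:-3}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if oozie admin -oozie "$url" -status 2>/dev/null | grep -q 'System mode: NORMAL'; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A possible call is `wait_for_oozie http://localhost:11000/oozie || echo "Oozie did not reach NORMAL mode"`.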
- Example files
oozie-examples.tar.gz
unpacks to a directory named examples
[root@wq1 map-reduce]# pwd
/opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce
[root@wq1 map-reduce]# ll
total 16
-rw-r--r--. 1 502 wheel 1011 Mar 31 14:16 job.properties
-rw-r--r--. 1 502 wheel 1033 Nov 7 2019 job-with-config-class.properties
drwxr-xr-x. 2 root root 38 Apr 2 15:37 lib
-rw-r--r--. 1 502 wheel 2289 Nov 7 2019 workflow-with-config-class.xml
-rw-r--r--. 1 502 wheel 2574 Nov 7 2019 workflow.xml
vim job.properties
Make sure the values below match your Hadoop configuration.
nameNode=hdfs://wq1:9000
resourceManager=localhost:8032
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/workflow.xml
outputDir=map-reduce
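To compare these values with the Hadoop side without opening the file, a key can be read back from job.properties. This is a sketch only: `prop_value` is a hypothetical helper that handles plain `key=value` lines and does not expand `${...}` references.

```shell
# Print the value of KEY from a Java-style .properties file.
# The first "KEY=value" line wins; comments and other keys are ignored.
# Usage: prop_value FILE KEY
prop_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}
```

For instance, the output of `prop_value job.properties nameNode` should agree with what `hdfs getconf -confKey fs.defaultFS` reports.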
- Run a job
oozie job -oozie http://wq1:11000/oozie -config /opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce/job.properties -run
[root@wq1 bin]# pwd
/opt/oozie-5.2.0/oozieDistro-5.2.0/bin
[root@wq1 bin]# oozie job -oozie http://wq1:11000/oozie -config /opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce/job.properties -run
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/libext/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
job: 0000000-210402153013353-oozie-root-W
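The job ID on the last line is what the later `-info` call needs, so a script can capture it instead of copying it by hand. A small sketch that relies only on the `job: <id>` line shown above (`extract_job_id` is a hypothetical helper):

```shell
# Read `oozie job ... -run` output on stdin and print only the job ID,
# skipping any other lines such as SLF4J messages.
extract_job_id() {
  sed -n 's/^job: //p' | head -n 1
}
```

Usage could look like `job_id=$(oozie job -oozie http://wq1:11000/oozie -config job.properties -run | extract_job_id)`.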
Check the job's run status
oozie job -oozie http://wq1:11000/oozie -info 0000000-210402153013353-oozie-root-W
[root@wq1 bin]# oozie job -oozie http://wq1:11000/oozie -info 0000000-210402174239217-oozie-root-W
Job ID : 0000000-210402174239217-oozie-root-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path : hdfs://wq1:9000/user/root/examples/apps/map-reduce/workflow.xml
Status : RUNNING
Run : 0
User : root
Group : -
Created : 2021-04-02 09:45 GMT
Started : 2021-04-02 09:45 GMT
Last Modified : 2021-04-02 09:45 GMT
Ended : -
CoordAction ID: -
Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000000-210402174239217-oozie-root-W@mr-node PREP - - -
------------------------------------------------------------------------------------------------------------------------------------
0000000-210402174239217-oozie-root-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
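To follow a job from a script, the overall state can be pulled out of this report. The sketch below assumes the header keeps the `Status : <STATE>` shape shown above; `job_status` is a hypothetical helper, not an Oozie command.

```shell
# Read `oozie job ... -info` output on stdin and print the workflow status
# (for example RUNNING), taken from the "Status : ..." header line.
job_status() {
  awk -F' *: *' '/^Status/ { sub(/ +$/, "", $2); print $2; exit }'
}
```

For example, `oozie job -oozie http://wq1:11000/oozie -info "$job_id" | job_status` could be called in a loop until the result is no longer RUNNING, where `job_id` holds the ID printed at submission.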