• Introduction from the official website

Oozie is a workflow scheduler system to manage Apache Hadoop jobs.

Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.

Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.

Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).

Oozie is a scalable, reliable and extensible system.

Developers interested in getting more involved with Oozie may join the mailing lists, report bugs, retrieve code from the version control system, and make contributions.

1. What it is

In day-to-day use, Oozie handles scheduled jobs and can run shell scripts, Hive scripts, Sqoop jobs, and similar tasks in sequence.

2. How to use it

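As usual, before using it you need to know how to configure it.
Before carrying out the steps below, make sure HDFS is running, Maven is installed, and MySQL is running.
The setup splits into pre-build steps and post-build steps. The pre-build steps are carried out in the directory where you unpacked Oozie, and the post-build steps in the directory generated by the build.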

Pre-build steps

  • Download the release package
http://archive.apache.org/dist/oozie/5.2.0/

List the files in the unpacked Oozie directory. Mine is at /opt/oozie-5.2.0.

[root@wq1 oozie-5.2.0]# ll
总用量 340
drwxr-xr-x.  2  502 wheel   4096 3月  30 20:41 bin
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 client
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 core
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 distro
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 docs
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 examples
drwxr-xr-x.  4  502 wheel     68 3月  30 20:41 fluent-job
drwxr-xr-x.  2 root root    8192 3月  30 21:02 libext
-rw-r--r--.  1  502 wheel  37664 11月  7 2019 LICENSE.txt
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 minitest
-rw-r--r--.  1  502 wheel    450 11月  7 2019 NOTICE.txt
-rw-r--r--.  1  502 wheel 111994 11月  7 2019 pom.xml
-rw-r--r--.  1  502 wheel   3048 11月  7 2019 README.md
-rw-r--r--.  1  502 wheel 160311 11月  7 2019 release-log.txt
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 server
drwxr-xr-x. 13  502 wheel    165 3月  30 20:41 sharelib
-rw-r--r--.  1  502 wheel   3520 11月  7 2019 source-headers.txt
drwxr-xr-x.  3  502 wheel     18 11月  7 2019 src
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 tools
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 webapp
drwxr-xr-x.  3  502 wheel     32 3月  30 20:41 zookeeper-security-tests
  • The Oozie directory lacks some required jars, so create a libext folder under it and copy the jars from the Hadoop installation directory into it
[root@wq1 hdfs]# ll
总用量 11668
-rw-r--r--. 1 wuqi2020 ftp 8322603 7月  19 2018 hadoop-hdfs-2.7.7.jar
-rw-r--r--. 1 wuqi2020 ftp 3491623 7月  19 2018 hadoop-hdfs-2.7.7-tests.jar
-rw-r--r--. 1 wuqi2020 ftp  126078 7月  19 2018 hadoop-hdfs-nfs-2.7.7.jar
drwxr-xr-x. 2 wuqi2020 ftp     149 7月  19 2018 jdiff
drwxr-xr-x. 2 wuqi2020 ftp    4096 7月  19 2018 lib
drwxr-xr-x. 2 wuqi2020 ftp      85 7月  19 2018 sources
drwxr-xr-x. 2 wuqi2020 ftp      27 7月  19 2018 templates
drwxr-xr-x. 8 wuqi2020 ftp      92 7月  19 2018 webapps
[root@wq1 oozie-5.2.0]# cp $HADOOP_HOME/share/hadoop/hdfs/*.jar libext
[root@wq1 oozie-5.2.0]# cp $HADOOP_HOME/share/hadoop/hdfs/lib/*.jar libext

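Two of the copied jars cause conflicts, so take them out of use. The commands below rename them rather than deleting them: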

[root@wq1 libext]# mv servlet-api-2.5.jar servlet-api-2.5.jar.bak
[root@wq1 libext]# mv jsp-api-2.1.jar jsp-api-2.1.jar.bak
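
The copy-and-rename steps above can be collected into one small function. This is only a sketch: stage_hadoop_jars is a hypothetical helper name (not part of Oozie or Hadoop), and it assumes the Hadoop layout shown in the listing above.

```shell
# Sketch: stage Hadoop's HDFS jars into Oozie's libext directory and set
# aside the two conflicting jars. stage_hadoop_jars is a hypothetical helper.
stage_hadoop_jars() {
    local hadoop_home="$1" oozie_src="$2"
    local libext="$oozie_src/libext"
    mkdir -p "$libext"
    # Copy the HDFS jars and their dependencies, as in the two cp commands.
    cp "$hadoop_home"/share/hadoop/hdfs/*.jar "$libext"/
    cp "$hadoop_home"/share/hadoop/hdfs/lib/*.jar "$libext"/
    # Rename (rather than delete) the conflicting jars so the step can be undone.
    local jar
    for jar in servlet-api-2.5.jar jsp-api-2.1.jar; do
        if [ -f "$libext/$jar" ]; then
            mv "$libext/$jar" "$libext/$jar.bak"
        fi
    done
}

# Example call with the paths used in this walkthrough:
# stage_hadoop_jars /opt/hadoop-2.7.7 /opt/oozie-5.2.0
```

Renaming to .bak keeps the originals next to the staged jars, so a wrong guess about which jar conflicts is easy to reverse.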

Oozie also needs ext-2.2.zip (the ExtJS 2.2 library used by the Oozie web console). Download it and put it in libext.

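  • After installing Maven, download the following three jars and put them in the lib directory of the Maven installation (so that the Oozie build succeeds)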
doxia-core-1.0-alpha-9.jar
doxia-module-twiki-1.0-alpha-9.jar
# I have not managed to download the one below so far
pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
  • vim pom.xml

Before building, set the Hadoop version. Mine is 2.7.7.

<hadoop.version>2.7.7</hadoop.version>
<hadoop.majorversion>2</hadoop.majorversion>
<hadooplib.version>hadoop-${hadoop.majorversion}-${project.version}</hadooplib.version>
<hbase.version>1.2.3</hbase.version>
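
If you prefer not to edit pom.xml by hand, the same change can be scripted. The following is a minimal sketch, assuming the two properties already exist in pom.xml on lines of their own; set_hadoop_version is a hypothetical helper name.

```shell
# Sketch: rewrite <hadoop.version> and <hadoop.majorversion> in a pom.xml.
# set_hadoop_version is a hypothetical helper, not an Oozie or Maven tool.
set_hadoop_version() {
    local pom="$1" version="$2"
    # The major version is everything before the first dot, e.g. 2 for 2.7.7.
    local major="${version%%.*}"
    # -i.bak edits in place and keeps a backup copy next to the file.
    sed -i.bak \
        -e "s|<hadoop.version>[^<]*</hadoop.version>|<hadoop.version>${version}</hadoop.version>|" \
        -e "s|<hadoop.majorversion>[^<]*</hadoop.majorversion>|<hadoop.majorversion>${major}</hadoop.majorversion>|" \
        "$pom"
}

# Example call with the values used in this walkthrough:
# set_hadoop_version /opt/oozie-5.2.0/pom.xml 2.7.7
```

The backup file (pom.xml.bak) makes it easy to diff the result against the original before starting a half-hour build.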

Build and post-build steps

In the bin directory, run mkdistro.sh -DskipTests -Dhadoop.version=2.7.7 -Puber

[root@wq1 bin]# mkdistro.sh -DskipTests -Dhadoop.version=2.7.7 -Puber
[INFO] Reactor Summary for Apache Oozie Main 5.2.0:
[INFO] 
[INFO] Apache Oozie Main .................................. SUCCESS [ 39.971 s]
[INFO] Apache Oozie Fluent Job ............................ SUCCESS [  0.209 s]
[INFO] Apache Oozie Fluent Job API ........................ SUCCESS [01:15 min]
[INFO] Apache Oozie Client ................................ SUCCESS [ 40.126 s]
[INFO] Apache Oozie Share Lib Oozie ....................... SUCCESS [ 16.992 s]
[INFO] Apache Oozie Share Lib HCatalog .................... SUCCESS [01:38 min]
[INFO] Apache Oozie Share Lib Distcp ...................... SUCCESS [  3.100 s]
[INFO] Apache Oozie Core .................................. SUCCESS [06:17 min]
[INFO] Apache Oozie Share Lib Streaming ................... SUCCESS [ 10.941 s]
[INFO] Apache Oozie Share Lib Pig ......................... SUCCESS [03:52 min]
[INFO] Apache Oozie Share Lib Git ......................... SUCCESS [01:23 min]
[INFO] Apache Oozie Share Lib Hive ........................ SUCCESS [01:43 min]
[INFO] Apache Oozie Share Lib Hive 2 ...................... SUCCESS [  9.205 s]
[INFO] Apache Oozie Share Lib Sqoop ....................... SUCCESS [ 10.110 s]
[INFO] Apache Oozie Examples .............................. SUCCESS [02:09 min]
[INFO] Apache Oozie Share Lib Spark ....................... SUCCESS [03:56 min]
[INFO] Apache Oozie Share Lib ............................. SUCCESS [ 22.054 s]
[INFO] Apache Oozie Docs .................................. SUCCESS [ 28.874 s]
[INFO] Apache Oozie WebApp ................................ SUCCESS [ 34.479 s]
[INFO] Apache Oozie Tools ................................. SUCCESS [  5.971 s]
[INFO] Apache Oozie MiniOozie ............................. SUCCESS [  5.130 s]
[INFO] Apache Oozie Fluent Job Client ..................... SUCCESS [  4.530 s]
[INFO] Apache Oozie Server ................................ SUCCESS [ 58.123 s]
[INFO] Apache Oozie Distro ................................ SUCCESS [ 50.183 s]
[INFO] Apache Oozie ZooKeeper Security Tests .............. SUCCESS [01:11 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  29:33 min
[INFO] Finished at: 2021-04-02T14:10:16+08:00
[INFO] ------------------------------------------------------------------------

Oozie distro created, DATE[2021.04.02-05:39:38GMT] VC-REV[unavailable], available at [/opt/oozie-5.2.0/distro/target]

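After a successful build, /opt/oozie-5.2.0/distro/target contains oozie-5.2.0-distro.tar.gz. Unpack the archive and rename the resulting directory to oozieDistro-5.2.0; all later steps are carried out inside it.
The directory after the build: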

[root@wq1 oozieDistro-5.2.0]# ll
总用量 349132
drwxr-xr-x. 2 root root        282 4月   2 17:43 bin
drwxr-xr-x. 4  502 wheel       188 4月   2 16:47 conf
-rw-r--r--. 1 root root       7101 4月   2 14:06 docs.zip
drwxr-xr-x. 4 root root         85 4月   2 17:42 embedded-oozie-server
lrwxrwxrwx. 1 root root         75 4月   2 15:27 lib -> /opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib
drwxr-xr-x. 2 root root       8192 3月  31 13:31 libext
drwxr-xr-x. 2 root root         57 4月   2 14:30 libtools
-rw-r--r--. 1  502 wheel     37664 11月  7 2019 LICENSE.txt
drwxr-xr-x. 2 root root       4096 4月   2 17:02 logs
-rw-r--r--. 1  502 wheel       450 11月  7 2019 NOTICE.txt
-rw-r--r--. 1 root root   30459423 4月   2 13:43 oozie-client-5.2.0.tar.gz
drwxr-xr-x. 2 root root         68 4月   2 14:29 oozie-core
-rw-r--r--. 1 root root     738795 4月   2 14:01 oozie-examples.tar.gz
-rw-r--r--. 1 root root  326066688 4月   2 14:05 oozie-sharelib-5.2.0.tar.gz
-rw-r--r--. 1  502 wheel      3048 11月  7 2019 README.md
-rw-r--r--. 1  502 wheel    160311 11月  7 2019 release-log.txt
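
The unpack-and-rename step that produced this directory can be sketched as follows. unpack_distro is a hypothetical helper name, and the sketch assumes the archive contains a single top-level directory, which it discovers rather than hard-codes.

```shell
# Sketch: unpack a distro tarball and rename its top-level directory.
# unpack_distro is a hypothetical helper; it assumes the archive has exactly
# one top-level directory.
unpack_distro() {
    local tarball="$1" dest_parent="$2" new_name="$3"
    local top
    # Read the first archive entry and keep its first path component.
    top=$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)
    tar -xzf "$tarball" -C "$dest_parent"
    mv "$dest_parent/$top" "$dest_parent/$new_name"
}

# Example call with the paths used in this walkthrough:
# unpack_distro /opt/oozie-5.2.0/distro/target/oozie-5.2.0-distro.tar.gz \
#     /opt/oozie-5.2.0 oozieDistro-5.2.0
```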
  • Edit the configuration: vim conf/oozie-site.xml

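The settings below cover the database connection, the HDFS connection, and the location of the jars. If any of them does not match your environment, Oozie reports a clear error at run time, and reading the generated logs is usually enough to resolve it.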

<configuration>
<property>  
    <name>oozie.service.JPAService.create.db.schema</name>  
    <value>true</value>  
</property>  
<property>  
    <name>oozie.service.JPAService.jdbc.driver</name>  
    <value>com.mysql.jdbc.Driver</value>  
</property>  
<property>  
    <name>oozie.service.JPAService.jdbc.url</name>  
    <value>jdbc:mysql://wq1:3306/oozie?createDatabaseIfNotExist=true</value>  
</property>  
<property>  
    <name>oozie.service.JPAService.jdbc.username</name>  
    <value>specialwu</value>  
</property>  
<property>  
    <name>oozie.service.JPAService.jdbc.password</name>  
    <value>specialWu7</value>  
</property>  
<property>  
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>  
    <value>*=/opt/hadoop-2.7.7/etc/hadoop</value>  
</property>
<property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>hdfs://wq1:9000/user/root/share/lib</value>
</property>
<property>
    <name>oozie.subworkflow.classpath.inheritance</name>
    <value>true</value>
</property>
</configuration>
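
Because a missing or misspelled property only shows up as an error after Oozie starts, a quick check of the file beforehand can save a restart. The sketch below only tests that the property names used above are present; check_oozie_site is a hypothetical helper name, and it does not validate the values.

```shell
# Sketch: verify that oozie-site.xml mentions the properties configured above.
# check_oozie_site is a hypothetical helper; it checks names only, not values.
check_oozie_site() {
    local site="$1" missing=0 name
    for name in \
        oozie.service.JPAService.jdbc.driver \
        oozie.service.JPAService.jdbc.url \
        oozie.service.JPAService.jdbc.username \
        oozie.service.JPAService.jdbc.password \
        oozie.service.HadoopAccessorService.hadoop.configurations \
        oozie.service.WorkflowAppService.system.libpath; do
        # -F treats the pattern as a fixed string, so the dots are literal.
        if ! grep -qF "<name>${name}</name>" "$site"; then
            echo "missing property: ${name}"
            missing=1
        fi
    done
    return "$missing"
}

# Example call from inside oozieDistro-5.2.0:
# check_oozie_site conf/oozie-site.xml && echo "all expected properties present"
```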
  • Unpack oozie-examples.tar.gz and oozie-sharelib-5.2.0.tar.gz, then upload the extracted directories to /user/root on the HDFS cluster
hdfs dfs -put examples /user/root
hdfs dfs -put sharelib /user/root
  • vim /opt/hadoop-2.7.7/etc/hadoop/core-site.xml
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
  • vim /opt/hadoop-2.7.7/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
  • vim /opt/hadoop-2.7.7/etc/hadoop/yarn-site.xml
<property>
<name>yarn.resourcemanager.address</name>
<value>wq1:8032</value>
</property>
  • Start the JobHistory server
mr-jobhistory-daemon.sh start historyserver
[root@wq1 hadoop]# jps
2960 NameNode
3108 DataNode
20086 Jps
6807 RunJar
3982 RunJar
20046 JobHistoryServer
3295 SecondaryNameNode
  • Start Oozie
[root@wq1 oozieDistro-5.2.0]# oozie-start.sh
DONE
Creating composite indexes
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '5.2.0'


The SQL commands have been written to: /tmp/ooziedb-7098635515234390519.sql
[root@wq1 bin]# oozie admin --oozie http://localhost:11000/oozie -status
System mode: NORMAL
  • The example archive oozie-examples.tar.gz unpacks into a directory named examples
[root@wq1 map-reduce]# pwd
/opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce
[root@wq1 map-reduce]# ll
总用量 16
-rw-r--r--. 1  502 wheel 1011 3月  31 14:16 job.properties
-rw-r--r--. 1  502 wheel 1033 11月  7 2019 job-with-config-class.properties
drwxr-xr-x. 2 root root    38 4月   2 15:37 lib
-rw-r--r--. 1  502 wheel 2289 11月  7 2019 workflow-with-config-class.xml
-rw-r--r--. 1  502 wheel 2574 11月  7 2019 workflow.xml
  • vim job.properties, and make sure the values below match your Hadoop configuration
nameNode=hdfs://wq1:9000
resourceManager=localhost:8032
queueName=default
examplesRoot=examples

oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/workflow.xml
outputDir=map-reduce
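
To see which HDFS path the application path resolves to before submitting, the placeholders can be expanded by hand. The sketch below is a simplified imitation of that substitution, not Oozie's own resolver: prop and resolve_app_path are hypothetical helper names, and only the three placeholders used in this file are handled.

```shell
# Sketch: expand the placeholders in oozie.wf.application.path.
# prop and resolve_app_path are hypothetical helpers; this is a simplified
# imitation of the substitution, limited to the three placeholders used here.
prop() {
    # Print the value of key $2 from properties file $1.
    sed -n "s|^$2=||p" "$1" | head -n 1
}

resolve_app_path() {
    local file="$1" user="$2"
    local path name_node root
    path=$(prop "$file" oozie.wf.application.path)
    name_node=$(prop "$file" nameNode)
    root=$(prop "$file" examplesRoot)
    # [$] matches a literal dollar sign without relying on sed anchor rules.
    printf '%s\n' "$path" | sed \
        -e "s|[$]{nameNode}|${name_node}|g" \
        -e "s|[$]{user.name}|${user}|g" \
        -e "s|[$]{examplesRoot}|${root}|g"
}

# Example call for the root user:
# resolve_app_path job.properties root
```

The resolved path has to exist on HDFS, which is why the examples directory was uploaded to /user/root earlier.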
  • Run a job

oozie job -oozie http://wq1:11000/oozie -config /opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce/job.properties -run

[root@wq1 bin]# pwd
/opt/oozie-5.2.0/oozieDistro-5.2.0/bin
[root@wq1 bin]# oozie job -oozie http://wq1:11000/oozie -config /opt/oozie-5.2.0/oozieDistro-5.2.0/examples/apps/map-reduce/job.properties -run
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/embedded-oozie-server/webapp/WEB-INF/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/oozie-5.2.0/oozieDistro-5.2.0/libext/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
job: 0000000-210402153013353-oozie-root-W

Check the job status
oozie job -oozie http://wq1:11000/oozie -info 0000000-210402153013353-oozie-root-W

[root@wq1 bin]# oozie job -oozie http://wq1:11000/oozie -info  0000000-210402174239217-oozie-root-W
Job ID : 0000000-210402174239217-oozie-root-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path      : hdfs://wq1:9000/user/root/examples/apps/map-reduce/workflow.xml
Status        : RUNNING
Run           : 0
User          : root
Group         : -
Created       : 2021-04-02 09:45 GMT
Started       : 2021-04-02 09:45 GMT
Last Modified : 2021-04-02 09:45 GMT
Ended         : -
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID                                                                            Status    Ext ID                 Ext Status Err Code  
------------------------------------------------------------------------------------------------------------------------------------
0000000-210402174239217-oozie-root-W@mr-node                                  PREP      -                      -          -         
------------------------------------------------------------------------------------------------------------------------------------
0000000-210402174239217-oozie-root-W@:start:                                  OK        -                      OK         -         
------------------------------------------------------------------------------------------------------------------------------------
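
Rather than re-running the -info command by hand until the job finishes, the status line can be polled. This is a minimal sketch built on the output format shown above; job_status and wait_for_job are hypothetical helper names, and the default 10-second interval is an arbitrary choice.

```shell
# Sketch: poll `oozie job -info` until the workflow leaves PREP/RUNNING.
# job_status and wait_for_job are hypothetical helpers.
job_status() {
    # Read -info output on stdin and print the value of the first
    # line that starts with "Status", e.g. RUNNING.
    sed -n 's/^Status[[:space:]]*:[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}

wait_for_job() {
    local oozie_url="$1" job_id="$2" interval="${3:-10}" status
    while :; do
        status=$(oozie job -oozie "$oozie_url" -info "$job_id" | job_status)
        case "$status" in
            PREP|RUNNING) sleep "$interval" ;;
            # Any other value (including an empty one if the call failed)
            # ends the loop and is printed for the caller to inspect.
            *) printf '%s\n' "$status"; return 0 ;;
        esac
    done
}

# Example call with the server URL and a job ID from this walkthrough:
# wait_for_job http://wq1:11000/oozie 0000000-210402174239217-oozie-root-W
```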