Installing and Configuring Hue

Download address: http://archive.cloudera.com/cdh5/cdh/5/
It is recommended to download the package offline and then upload it to the cluster (version 3.9.0, about 169 MB).
See the official configuration documentation for details.
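
A minimal download-and-extract sketch (the tarball name below is hypothetical; pick the actual hue-3.9.0-cdh5.x.x build from the archive listing):

wget http://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.5.0.tar.gz
tar -zxf hue-3.9.0-cdh5.5.0.tar.gz -C /opt/
mv /opt/hue-3.9.0-cdh5.5.0 /opt/hue-3.9.0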

Tip: the following command finds the line number of a given string in a file:

[root@wq1 conf]# grep -n beeswax hue.ini 
1022:[beeswax]

Required dependencies:

yum install -y gcc libxml2-devel libxslt-devel cyrus-sasl-devel mysql-devel python-devel python-setuptools python-simplejson sqlite-devel ant gmp-devel cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi libffi-devel openldap-devel
yum install -y asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make openldap-devel python-devel sqlite-devel gmp-devel

Installation location: /opt/hue-3.9.0

[root@wq1 hue-3.9.0]# vim desktop/conf/hue.ini

Step 1: vim hue.ini

# General configuration
[desktop]
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
http_host=wq1
is_hue_4=true
time_zone=Asia/Shanghai
server_user=root
server_group=root
default_user=root
default_hdfs_superuser=root
# Configure MySQL as Hue's backing database (around line 617 of hue.ini)
[[database]]
engine=mysql
host=wq2
port=3306
user=specialwu
password=specialwu
name=hue

Step 2: create a database in MySQL to hold Hue's metadata

create database hue default character set utf8 default collate utf8_general_ci;
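
The [[database]] block above connects as specialwu from the Hue host, so that account also needs privileges on the new hue database; a minimal MySQL 5.x grant (tighten the host pattern to your security needs) would be:

grant all privileges on hue.* to 'specialwu'@'%' identified by 'specialwu';
flush privileges;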

Step 3: build (run inside the Hue installation directory)

[root@wq1 hue-3.9.0]# make apps
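
Because the hue database in MySQL is still empty, Hue's tables normally have to be created once before the first start; in Hue 3.x this is typically done with the bundled management commands (a sketch, verify against your build):

[root@wq1 hue-3.9.0]# build/env/bin/hue syncdb --noinput
[root@wq1 hue-3.9.0]# build/env/bin/hue migrate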

Step 4: start (add a hue user, then launch the supervisor)

[root@wq1 hue-3.9.0]# useradd hue;
[root@wq1 hue-3.9.0]# build/env/bin/supervisor
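
With no http_port set under [desktop], Hue listens on its default port 8888; once the supervisor is running, the web UI should respond at http://wq1:8888, which can be checked from the shell:

[root@wq1 ~]# curl -I http://wq1:8888/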

Integrating HDFS and YARN

vim hue.ini

# around line 910
[[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://wq1:9000
      webhdfs_url=http://wq1:50070/webhdfs/v1
      hadoop_conf_dir=/opt/hadoop-2.7.7/etc/hadoop
# around line 937
[[yarn_clusters]]
    [[[default]]]
      resourcemanager_host=wq1
      resourcemanager_port=8032
      submit_to=True
      resourcemanager_api_url=http://wq1:8088
      history_server_api_url=http://wq1:19888

Configuration files on the Hadoop side
Under /opt/hadoop-2.7.7/etc/hadoop, add the corresponding settings to the following configuration files.

hdfs-site.xml

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
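
After restarting HDFS, WebHDFS can be sanity-checked directly against the NameNode (same host and port as the webhdfs_url in hue.ini):

[root@wq1 ~]# curl "http://wq1:50070/webhdfs/v1/?op=LISTSTATUS&user.name=root"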

core-site.xml

<property>
 <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>

httpfs-site.xml (start the HttpFS service with httpfs.sh start)

<property>
  <name>httpfs.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>httpfs.proxyuser.root.groups</name>
  <value>*</value>
</property>
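
HttpFS listens on port 14000 and exposes the same /webhdfs/v1 REST API, so hue.ini's webhdfs_url can alternatively point at it; a quick check after starting it (assuming the proxyuser settings above are in place):

[root@wq1 ~]# httpfs.sh start
[root@wq1 ~]# curl "http://wq1:14000/webhdfs/v1/?op=LISTSTATUS&user.name=root"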

vim mapred-site.xml, then start the JobHistory server with mr-jobhistory-daemon.sh start historyserver

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>wq1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>wq1:19888</value>
</property>
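
After starting the JobHistory server, its REST endpoint (the same port 19888 configured in hue.ini) can be used to confirm it is up:

[root@wq1 ~]# mr-jobhistory-daemon.sh start historyserver
[root@wq1 ~]# curl http://wq1:19888/ws/v1/history/info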

If the settings above are not configured, Hue shows the warning: You are a Hue admin but not a HDFS superuser, "hdfs" or part of HDFS supergroup, "supergroup".
vim yarn-site.xml

<property>
  <name>yarn.resourcemanager.address</name>
  <value>wq1:8032</value>
</property>
<!-- Whether to enable log aggregation -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<!-- Log retention time, in seconds -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>106800</value>
</property>

Restart the daemons: start-dfs.sh, start-yarn.sh

Integrating Hive

vim hue.ini

[beeswax]
  hive_server_host=wq1
  hive_server_port=10000
  hive_conf_dir=/opt/hive-1.2.2/conf
  server_conn_timeout=120
[metastore]
  # Allow operations such as creating databases and tables through Hive
  enable_new_create_table=true

If these services are not running, Hue will report a failure connecting to Hive on port 10000.
Start Hive and HiveServer2, then restart Hue (/opt/hue-3.9.0/build/env/bin/supervisor):

hive
hiveserver2
supervisor
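
To confirm that HiveServer2 is accepting connections on port 10000 before testing from Hue, a quick beeline check (assuming no authentication is configured) looks like:

[root@wq1 ~]# beeline -u jdbc:hive2://wq1:10000 -n root -e "show databases;"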

Integrating MySQL

vim hue.ini
Note: this step integrates MySQL as a query source; the MySQL connection at the beginning of this article is used to store Hue's metadata.

# around line 1578
[[[mysql]]]
      nice_name="My SQL DB"
      engine=mysql
      host=wq2
      port=3306
      user=specialwu
      password=specialwu
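
For reference, in a stock hue.ini this [[[mysql]]] block sits under the [librdbms] section's [[databases]] sub-section (the exact line number varies between builds); the surrounding structure looks roughly like:

[librdbms]
  [[databases]]
    [[[mysql]]]
      nice_name="My SQL DB"
      engine=mysql
      host=wq2
      port=3306
      user=specialwu
      password=specialwu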

Integrating Oozie

vim hue.ini

  # around line 1185
  [oozie]
  # Location on local FS where the examples are stored.
  # local_data_dir=/export/servers/oozie-4.1.0-cdh5.14.0/examples/apps

  # Location on local FS where the data for the examples is stored.
  # sample_data_dir=/export/servers/oozie-4.1.0-cdh5.14.0/examples/input-data

  # Location on HDFS where the oozie examples and workflows are stored.
  # Parameters are $TIME and $USER, e.g. /user/$USER/hue/workspaces/workflow-$TIME
  # remote_data_dir=/user/root/oozie_works/examples/apps

  # Maximum of Oozie workflows or coodinators to retrieve in one API call.
  oozie_jobs_count=100

  # Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
  enable_cron_scheduling=true

  # Flag to enable the saved Editor queries to be dragged and dropped into a workflow.
  enable_document_action=true

  # Flag to enable Oozie backend filtering instead of doing it at the page level in Javascript. Requires Oozie 4.3+.
  enable_oozie_backend_filtering=true

  # Flag to enable the Impala action.
  enable_impala_action=true
  # around line 1216
  [filebrowser]
  # Location on local filesystem where the uploaded archives are temporary stored.
  archive_upload_tempdir=/tmp

  # Show Download Button for HDFS file browser.
  show_download_button=true

  # Show Upload Button for HDFS file browser.
  show_upload_button=true

  # Flag to enable the extraction of a uploaded archive in HDFS.
  enable_extract_uploaded_archive=true
  # around line 1442
[liboozie]
  # The URL where the Oozie service runs on. This is required in order for
  # users to submit jobs. Empty value disables the config check.
  oozie_url=http://wq1:11000/oozie
  # Location on HDFS where the workflows/coordinator are deployed when submitted.
  remote_deployement_dir=/user/root/oozie_works

vim oozie-site.xml

<property>                                         
        <name>oozie.service.ProxyUserService.proxyuser.root.hosts</name>
        <value>*</value>
</property>
<property>
        <name>oozie.service.ProxyUserService.proxyuser.root.groups</name>
        <value>*</value>
</property>

Start Oozie from its installation directory: bin/oozied.sh start (older builds also ship bin/oozie-start.sh).
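
Once Oozie is up, it can be verified with the Oozie CLI against the same URL configured in [liboozie]; a healthy server reports "System mode: NORMAL":

[root@wq1 ~]# oozie admin -oozie http://wq1:11000/oozie -status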

After all services have been added, check the running processes with jps:

wq1
89665 Jps
84562 SecondaryNameNode
85924 ResourceManager
86102 NodeManager
5240 EmbeddedOozieServer
84201 NameNode
117755 JobHistoryServer
84349 DataNode

Troubleshooting problems encountered during use

After executing a Hive SQL statement, the following error appears:

The auxService:mapreduce_shuffle does not exist

Add the following to yarn-site.xml, then restart YARN:

<property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
</property>
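
On Hadoop 2.7 the ShuffleHandler class is already the default implementation for this aux-service, but if the error persists some setups also pin it explicitly (an optional addition, not part of the original configuration):

<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>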