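This article walks through a fully offline deployment of a CDH 6 cluster on CentOS 7.4. Deploying a CDH 6 cluster differs considerably from deploying a CDH 5 cluster.
Note: all operations in this article are performed as the root user.
First, the files required to install the CDH 6 cluster must be downloaded in advance on a machine with internet access.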
CM6 RPMs: https://archive.cloudera.com/...
Download all RPM files under this link and save them to the `cloudera-repos` directory.
ASC file: https://archive.cloudera.com/...
Also download the asc file and save it to the same `cloudera-repos` directory:
[root@node01 upload]# tree cloudera-repos/
cloudera-repos/
├── allkeys.asc
├── cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
├── enterprise-debuginfo-6.3.0-1281944.el7.x86_64.rpm
└── oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
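Before carrying the directory into the offline network, it may be worth confirming that nothing is missing. The following is only a sketch and not part of the original procedure: `check_cm_repo_dir` is a hypothetical helper, and the list of required package prefixes is an assumption taken from the listing above.

```shell
# Hypothetical helper: verify that a directory holds the files listed above.
# Usage: check_cm_repo_dir <dir>; returns 0 when everything is present.
check_cm_repo_dir() {
    repo_dir=$1
    missing=0

    # The key file must be present under its exact name.
    if [ ! -f "$repo_dir/allkeys.asc" ]; then
        echo "missing: allkeys.asc"
        missing=1
    fi

    # Assumption: only name prefixes are checked, because the version suffix
    # changes between CM releases. The [0-9] after the dash keeps
    # cloudera-manager-server-db-2 from being counted as cloudera-manager-server.
    for prefix in cloudera-manager-agent cloudera-manager-daemons \
                  cloudera-manager-server oracle-j2sdk1.8; do
        found=0
        for f in "$repo_dir/$prefix"-[0-9]*.rpm; do
            if [ -f "$f" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing: $prefix-<version>.rpm"
            missing=1
        fi
    done

    return "$missing"
}

# Example call (path from the article):
# check_cm_repo_dir /data6/upload/cloudera-repos
```

The helper prints one line per missing file and returns a non-zero status if anything is absent, so it can gate a copy script.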
The JDBC driver must be version 5.1.26 or later; download mysql-connector-java-5.1.47.tar.gz.
Note: do not try to use FTP to host the CM YUM repository!
First install `httpd` and `createrepo`:
yum -y install httpd createrepo
Start the `httpd` service and enable it at boot:
systemctl start httpd
systemctl enable httpd
Then change into the `cloudera-repos` directory prepared earlier, which holds the Cloudera Manager RPM packages:
cd /data6/upload/cloudera-repos/
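Generate the RPM metadata, then return to the parent directory to set the permissions:
createrepo .
cd ..
chmod -R 777 cloudera-repos
Then move the `cloudera-repos` directory into httpd's html directory: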
mv cloudera-repos /var/www/html/
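Make sure the RPM packages can be viewed in a browser (with the repo file below, at http://node01/cloudera-repos/).
Next, create the repo file for CM6 (this must be configured on every node):
cd /etc/yum.repos.d
vim cloudera-manager.repo
Add the following content: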
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=http://node01/cloudera-repos/
gpgcheck=0
enabled=1
Save and exit, then run `yum clean all && yum makecache`:
[root@master02 ~]# yum clean all && yum makecache
Loaded plugins: fastestmirror, langpacks
Cleaning repos: ChinaUnicom-Packages cloudera-manager
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Loaded plugins: fastestmirror, langpacks
ChinaUnicom-Packages | 3.6 kB 00:00:00
cloudera-manager | 2.9 kB 00:00:00
(1/7): ChinaUnicom-Packages/group_gz | 156 kB 00:00:00
(2/7): ChinaUnicom-Packages/filelists_db | 3.1 MB 00:00:00
(3/7): ChinaUnicom-Packages/primary_db | 3.1 MB 00:00:00
(4/7): ChinaUnicom-Packages/other_db | 1.2 MB 00:00:00
(5/7): cloudera-manager/filelists_db | 118 kB 00:00:00
(6/7): cloudera-manager/other_db | 1.0 kB 00:00:00
(7/7): cloudera-manager/primary_db | 8.6 kB 00:00:00
Determining fastest mirrors
Metadata Cache Created
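Since the repo file has to exist on every node, editing it by hand in vim gets tedious. As a rough sketch (the function name `write_cm_repo` is made up for illustration; the file content is the one shown above), the file can be written non-interactively:

```shell
# Hypothetical helper: write the cloudera-manager.repo file without an editor.
# Usage: write_cm_repo <baseurl> <output-file>
write_cm_repo() {
    baseurl=$1
    outfile=$2
    # The heredoc body mirrors the repo definition from the article;
    # only baseurl is parameterized.
    cat > "$outfile" <<EOF
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=$baseurl
gpgcheck=0
enabled=1
EOF
}

# Example call on each node (values from the article):
# write_cm_repo http://node01/cloudera-repos/ /etc/yum.repos.d/cloudera-manager.repo
```

After calling it, `yum clean all && yum makecache` still has to be run on that node as described above.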
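This step only needs to be done on the CM Server node.
Run the following commands:
# Install JDK 8 (the Oracle JDK package from the CM repository)
yum install oracle-j2sdk1.8
# Install Cloudera Manager (on the server node only)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
These packages pull in many dependencies, so it is worth setting up a YUM repository on the local network; otherwise the RPM packages have to be installed manually.
After Cloudera Manager Server is installed, change into the local parcel repository directory: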
cd /opt/cloudera/parcel-repo
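Upload the CDH parcel files downloaded in the first part to this directory, then rename the `.sha1` file to `.sha`:
mv /data6/upload/parcels/* /opt/cloudera/parcel-repo/
mv CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha1 CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
Then run the following command to change the file owner: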
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
The final contents of `/opt/cloudera/parcel-repo` are as follows:
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
└── manifest.json
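A parcel that was damaged during upload is easier to catch now than in the installation wizard. The sketch below assumes that the `.sha` file carries the SHA-1 hex digest as its first field; `verify_parcel_sha` is a hypothetical helper name, not a Cloudera tool.

```shell
# Hypothetical helper: compare a parcel's SHA-1 with the hash stored next to it.
# Usage: verify_parcel_sha <parcel-file>; returns 0 when the hashes match.
verify_parcel_sha() {
    parcel=$1
    # Assumption: the first whitespace-separated field of <parcel>.sha is the digest.
    expected=$(awk 'NR == 1 { print $1 }' "$parcel.sha")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel"
        return 1
    fi
}

# Example call (file name from the article):
# verify_parcel_sha /opt/cloudera/parcel-repo/CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
```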
MySQL installation was covered in the environment preparation part, so it is skipped here.
Cloudera provides a recommended MySQL configuration:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Extract `mysql-connector-java-5.1.47-bin.jar` from the `mysql-connector-java-5.1.47.tar.gz` archive downloaded earlier, copy it to `/usr/share/java/` on the CM Server node, and rename it to `mysql-connector-java.jar` (create `/usr/share/java/` manually if it does not exist). The archive unpacks into a `mysql-connector-java-5.1.47/` subdirectory:
tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
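Create the databases and database users for the services you plan to install, following the table below. The databases must use utf8 encoding. Record each username and its password when creating them:
Service | Database | User |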
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
Create the databases and their users:
# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
# flush
FLUSH PRIVILEGES;
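The statements above repeat the same two lines for every service, and they use the username as the password. As a sketch only, a small generator can print the same pair of statements for any list of services with passwords of your choosing. `gen_cdh_sql` is a hypothetical helper, and it assumes that database names and usernames contain no colons and that passwords contain no single quotes.

```shell
# Hypothetical helper: print CREATE DATABASE / GRANT statements
# for a list of "database:user:password" triples.
# Usage: gen_cdh_sql scm:scm:secret1 metastore:hive:secret2 ...
gen_cdh_sql() {
    for entry in "$@"; do
        db=${entry%%:*}        # text before the first colon
        rest=${entry#*:}       # text after the first colon
        user=${rest%%:*}       # text between the first and second colon
        password=${rest#*:}    # everything after the second colon
        echo "CREATE DATABASE $db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
        echo "GRANT ALL ON $db.* TO '$user'@'%' IDENTIFIED BY '$password';"
    done
    echo "FLUSH PRIVILEGES;"
}

# Example: review the output first, then pipe it into the mysql client.
# gen_cdh_sql scm:scm:scm amon:amon:amon metastore:hive:hive | mysql -uroot -p
```

Printing the SQL instead of executing it keeps the step reviewable, which matters because the passwords end up in the statement text.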
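Cloudera Manager Server ships with a script that configures its database. The arguments are the database type, database name, and database user.
If MySQL runs on the same host as CM Server:
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
If MySQL runs on a different host, pass the MySQL host with `-h` and the CM Server host with `--scm-host`:
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h <mysql-host-ip> --scm-host <cm-server-ip> scm scm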
[root@master02 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.172.54.51 --scm-host 10.172.54.52 scm scm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
systemctl start cloudera-scm-server
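Then wait for Cloudera Manager Server to start, which may take a while. You can monitor the startup with `tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log`.
When the log line `INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.` is printed, the service has started successfully and the Cloudera Manager web UI can be opened in a browser.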
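Watching `tail -f` works interactively, but a script needs something it can block on. Here is a minimal sketch that polls the log for the Jetty line quoted above; `wait_for_cm_server` is a hypothetical helper, and the one-second polling interval is an arbitrary choice.

```shell
# Hypothetical helper: poll a log file until the Jetty startup line appears.
# Usage: wait_for_cm_server <log-file> <timeout-seconds>
# Returns 0 once the line is found, 1 if the timeout expires first.
wait_for_cm_server() {
    logfile=$1
    timeout=$2
    waited=0
    while :; do
        # grep errors (for example, the log does not exist yet) count as "not started".
        if grep -q 'Started Jetty server' "$logfile" 2>/dev/null; then
            echo "Cloudera Manager Server is up"
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "Timed out after ${timeout}s"
    return 1
}

# Example call (log path from the article, timeout chosen arbitrarily):
# wait_for_cm_server /var/log/cloudera-scm-server/cloudera-scm-server.log 300
```

One caveat: the log is appended across restarts, so a line left over from an earlier start also matches. Treat the result as a hint unless the log was rotated or truncated beforehand.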
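Open a browser and go to http://<server_host>:7180. The default username and password are both admin.
First comes the Cloudera Manager welcome page. Click the "Continue" button at the bottom right to proceed.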
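### Accept the terms
Tick the box to accept the terms and click "Continue" to proceed.
### Select an edition
Here I choose the free edition.
After the edition is selected, a second welcome screen appears; this one is the welcome page for installing the cluster.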
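This step searches for and selects the hosts on which to install the CDH cluster. Enter the hostname of each node in the input box next to the hostname label, separated by commas, click Search, and tick the nodes to install CDH on in the result list.
Choose the custom repository option here and enter the URL of the Cloudera Manager YUM repository set up with httpd above.
If the earlier "configure the local parcel repository" step was done correctly, "Use Parcels" is selected automatically here and the CDH version is loaded. Confirm it and click "Continue".
As a result, there is no need to install Cloudera Manager Agent manually.
I leave the JDK installation unchecked at this step because the JDK was already installed during environment preparation. Uncheck it and continue.
This step configures SSH login between the cluster hosts. Enter the root user's password and choose a suitable value for the number of simultaneous installations according to the cluster configuration.
At this step the Agent is installed on the nodes automatically; wait a moment for it to finish.
This step is also automatic. The speed of parcel distribution depends mainly on the network, so just wait patiently.
Wait for the check to complete.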
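Cloudera recommends setting `/proc/sys/vm/swappiness` to a maximum of 10. The current setting is 30. Use the `sysctl` command to change the setting at runtime and edit `/etc/sysctl.conf` so that the setting is kept after a reboot. You can continue with the installation, but Cloudera Manager may report that your hosts are unhealthy because they are swapping.
Temporary change:
sysctl vm.swappiness=10
cat /proc/sys/vm/swappiness
The change takes effect immediately, but the value reverts to 30 after a reboot.
Permanent change:
Add the following parameter to `/etc/sysctl.conf`: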
vm.swappiness=10
Or:
echo 'vm.swappiness=10'>> /etc/sysctl.conf
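Appending with `>>` adds a new line on every run, so running it twice leaves duplicate entries in the file. The sketch below sets the value idempotently instead; `set_conf_value` is a hypothetical helper written for this article, not a standard tool.

```shell
# Hypothetical helper: set key=value in a sysctl-style file,
# replacing any existing assignment of the same key.
# Usage: set_conf_value <key> <value> <file>
set_conf_value() {
    key=$1
    value=$2
    file=$3
    tmpfile=$(mktemp)
    touch "$file"
    # Copy every line except existing assignments of this key
    # (whitespace around the name is ignored; commented lines are kept).
    awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmpfile"
    echo "$key=$value" >> "$tmpfile"
    # Write back through the original path so owner and permissions are preserved.
    cat "$tmpfile" > "$file"
    rm -f "$tmpfile"
}

# Example call (values from the article):
# set_conf_value vm.swappiness 10 /etc/sysctl.conf
```

Running `sysctl -p` afterwards loads the values from `/etc/sysctl.conf` without a reboot.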
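Transparent Huge Page Compaction is enabled and can cause significant performance problems. Run `echo never > /sys/kernel/mm/transparent_hugepage/defrag` and `echo never > /sys/kernel/mm/transparent_hugepage/enabled` to disable this setting, then add the same commands to an init script such as `/etc/rc.local` so that it is set again when the system reboots.
Simply follow the prompt above.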
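To confirm that the setting took effect, or that it survived a reboot, the two sysfs files can be read back. The kernel marks the active choice with square brackets, for example `always madvise [never]`. The helpers below are a sketch with made-up names; the optional directory argument exists only so the logic can be tried against ordinary files.

```shell
# Hypothetical helper: print the active mode from a sysfs-style file,
# i.e. the word enclosed in square brackets.
# Usage: thp_mode <file>
thp_mode() {
    sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}

# Hypothetical check: succeed only if both settings named in the warning are "never".
# Usage: thp_disabled [base-dir]   (defaults to the real sysfs location)
thp_disabled() {
    base=${1:-/sys/kernel/mm/transparent_hugepage}
    [ "$(thp_mode "$base/enabled")" = "never" ] &&
        [ "$(thp_mode "$base/defrag")" = "never" ]
}

# Example call:
# thp_disabled && echo "THP is off" || echo "THP is still enabled"
```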
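Here I choose custom services: ZooKeeper, HDFS, and YARN.
You can install the basic components first and add others as they are needed. Installing every service at once may cause many problems during installation.
CDH proposes a role assignment automatically. If it seems unreasonable, adjust it manually, and keep the roles balanced across hosts.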
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-1.8.so, Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni.so, no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-64-1-4792575431304239050.8: libstdc++.so.6: : , /tmp/libleveldbjni-64-1-6079277982211108711.8: libstdc++.so.6: : ]
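Cause: this is related to a glibc upgrade on the current node. Since the leveldbjni library cannot be found, install one.
The way to install the leveldbjni library is rather interesting:
1) First download leveldbjni-all-1.8.jar
2) Unpack the jar and find the libleveldbjni.so file under the META-INF/native/linux64 directory
3) Upload libleveldbjni.so to a directory on the java.library.path mentioned in the error above
If Cloudera Manager needs to be uninstalled for any reason, run the following steps on each node.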
systemctl stop cloudera-scm-server
systemctl stop cloudera-scm-agent
yum -y remove 'cloudera-manager-*'
yum clean all
umount cm_processes
umount /var/run/cloudera-scm-agent/process
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
rm -rf /tmp/.scmpreparenode.lock
rm -Rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie /var/lib/pgsql /var/lib/sqoop2 /data/dfs/ /data/impala/ /data/yarn/ /dfs/ /impala/ /yarn/ /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog
systemctl stop mariadb
yum -y remove mariadb-*
rm -rf /var/lib/mysql
rm -rf /var/log/mysqld.log
rm -rf /usr/lib64/mysql
rm -rf /usr/share/mysql
rm -rf /opt/cloudera
ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2019-08-27 20:35:50,469 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2019-08-27 20:35:50,600 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
    at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.abort(NettyRequestSender.java:422)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:290)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:117)
    at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:87)
    at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
    at com.ning.http.client.AsyncHttpClient$BoundRequestBuilder.execute(AsyncHttpClient.java:229)
    at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfoFuture(ParcelDownloaderImpl.java:592)
    at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfo(ParcelDownloaderImpl.java:544)
    at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:357)
    at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:464)
    at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:459)
    at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
    at java.net.InetAddress.getAllByName(InetAddress.java:1192)
    at java.net.InetAddress.getAllByName(InetAddress.java:1126)
    at java.net.InetAddress.getByName(InetAddress.java:1076)
    at com.ning.http.client.NameResolver$JdkNameResolver.resolve(NameResolver.java:28)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.remoteAddress(NettyRequestSender.java:358)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.connect(NettyRequestSender.java:369)
    at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:283)
    ... 15 more
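This does not affect usage: the error only means that archive.cloudera.com cannot be resolved from the offline network, and the parcels are served from the local repository.
Error when installing Hive: org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported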
[root@master01 ~]# rpm -qa|grep mysql-connector-java
mysql-connector-java-5.1.25-3.el7.noarch
The JDBC driver version is wrong: version 5.1.26 or later is required, but 5.1.25 is installed.
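This kind of mismatch can be caught before the Hive installation with a quick version comparison. The following sketch relies on `sort -V` (version sort) from GNU coreutils; `version_ge` and `connector_version` are hypothetical helper names introduced for illustration.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` from GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: extract the dotted version from a package or jar name
# such as the one printed by rpm above.
connector_version() {
    echo "$1" | sed -n 's/^mysql-connector-java-\([0-9][0-9.]*[0-9]\).*/\1/p'
}

# Example: check the installed RPM against the 5.1.26 minimum from the article.
# installed=$(connector_version "$(rpm -qa | grep mysql-connector-java)")
# version_ge "$installed" 5.1.26 || echo "JDBC driver $installed is too old"
```

With the package shown above, `connector_version` yields `5.1.25`, and `version_ge 5.1.25 5.1.26` fails, which matches the error seen during the Hive installation.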