OS: CentOS-6.5-x86_64
JDK: jdk-8u111-linux-x64
Hadoop: hadoop-2.6.5
// View the current hostname
# hostname
// Change the hostname
# vim /etc/sysconfig/network
    NETWORKING   whether networking is enabled
    GATEWAY      default gateway IP
    GATEWAYDEV   interface name of the default gateway
    HOSTNAME     hostname
    DOMAIN       domain name
# vim /etc/sysconfig/network-scripts/ifcfg-eth0
    DEVICE       interface name (device / NIC)
    BOOTPROTO    how the IP is assigned (static: fixed IP, dhcp: DHCP, none: manual)
    HWADDR       MAC address
    ONBOOT       whether the interface is brought up at boot (yes/no)
    TYPE         network type (usually Ethernet)
    NETMASK      network mask
    IPADDR       IP address
    IPV6INIT     whether IPv6 is enabled (yes/no)
    GATEWAY      default gateway IP address
    DNS1 DNS2    DNS servers
My configuration is as follows:
DEVICE=eth0
HWADDR=00:0C:29:D3:53:77
TYPE=Ethernet
UUID=84d51ff5-228e-44ae-812d-7e59aa190715
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=192.168.1.10
GATEWAY=192.168.1.1
// These two entries are not needed when the VM uses NAT networking
DNS1=202.204.65.5
DNS2=202.204.65.6
# vim /etc/hosts
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
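As a quick check that the name mappings work (run from master, assuming slave1 and slave2 are already up at those addresses):

$ ping -c 1 slave1
$ ping -c 1 slave2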
// Stop the firewall for the current session
# service iptables stop
// Disable it permanently
# chkconfig iptables off
# service ip6tables stop
# chkconfig ip6tables off
# vim /etc/sysconfig/selinux
SELINUX=enforcing
// Change it to:
SELINUX=disabled
Then run the following commands:
# setenforce 0
# getenforce
If there is only the root user, or no hadoop user exists yet:
// Add the user
# useradd hadoop
// Set its password
# passwd hadoop
// Enter the password twice when prompted
Run this on all nodes; just press Enter at every prompt.
$ su hadoop
$ ssh-keygen -t rsa
Append the master node's id_rsa.pub to the authorized keys (only the master node's public key needs to be appended to authorized_keys):
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Change the permissions of authorized_keys; do this on every node:
$ chmod 600 authorized_keys
Copy authorized_keys to all slave nodes:
$ scp ~/.ssh/authorized_keys hadoop@192.168.1.11:~/.ssh/
$ scp ~/.ssh/authorized_keys hadoop@192.168.1.12:~/.ssh/
Verify that master can log in to all slave nodes without a password:
$ ssh slave1
$ ssh slave2
$ tar -zvxf hadoop-2.6.5.tar.gz
$ mv hadoop-2.6.5 ~/cloud/
$ ln -s /home/hadoop/cloud/hadoop-2.6.5 /home/hadoop/cloud/hadoop
Append the following at the end of the file:
# vim /etc/profile
# set hadoop environment
export HADOOP_HOME=/home/hadoop/cloud/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Make the environment variables take effect immediately. Note which user runs this command: the variables take effect only for that user.
# su hadoop
$ source /etc/profile
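A quick sanity check that the variables took effect for the hadoop user (hadoop version is on the PATH once $HADOOP_HOME/bin is included):

$ echo $HADOOP_HOME
$ hadoop version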
Note: the hadoop_tmp directory must be placed somewhere with plenty of disk space, otherwise errors will occur.
Problems you may run into:
(1) Unhealthy Nodes
http://blog.csdn.net/korder/a...
(2)local-dirs turned bad
(3) A Hadoop job hangs at: INFO mapreduce.Job: Running job
http://www.bkjia.com/yjs/1030...
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/cloud/hadoop/hadoop_tmp</value>
        <!-- the hadoop_tmp directory must be created manually -->
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
    </property>
</configuration>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/cloud/hadoop/dfs/name</value>
        <description>where the NameNode stores HDFS metadata</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/cloud/hadoop/dfs/data</value>
        <description>physical location of data blocks on the DataNodes</description>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
Note: WebHDFS on the NameNode uses port 50070, while WebHDFS on the DataNodes uses port 50075. To avoid having to distinguish the ports and perform all WebHDFS operations directly against the NameNode's IP and port, dfs.webhdfs.enabled must be set to true in hdfs-site.xml on all DataNodes.
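For example, assuming the hostnames above, a WebHDFS directory listing can go through the NameNode's port 50070, while reading a file is redirected to a DataNode on port 50075 (the /tmp paths here are only placeholders):

// List a directory via the NameNode's WebHDFS port
$ curl -i "http://master:50070/webhdfs/v1/tmp?op=LISTSTATUS"
// Reading a file returns a redirect to a DataNode on port 50075; -L follows it
$ curl -i -L "http://master:50070/webhdfs/v1/tmp/demo.txt?op=OPEN"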
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>NameNode:50030</value>
    </property>
</configuration>
JobHistory is a history server bundled with Hadoop that records completed MapReduce jobs. It is not started by default; start it with the following command:
$ sbin/mr-jobhistory-daemon.sh start historyserver
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>master:2181,slave1:2181,slave2:2181</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
</configuration>
Edit the slaves file and add the hostnames of the DataNode nodes to it:
slave1
slave2
vim /home/hadoop/cloud/hadoop/etc/hadoop/hadoop-env.sh
# change export JAVA_HOME=${JAVA_HOME} to:
export JAVA_HOME=/usr/java
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/hadoop/cloud/hadoop/lib/native
Finally, copy the entire /home/hadoop/cloud/hadoop-2.6.5 directory and its subdirectories to the same location on each slave with scp:
$ scp -r /home/hadoop/cloud/hadoop-2.6.5 hadoop@slave1:/home/hadoop/cloud/
$ scp -r /home/hadoop/cloud/hadoop-2.6.5 hadoop@slave2:/home/hadoop/cloud/
Make sure all directories referenced in the configuration files have been created.
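A minimal sketch of creating them, assuming the paths used in core-site.xml and hdfs-site.xml above (run as the hadoop user; dfs/name is only used on the NameNode and dfs/data on the DataNodes, but creating all three everywhere does no harm):

$ mkdir -p /home/hadoop/cloud/hadoop/hadoop_tmp
$ mkdir -p /home/hadoop/cloud/hadoop/dfs/name
$ mkdir -p /home/hadoop/cloud/hadoop/dfs/data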
$ hdfs namenode -format
On success, output similar to the following is shown:
************************************************************/
17/09/09 04:27:03 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/09/09 04:27:03 INFO namenode.NameNode: createNameNode [-format]
17/09/09 04:27:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-243cecfb-c003-4213-8112-b5f227616e39
17/09/09 04:27:04 INFO namenode.FSNamesystem: No KeyProvider found.
17/09/09 04:27:04 INFO namenode.FSNamesystem: fsLock is fair:true
17/09/09 04:27:04 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/09/09 04:27:04 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/09/09 04:27:04 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/09/09 04:27:04 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Sep 09 04:27:04
17/09/09 04:27:04 INFO util.GSet: Computing capacity for map BlocksMap
17/09/09 04:27:04 INFO util.GSet: VM type       = 64-bit
17/09/09 04:27:04 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
17/09/09 04:27:04 INFO util.GSet: capacity      = 2^21 = 2097152 entries
17/09/09 04:27:04 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/09/09 04:27:04 INFO blockmanagement.BlockManager: defaultReplication         = 2
17/09/09 04:27:04 INFO blockmanagement.BlockManager: maxReplication             = 512
17/09/09 04:27:04 INFO blockmanagement.BlockManager: minReplication             = 1
17/09/09 04:27:04 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
17/09/09 04:27:04 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/09/09 04:27:04 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
17/09/09 04:27:04 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
17/09/09 04:27:04 INFO namenode.FSNamesystem: fsOwner             = hadoop (auth:SIMPLE)
17/09/09 04:27:04 INFO namenode.FSNamesystem: supergroup          = supergroup
17/09/09 04:27:04 INFO namenode.FSNamesystem: isPermissionEnabled = false
17/09/09 04:27:04 INFO namenode.FSNamesystem: HA Enabled: false
17/09/09 04:27:04 INFO namenode.FSNamesystem: Append Enabled: true
17/09/09 04:27:05 INFO util.GSet: Computing capacity for map INodeMap
17/09/09 04:27:05 INFO util.GSet: VM type       = 64-bit
17/09/09 04:27:05 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
17/09/09 04:27:05 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/09/09 04:27:05 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/09/09 04:27:05 INFO util.GSet: Computing capacity for map cachedBlocks
17/09/09 04:27:05 INFO util.GSet: VM type       = 64-bit
17/09/09 04:27:05 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
17/09/09 04:27:05 INFO util.GSet: capacity      = 2^18 = 262144 entries
17/09/09 04:27:05 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/09/09 04:27:05 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/09/09 04:27:05 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/09/09 04:27:05 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/09/09 04:27:05 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/09/09 04:27:05 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/09/09 04:27:05 INFO util.GSet: VM type       = 64-bit
17/09/09 04:27:05 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
17/09/09 04:27:05 INFO util.GSet: capacity      = 2^15 = 32768 entries
17/09/09 04:27:05 INFO namenode.NNConf: ACLs enabled? false
17/09/09 04:27:05 INFO namenode.NNConf: XAttrs enabled? true
17/09/09 04:27:05 INFO namenode.NNConf: Maximum size of an xattr: 16384
17/09/09 04:27:05 INFO namenode.FSImage: Allocated new BlockPoolId: BP-706635769-192.168.32.100-1504902425219
17/09/09 04:27:05 INFO common.Storage: Storage directory /home/hadoop/cloud/hadoop/dfs/name has been successfully formatted.
17/09/09 04:27:05 INFO namenode.FSImageFormatProtobuf: Saving image file /home/hadoop/cloud/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/09/09 04:27:05 INFO namenode.FSImageFormatProtobuf: Image file /home/hadoop/cloud/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
17/09/09 04:27:05 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/09/09 04:27:05 INFO util.ExitUtil: Exiting with status 0
17/09/09 04:27:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.32.100
************************************************************/
$ start-dfs.sh
$ start-yarn.sh
// Or replace both with a single command:
$ start-all.sh
Use jps to check the processes. (1) Processes on the master node:
8193 Jps
7943 ResourceManager
7624 NameNode
7802 SecondaryNameNode
(2) Processes on the slave (data) nodes:
1413 DataNode
1512 NodeManager
1626 Jps
Overview: http://172.16.1.156:50070/
Cluster: http://172.16.1.156:8088/
JobHistory: http://172.16.1.156:19888
Run the wordcount example:
$ vi wordcount.txt
hello you
hello me
hello everyone
$ hadoop fs -mkdir -p /data/wordcount
$ hadoop fs -mkdir -p /output/
The /data/wordcount directory holds the input data for Hadoop's built-in WordCount example; the output of the MapReduce job goes to the /output/wordcount directory.
$ hadoop fs -put wordcount.txt /data/wordcount/
$ hadoop jar /home/hadoop/cloud/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /data/wordcount /output/wordcount/
# hadoop fs -text /output/wordcount/part-r-00000
everyone 1
hello 3
me 1
you 1
While configuring the environment variables you may find that commands such as ls are no longer recognized: -bash: ls: command not found
Cause: the profile file was edited incorrectly; a colon in export PATH=$JAVA_HOME/bin:$PATH was mistyped as a semicolon, so ls and other commands can no longer be found. Fix: export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
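A recovery sketch under those assumptions (the /usr/java value matches the JAVA_HOME used in hadoop-env.sh above; adjust to your own install):

// Restore a usable PATH for the current shell first
$ export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
// Then fix the line in /etc/profile -- colon, not semicolon
export JAVA_HOME=/usr/java
export PATH=$JAVA_HOME/bin:$PATH
// Reload it
$ source /etc/profile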
After starting the Hadoop cluster on the master, checking the processes on master and slaves with jps shows the ResourceManager on the master and a NodeManager on each slave, but after a while the NodeManagers on the slaves disappear while the ResourceManager on the master keeps running.
The cause is that the firewall is still enabled:
Note: after starting, a NodeManager must periodically heartbeat to the ResourceManager; otherwise the RM considers the NM dead and the NM service is stopped.
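To confirm and fix this on each slave, the same commands as in the firewall section above can be used:

// Check whether iptables is still running
# service iptables status
// Stop it now and keep it disabled across reboots
# service iptables stop
# chkconfig iptables off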
The sshd service has UseDNS yes set; if the configured DNS server becomes unreachable, connecting to the server may take 10 to 30 seconds, because with UseDNS the sshd server reverse-resolves the connecting client's IP address, even on a local network.
If connections that are normally fast suddenly become very slow, the DNS configured on the sshd server may have failed, for example an external DNS server is configured and the external link is down. The definitive fix is to stop using UseDNS: in /etc/sshd_config (on some Linux distributions /etc/ssh/sshd_config) find UseDNS and set it to no, removing the leading # if present, then restart the sshd service.
vim /etc/ssh/sshd_config
UseDNS no
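Then restart sshd so the change takes effect (CentOS 6 service command):

# service sshd restart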
FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join java.io.IOException: There appears to be a gap in the edit log. We expected txid 176531929, but got txid 176533587.
Cause: the data on the NameNode and DataNodes has become inconsistent.
Fix: delete the contents of the data and name directories on the master and slave nodes, as sketched below. The drawback is that the data cannot be recovered.
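A sketch of that destructive fix, assuming the directory layout configured above; every block in HDFS is lost, and the NameNode has to be reformatted afterwards:

// Stop the cluster first (on master)
$ stop-all.sh
// On master and on every slave, clear the metadata and block directories
$ rm -rf /home/hadoop/cloud/hadoop/dfs/name/*
$ rm -rf /home/hadoop/cloud/hadoop/dfs/data/*
$ rm -rf /home/hadoop/cloud/hadoop/hadoop_tmp/*
// Reformat HDFS on master, then restart
$ hdfs namenode -format
$ start-all.sh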
Another fix: http://blog.csdn.net/amber_am...
Reference links:
https://yq.aliyun.com/article...
https://taoistwar.gitbooks.io...
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
I assume you're running Hadoop on 64-bit CentOS. The reason you saw that warning is that the native Hadoop library $HADOOP_HOME/lib/native/libhadoop.so.1.0.0 was actually compiled on 32 bit. Anyway, it's just a warning and won't impact Hadoop's functionality.
http://stackoverflow.com/ques...
(1) The simple fix (I later found that both step (1) and step (2) below are needed):
Download the 64-bit native libraries and extract them into hadoop-2.7.0/lib/native/; the warning no longer appears.
Download: http://dl.bintray.com/sequenc...
(2) Edit hadoop-env.sh:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR="/usr/local/hadoop/lib/native/"
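After the change, whether the native library is actually being loaded can be checked with Hadoop's own checknative tool:

$ hadoop checknative -a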
Fix for a submitted Hadoop job that hangs and never proceeds, stuck at: INFO mapreduce.Job: Running job: job_1474517485267_0001
Here, add the following configuration to yarn-site.xml across the cluster:
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
Restart the cluster and run the jar again.
However, this did not solve my problem, which turned out to be Unhealthy Nodes, as I only discovered at the end! The original configuration may well have been correct even without the settings above.
http://www.voidcn.com/blog/ga...
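As a quick command-line check of node health, in addition to the ResourceManager web UI on port 8088, the standard YARN CLI can list the registered nodes:

// Lists nodes in RUNNING state; a missing slave points at an unhealthy or lost NodeManager
$ yarn node -list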
Sunday, January 22, 2017
update: 2017-06-02
Added the basic OS setup section
Revised parts of the configuration files
update: 2017-10-11 migrated to SegmentFault