1. Environment Setup
1. CentOS
[root@master hadoop-2.7.2]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[root@master hadoop]# uname -r
3.10.0-229.20.1.el7.x86_64
2. JDK (jdk8u51)
[root@master hadoop]# java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
3. Hadoop (2.7.2)
http://hadoop.apache.org/releases.html#25+January%2C+2016%3A+Release+2.7.2+%28stable%29+available
4. Basic Environment
Five virtual machines were created on an ESXi 6.0 server, configured as follows:
[root@master hadoop]# more /etc/hosts
127.0.0.1     localhost localhost.localdomain localhost4 localhost4.localdomain4
::1           localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.171 master.hadoop master
192.168.1.172 slave1.hadoop slave1
192.168.1.173 slave2.hadoop slave2
192.168.1.174 slave3.hadoop slave3
192.168.1.175 slave4.hadoop slave4
2. Configuration and Installation
1. Install the JDK
(1) Download the JDK from the URL given above.
(2) Unpack the JDK. Run tar -zxvf jdk-8u51-linux-x64.gz -C /softall/ to extract it into the /softall/ directory.
(3) Configure the environment variables by editing /etc/profile:
export JAVA_HOME=/softall/jdk1.8.0_51
export JRE_HOME=/softall/jdk1.8.0_51/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
(4) Verify:
[root@master hadoop]# java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
2. Passwordless SSH Login
While Hadoop runs it needs to manage remote Hadoop daemons: once started, the NameNode uses SSH (Secure Shell) to start and stop the various daemons on each DataNode. Commands must therefore be executable between nodes without a password prompt, so we configure SSH for passwordless public-key authentication. The NameNode can then log in to each DataNode over SSH without a password to start its processes, and by the same principle each DataNode can log in to the NameNode.
PS: Newcomers usually find this step confusing at first, but once you understand the principle it is not hard. Think of a private club: normally you get in with a membership card (the password). To improve the experience, the club switches to facial recognition: members have their photo registered in advance, and on the next visit the camera captures the member's face, matches it against the stored image, and lets them in with no card at all. Much more convenient.
Here, each server's public key file must be handed to all the other servers and stored there; after that, any two servers can access each other without a password.
Concretely, the master and every slave each generate their own public/private key pair and place their public key in the authorized_keys file of every other machine.
(1) On every machine, edit the SSH configuration to allow passwordless login: uncomment the following three lines in /etc/ssh/sshd_config. Do this on every server, and restart sshd afterwards (systemctl restart sshd) so the change takes effect.
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
(2) Generate the key pair. Enter:
ssh-keygen -t rsa -P ''
Press Enter at the prompt; the generated key pair, id_rsa and id_rsa.pub, is stored under ~/.ssh by default.
[root@master utils]# ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
9e:a5:0b:7a:87:49:45:13:ec:1b:6d:0f:c7:9a:7c:c3 root@master.hadoop
The key's randomart image is:
+--[ RSA 2048]----+
|          ...    |
|           +     |
|        o o .    |
|       + + o     |
|      .S=.B      |
|      ...++ E    |
|     ..o+ . .    |
|      .+...      |
|       .. ..     |
+-----------------+
(3) Send every slave machine's public key to the master
On slave1, run:
scp ~/.ssh/id_rsa.pub root@master:~/.ssh/id_rsa.slave1.pub
In the same way, send the public keys of slave2, slave3, and slave4 to the master, then check on the master:
[root@master .ssh]# ll
total 32
-rw------- 1 root root 2000 Jun  6 16:46 authorized_keys
-rw------- 1 root root 1675 Jun  6 16:26 id_rsa
-rw-r--r-- 1 root root  400 Jun  6 16:26 id_rsa.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave1.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave2.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave3.pub
-rw-r--r-- 1 root root  400 Jun  6 16:46 id_rsa.slave4.pub
-rw-r--r-- 1 root root  728 Jun  6 16:27 known_hosts
Append all the public key files to the authorized keys file:
[root@master .ssh]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@master .ssh]# cat ~/.ssh/id_rsa.slave1.pub >> ~/.ssh/authorized_keys
[root@master .ssh]# cat ~/.ssh/id_rsa.slave2.pub >> ~/.ssh/authorized_keys
[root@master .ssh]# cat ~/.ssh/id_rsa.slave3.pub >> ~/.ssh/authorized_keys
[root@master .ssh]# cat ~/.ssh/id_rsa.slave4.pub >> ~/.ssh/authorized_keys
You can inspect the file to verify the result:
[root@master .ssh]# more authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQClpmQd2fUgawiH+RDkgtZDViT98L1D8u8Jx44dv4gci1nNt0TQCoSHK43QnT5/5Ncf4h6II3oYN8o6TrnDF8PXKP2rR0HULmHMUQf0qy45pmM5oUCwbZ1mYggB/v77WS9MM2IBcjlPaNb17jvFWvkVGP+zUTfkuv7XfK1RY0CvNl55MFQBB/TbaB8o/8KHVVN7XmUWiRB68cFmRiBiaBuY97IFMbDmADBA+4cHMGiZ9hYNzKw+61Hw4H+OlhVv5cuth24KlUL/cAed7f1Qh/ToP6aVYfUxmgf9Jc4pAaAss44UNGg0O2RodHsbIenVtYS/T/13iGWjmLckW9aKAFwP root@master.hadoop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQH6DbFni2J+A6iSA+fcLRdEOZn/HLFPGSjjkd0VqdXkGhakGGlskLNL2zr7f5nmJonPF64OKjW5fsvdmSRlPXnYhlYT/dF6hw4gYxQIksd5Cm1X2AB6B5C3WpiRif3m8L0cd99X9EE55rx5hVD0UxMVK0AIAF6Ao1opra1jUm0r0r7ddPJBhClE5nN8b1LZf/QaQHkmWkjO4KqFN6+QrEoEoT2cGPTV08Z+yOsRcognP4eJuc5PnxpY0pCpznstqAsNfPCi4KJwwpGpQ3ZFpwBYpwhiTFatSc95qtY8ZaQncomjmiJeUCVHpVO+pegdR0J4rhV122U5/6kuWgA6Mt root@slave1.hadoop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMb6GS9lvOEffh9ntF9q/UeW8bJ/s3U5DMq+696YLSBV9e34cQJo7xcfZhOMHHmJb2/AgCWMkV7LF2y0/+YUzKJZvdrvHSJ3aHCbBnFJ44srNR+754ZDnythyaoyZnx+0bEOsdeIO7HRPuRFKRxma64V8hV7MqO8/K1a3sT9yz2VRcoL1huAyfG8zQPZ7nT0PrMowV3b4CPwdMTIHK6fUjteIBFLIy/CWPWKD2o7bEEh2rxfqhVEGaHi5+EN7Ztex0lmOYzuyBShUYnz4q8uC7EHCEdMBlH+E04SvUF8n/6KoUPEJ25kVlSM3aqyDDO6CHq9R58iYmODmp9bn2nzgF root@slave2.hadoop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDCrK1J7k9m/x1xIdtE0aCuCWI96OmgZocJ99TvMDp75jzlnWNsDjGHKYIh0UalPzKqjXa8JPLvrJPvUSbKVIvO7CiitUMviPz/EPUZnTnDuZVEEV33nPTeTNZsdw/EAh+lIkwscdRXNtoLyzKgJwfeAbTvegiBP9XuHt6GMtvf+Syv7u4bpomIO905Ury08km+FHL+JbP0EUsfNEUHfIR/e7qBy+7Yt94dzeKvKxTu1Ar/HfCdg/LJIi98xA3b+eRfZ2V0ACHqPlperQ8duyqvBtbt06NMOdpx4S9T1RsgYW1Mo9B/vVt7wocBY0IePfQZ0SPL1N4DYijxz7LyIVa/ root@slave3.hadoop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDD2LLrtBQVUzvKbtUfzjUSq7dnLBTLxLTfrGAEJ6eENdQh0iCEMLdNfgN4AIP8A8CrQWcjag9YylY7fgzcvykbJlbTX8qoGdVqu8sikGrTbBNpkM03ZwfEvf3PId4q5hByANvPdFKK9IDF6uzEkK07o89zJKgK8BcgKU7OIOyStUz5bxLnrarqgQXe0yeQq+8QdQWly2Ojc4wuiEuI2SHaXxUAHcdVoFYNqiBHWMv1PpK2mULsuvmE343nV6iSifRlv9+Atud2F9W0RidmV2PZtlva9rXGrxoJxiWsz4A+vhud3l9TxHZMguBukpPZBJ14of1zT1n9bxlcQYPgvGOP root@slave4.hadoop
(4) Fix the permissions on the authorized keys file
chmod 600 ~/.ssh/authorized_keys
Check the result:
[root@master .ssh]# ll
total 20
-rw-r--r-- 1 root root  400 Jun  6 16:26 authorized_keys
-rw------- 1 root root 1675 Jun  6 16:26 id_rsa
-rw-r--r-- 1 root root  400 Jun  6 16:26 id_rsa.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave1.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave2.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave3.pub
-rw-r--r-- 1 root root  400 Jun  6 16:46 id_rsa.slave4.pub
-rw-r--r-- 1 root root  728 Jun  6 16:27 known_hosts
[root@master .ssh]# chmod 600 authorized_keys
[root@master .ssh]# ll
total 20
-rw------- 1 root root  400 Jun  6 16:26 authorized_keys
-rw------- 1 root root 1675 Jun  6 16:26 id_rsa
-rw-r--r-- 1 root root  400 Jun  6 16:26 id_rsa.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave1.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave2.pub
-rw-r--r-- 1 root root  400 Jun  6 16:45 id_rsa.slave3.pub
-rw-r--r-- 1 root root  400 Jun  6 16:46 id_rsa.slave4.pub
-rw-r--r-- 1 root root  728 Jun  6 16:27 known_hosts
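The 600 mode matters because sshd's StrictModes check refuses keys in a group- or world-readable authorized_keys. A small sketch to verify the mode on each node (stat -c is the GNU coreutils form available on CentOS):

```shell
# Check that authorized_keys carries the 0600 mode sshd expects.
# Sketch: run it inside ~/.ssh on each node.
check_mode() {
  local f="$1"
  local mode
  mode=$(stat -c '%a' "$f")   # GNU stat: octal permission bits
  if [ "$mode" = "600" ]; then
    echo "ok: $f is 600"
  else
    echo "bad mode $mode on $f"
    return 1
  fi
}
```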
(5) Copy the authorized_keys file to each slave machine
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
scp ~/.ssh/authorized_keys root@slave3:~/.ssh/
scp ~/.ssh/authorized_keys root@slave4:~/.ssh/
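Steps (2) through (5) can be condensed into one dry-run sketch. It only prints the commands, with a tag showing which host each one runs on (hostnames are the ones from the /etc/hosts above); review the output, then run the commands by hand or drop the echo wrappers:

```shell
# Dry-run sketch of the whole key exchange: prints the commands instead of
# running them, so nothing happens by accident.
print_key_exchange() {
  local slaves="slave1 slave2 slave3 slave4"
  local s
  # On each slave: ship its public key to master under a unique name.
  for s in $slaves; do
    echo "[$s] scp ~/.ssh/id_rsa.pub root@master:~/.ssh/id_rsa.$s.pub"
  done
  # On master: collect every key, lock the file down, push it back out.
  echo "[master] cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys"
  for s in $slaves; do
    echo "[master] cat ~/.ssh/id_rsa.$s.pub >> ~/.ssh/authorized_keys"
  done
  echo "[master] chmod 600 ~/.ssh/authorized_keys"
  for s in $slaves; do
    echo "[master] scp ~/.ssh/authorized_keys root@$s:~/.ssh/"
  done
}
print_key_exchange
```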
(6) From every machine, log in to all the other machines once, because the server shows a host-key prompt on first login:
[root@slave4 .ssh]# ssh slave2
The authenticity of host 'slave2 (192.168.1.173)' can't be established.
ECDSA key fingerprint is e2:4d:18:4c:61:a0:ca:35:82:82:89:82:21:cc:ca:70.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.1.173' (ECDSA) to the list of known hosts.
Last login: Mon Jun  6 16:49:30 2016 from slave3.hadoop
Type yes; from then on, logins require no password.
3. Install Hadoop
(1) Download Hadoop
Official download page:
http://hadoop.apache.org/releases.html#25+January%2C+2016%3A+Release+2.7.2+%28stable%29+available
PS: Note that the official binary tarball ships 32-bit native libraries. Running it on a 64-bit system produces errors, so you have to recompile from the src package on a 64-bit system; the method is given later. Installing 64-bit Hadoop only adds this compilation step; everything else is the same as the 32-bit install.
Place the downloaded hadoop-2.7.2.tar.gz package in the / directory.
(2) Unpack Hadoop
Extract the downloaded Hadoop archive into /softall/ (the name softall is personal preference; create the directory beforehand):
tar zxvf /hadoop-2.7.2.tar.gz -C /softall/
Under /softall/hadoop-2.7.2, create the data directories tmp, logs, dfs/data, and dfs/name:
mkdir logs
mkdir tmp
mkdir -p dfs/data
mkdir -p dfs/name
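The same layout as a small helper, parameterized on the root directory so it can be tried safely in a throwaway location first (point it at /softall/hadoop-2.7.2 for the real install):

```shell
# Create the four data directories Hadoop will use under a given root.
make_hadoop_dirs() {
  local root="$1"
  mkdir -p "$root/tmp" "$root/logs" "$root/dfs/data" "$root/dfs/name"
}

# Example: try it in a temp directory before touching the real install.
demo=$(mktemp -d)
make_hadoop_dirs "$demo"
ls "$demo"
```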
(3) Configure Hadoop
Edit core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/softall/hadoop-2.7.2/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
    <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec
        </value>
    </property>
</configuration>
Edit mapred-site.xml:

<configuration>
    <!--
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    -->
    <property>
        <name>mapreduce.job.tracker</name>
        <value>hdfs://master:8001</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
    <property>
        <name>mapreduce.map.output.compress</name>
        <value>true</value>
    </property>
    <property>
        <name>mapreduce.map.output.compress.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
</configuration>
PS: If the commented-out block is enabled and the NodeManager is not running, the wordcount example will hang; this is explained later.
Edit hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/softall/hadoop-2.7.2/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/softall/hadoop-2.7.2/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
Edit yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>768</value>
    </property>
</configuration>
Set JAVA_HOME in hadoop-env.sh:

export JAVA_HOME=/softall/jdk1.8.0_51
Edit the slaves file and list the slave nodes:

192.168.1.172
192.168.1.173
192.168.1.174
192.168.1.175
Copy the configured Hadoop directory to every slave:

scp -r /softall/hadoop-2.7.2 root@192.168.1.172:/softall/hadoop-2.7.2
scp -r /softall/hadoop-2.7.2 root@192.168.1.173:/softall/hadoop-2.7.2
scp -r /softall/hadoop-2.7.2 root@192.168.1.174:/softall/hadoop-2.7.2
scp -r /softall/hadoop-2.7.2 root@192.168.1.175:/softall/hadoop-2.7.2
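The four scp commands can also be written as a loop. A dry-run sketch (IPs from the cluster above) that only prints the commands for review; drop the echo to execute:

```shell
# Print the deployment command for every slave IP.
print_deploy_cmds() {
  local ip
  for ip in 192.168.1.172 192.168.1.173 192.168.1.174 192.168.1.175; do
    echo "scp -r /softall/hadoop-2.7.2 root@$ip:/softall/hadoop-2.7.2"
  done
}
print_deploy_cmds
```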
4. Run and Verify Hadoop
(1) Change into the /softall/hadoop-2.7.2 directory;
(2) Format the NameNode:
bin/hadoop namenode -format
When the message "successfully formatted" appears, the format succeeded;
(3) Start Hadoop:
sbin/start-all.sh
(4) Run jps to check whether Hadoop has started
The master node shows:
[root@master hadoop-2.7.2]# jps
6374 NameNode
31478 Jps
6619 SecondaryNameNode
6815 ResourceManager
[root@master hadoop-2.7.2]#
The slave nodes show:
[root@slave1 ~]# jps
27305 Jps
5390 DataNode
[root@slave1 ~]#
This indicates that Hadoop is running successfully.
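The check in step (4) can be scripted by grepping the jps output for the daemons each node type should run. A sketch (feed it the real jps output on each node; the daemon lists below are the ones shown above):

```shell
# Verify that every expected daemon name appears in the given jps output.
check_daemons() {
  local jps_out="$1"; shift
  local d
  for d in "$@"; do
    if ! echo "$jps_out" | grep -qw "$d"; then
      echo "missing: $d"
      return 1
    fi
  done
  echo "all daemons running"
}

# Example with the master output shown above.
sample=$(printf '6374 NameNode\n6619 SecondaryNameNode\n6815 ResourceManager\n')
check_daemons "$sample" NameNode SecondaryNameNode ResourceManager
# prints: all daemons running
```

On a real node, call it as `check_daemons "$(jps)" DataNode NodeManager` (slaves) or with the master daemon list.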
3. Compiling Hadoop on a 64-bit OS
1. Build environment
Operating system: CentOS 7, 64-bit (Internet access required)
Hadoop source version: hadoop-2.7.2-src.tar.gz
2. Build preparation
(1) Install the JDK as described above.
(2) Install the required packages
Note: these tools are best downloaded from their official sites or the links given in this article. I first installed some of them with yum and the build failed, possibly due to version mismatches; in the end I removed them and reinstalled from downloads.
yum -y install svn ncurses-devel gcc*
yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel
Download link for protobuf 2.5.0: http://pan.baidu.com/s/1c1D6cow
Run the following commands in order to install and verify:
tar zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure
make
make install
protoc --version
Download link for Maven: http://maven.apache.org/download.cgi
tar zxvf apache-maven-3.2.3-bin.tar.gz
Configure the environment variables: edit /etc/profile and add the following:
export MAVEN_HOME=/usr/local/program/maven/apache-maven-3.2.3
export PATH=$PATH:$MAVEN_HOME/bin
Run source /etc/profile to make the variables take effect, then check the installation with mvn -version.
Download link for Ant: http://pan.baidu.com/s/1byGZUm
Unpack it and add the environment variables:
export ANT_HOME=/home/joywang/apache-ant-1.9.4
export PATH=$PATH:$ANT_HOME/bin
Run source /etc/profile to make the variables take effect, then check the installation with ant -version.
(3) Compile Hadoop
Unpack the Hadoop source package:
tar zxvf hadoop-2.7.2-src.tar.gz
Enter the hadoop-2.7.2-src directory and run:
mvn clean package -Pdist,native -DskipTests -Dtar
PS: If you need hadoop-snappy, install snappy here and add the corresponding parameters to the build command.
Download link for snappy: http://pan.baidu.com/s/1i49YcgH
yum install svn
yum install autoconf automake libtool cmake
yum install ncurses-devel
yum install openssl-devel
yum install gcc*
Install snappy:
tar -zxvf snappy-1.1.3.tar.gz
cd snappy-1.1.3/
./configure
make
make install
Build with snappy support:
mvn clean package -Pdist,native -DskipTests -Dtar -Dsnappy.lib=/usr/local/lib -Dbundle.snappy
Here -Dsnappy.lib=/usr/local/lib is snappy's default install path; if you changed it, use the actual path.
Then you wait... and wait...
When the build succeeds, the compiled Hadoop tarball is at:
hadoop-2.7.2-src/hadoop-dist/target/hadoop-2.7.2.tar.gz
The compiled 64-bit native libraries are under hadoop-2.7.2/lib/native:
[root@master native]# pwd
/softall/hadoop-2.7.2/lib/native
[root@master native]# ll
total 5720
-rw-r--r-- 1 root root 1439746 Jun  3 16:23 libhadoop.a
-rw-r--r-- 1 root root 1606968 Jun  3 16:23 libhadooppipes.a
lrwxrwxrwx 1 root root      18 Jun  3 16:23 libhadoop.so -> libhadoop.so.1.0.0
-rwxr-xr-x 1 root root  829581 Jun  3 16:23 libhadoop.so.1.0.0
-rw-r--r-- 1 root root  475090 Jun  3 16:23 libhadooputils.a
-rw-r--r-- 1 root root  433884 Jun  3 16:23 libhdfs.a
lrwxrwxrwx 1 root root      16 Jun  3 16:23 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 root root  272298 Jun  3 16:23 libhdfs.so.0.0.0
-rw-r--r-- 1 root root  522304 Jun  3 16:23 libsnappy.a
-rwxr-xr-x 1 root root     955 Jun  3 16:23 libsnappy.la
lrwxrwxrwx 1 root root      18 Jun  3 16:23 libsnappy.so -> libsnappy.so.1.3.0
lrwxrwxrwx 1 root root      18 Jun  3 16:23 libsnappy.so.1 -> libsnappy.so.1.3.0
-rwxr-xr-x 1 root root  258613 Jun  3 16:23 libsnappy.so.1.3.0
[root@master native]#
(4) Install Hadoop
Edit /etc/profile and add the following:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Then install Hadoop following the steps described above.
(5) Verify
Run the following command:
hadoop checknative -a
If you see output like the following, the installation succeeded:
[root@master target]# hadoop checknative -a
16/06/07 16:23:47 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
16/06/07 16:23:47 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /home/softall/hadoop-2.7.2/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /home/softall/hadoop-2.7.2/lib/native/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /lib64/libcrypto.so
[root@master target]#
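For scripted checks, the checknative output is easy to parse: every library line should say true. A sketch that fails if any line reports false (the sample below is the output shown above):

```shell
# Fail if any line of `hadoop checknative -a` output reports false.
all_native_ok() {
  local out="$1"
  if echo "$out" | grep -Eq '^(hadoop|zlib|snappy|lz4|bzip2|openssl): *false'; then
    return 1
  fi
  echo "all native libraries loaded"
}

sample='hadoop:  true /home/softall/hadoop-2.7.2/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /home/softall/hadoop-2.7.2/lib/native/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /lib64/libcrypto.so'
all_native_ok "$sample"
# prints: all native libraries loaded
```

On a real node, call it as `all_native_ok "$(hadoop checknative -a 2>/dev/null)"`.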
4. Possible Problems
1. Running a hadoop command prints:
[root@master target]# hadoop checknative -a
16/06/07 16:19:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
This means the compiled 64-bit native library was not found.
Possible causes:
(1) The HADOOP_COMMON_LIB_NATIVE_DIR and HADOOP_OPTS variables do not match the actual install path; correct the paths;
(2) The Hadoop build failed; find the problem and recompile;
(3) The environment variables have not taken effect; run source /etc/profile and try again. In my case the error came from the wrong ordering of the variables in my script: export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native was written before export HADOOP_HOME=/softall/hadoop-2.7.2. orz~~~
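To make the ordering fix in (3) concrete, here is the corrected /etc/profile fragment (paths from this article's install): a variable must be defined before anything that expands it.

```shell
# Correct ordering: define HADOOP_HOME first, then the variables built on it.
export HADOOP_HOME=/softall/hadoop-2.7.2
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
```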
2. The DataNode fails to start
The most common cause is formatting the NameNode multiple times, which makes the NameNode's clusterID no longer match the clusterID stored on the DataNodes.
The fix is simple: delete the tmp, logs, dfs/data, and dfs/name directories on every node, then re-format the NameNode.
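The cleanup can be sketched as a dry-run loop over all nodes. It only prints the commands (hostnames and paths from this article); review them carefully before running, since this deletes all HDFS data:

```shell
# Print the per-node cleanup commands, then the re-format step.
print_cleanup_cmds() {
  local nodes="master slave1 slave2 slave3 slave4"
  local n
  for n in $nodes; do
    echo "ssh root@$n rm -rf /softall/hadoop-2.7.2/tmp /softall/hadoop-2.7.2/logs /softall/hadoop-2.7.2/dfs/data /softall/hadoop-2.7.2/dfs/name"
  done
  echo "bin/hadoop namenode -format   # on master only, after the cleanup"
}
print_cleanup_cmds
```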
Details: http://www.aboutyun.com/thread-7930-1-1.html
Finally, many thanks to pig2 on aboutyun and the authors below for their help. If anything here infringes, let me know and I will remove it.
References:
Hadoop installation and deployment: http://www.open-open.com/lib/view/open1435761287778.html
Hadoop compilation: http://blog.csdn.net/Joy58061678/article/details/45746847
Installing snappy support: http://blog.csdn.net/wzy0623/article/details/51263041