1. Preparing Java
First, create the directories with the following structure:
weim@weim:~/myopt$ ls
ubuntu1  ubuntu2  ubuntu3
Then extract the downloaded JDK (version 8u172) and Hadoop (version hadoop-2.9.1) into each of the three directories:
weim@weim:~/myopt$ ls ubuntu1
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu2
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu3
hadoop  jdk
2. Preparing the Three Ubuntu Machines
Here we use Docker to create the three machines, based on the ubuntu:16.04 image:
weim@weim:~/myopt$ docker image ls
REPOSITORY   TAG     IMAGE ID       CREATED        SIZE
ubuntu       16.04   f975c5035748   2 months ago   112MB
Start three Ubuntu containers, mounting the local directories /myopt/ubuntu1, /myopt/ubuntu2, and /myopt/ubuntu3 to /home/software inside the respective containers.
ubuntu1
weim@weim:~/myopt$ docker run --hostname ubuntu1 --name ubuntu1 -v /home/weim/myopt/ubuntu1:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu1:/# ls /home/software/
hadoop  jdk
ubuntu2
weim@weim:~/myopt$ docker run --hostname ubuntu2 --name ubuntu2 -v /home/weim/myopt/ubuntu2:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu2:/# ls /home/software/
hadoop  jdk
ubuntu3
weim@weim:~/myopt$ docker run --hostname ubuntu3 --name ubuntu3 -v /home/weim/myopt/ubuntu3:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu3:/# ls /home/software/
hadoop  jdk
root@ubuntu3:/#
With that, the three basic machines are set up.
Check the machines' information:
weim@weim:~$ docker ps -a
CONTAINER ID   IMAGE          COMMAND   CREATED              STATUS              PORTS   NAMES
b4c6de2a4326   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu2
53d1f6389710   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu3
0f210a01d47f   ubuntu:16.04   "bash"    About a minute ago   Up About a minute           ubuntu1
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu1
172.17.0.2
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu2
172.17.0.4
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu3
172.17.0.3
These are the IP addresses of the three machines; they all sit on the same local network.
3. Installing the Necessary Software
Install the necessary software on all three machines. First run apt-get update to refresh the Ubuntu package index.
Then install vim and openssh-server.
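A minimal sketch of that step, to be run inside each of the three containers (they run as root, so no sudo is needed):

```shell
# Refresh the package index, then install the editor and the SSH server.
apt-get update
apt-get install -y vim openssh-server
```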
4. Environment Configuration
a. First, configure the Java environment by appending the Java path settings to /etc/profile:
root@ubuntu1:/home/software/jdk# vim /etc/profile
---------------------------------------------------------------
Append the following to the end of the profile file:

#set jdk environment
export JAVA_HOME=/home/software/jdk
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
---------------------------------------------------------------
root@ubuntu1:/home/software/jdk# source /etc/profile
root@ubuntu1:/home/software/jdk# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
root@ubuntu1:/home/software/jdk#
b. Set up passwordless SSH access
root@ubuntu1:/home/software/jdk# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:hSMrNTp6/1d7L/QZGKdTCPivDJspbY2tcyjke2qjpBI root@ubuntu1
The key's randomart image is:
+---[RSA 2048]----+
|        .        |
|       o .       |
|    + o o . .    |
|   o + o . o o   |
|    + . S . *    |
| E . o . . .=..  |
| o ..o . @..o..o |
| . .o. * @.*. o..|
| .. .++Xo+ .   o.|
+----[SHA256]-----+
root@ubuntu1:/home/software/jdk# cd ~/.ssh
root@ubuntu1:~/.ssh# ls
id_rsa  id_rsa.pub
root@ubuntu1:~/.ssh# cat id_rsa.pub >> authorized_keys
root@ubuntu1:~/.ssh# chmod 600 authorized_keys
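The interactive session above can also be scripted non-interactively; a sketch (-N "" requests an empty passphrase, matching what the prompts accepted):

```shell
# Generate an RSA key with no passphrase, then authorize it for local login.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```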
Once configured, verify passwordless login to the local machine with ssh localhost. First make sure the SSH service is running; if it is not, start it with /etc/init.d/ssh start.
root@ubuntu1:/home/software# /etc/init.d/ssh start
 * Starting OpenBSD Secure Shell server sshd        [ OK ]
root@ubuntu1:/home/software# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-41-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

root@ubuntu1:~# exit
logout
Connection to localhost closed.
root@ubuntu1:/home/software#
Next, copy the authorized_keys file into the ubuntu2 and ubuntu3 containers. (I don't know ubuntu2's root password, so I can't simply scp it over.) A workaround gets the job done.
First, go into ~/.ssh and copy authorized_keys to /home/software:
root@ubuntu1:~/.ssh# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
root@ubuntu1:~/.ssh# cp authorized_keys /home/software/
root@ubuntu1:~/.ssh# ls /home/software/
authorized_keys  hadoop  jdk
root@ubuntu1:~/.ssh#
Then, back on the host system, the file just copied shows up under ~/myopt/ubuntu1; copy it into ubuntu2 and ubuntu3:
weim@weim:~/myopt/ubuntu1$ ls
authorized_keys  hadoop  jdk
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu2/
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu3/
Then, in the ubuntu2 and ubuntu3 containers, copy the file into ~/.ssh (ubuntu2 shown; ubuntu3 is identical):
root@ubuntu2:/home/software# cp authorized_keys ~/.ssh
root@ubuntu2:/home/software# ls ~/.ssh
authorized_keys  id_rsa  id_rsa.pub
root@ubuntu2:/home/software#
Verify that ubuntu1 can reach ubuntu2 and ubuntu3 without a password (using the IPs found with docker inspect earlier):
root@ubuntu1:~/.ssh# ssh root@172.17.0.3
root@ubuntu1:~/.ssh# ssh root@172.17.0.4
5. Hadoop Configuration
We'll use ubuntu1 as the example; ubuntu2 and ubuntu3 are configured the same way.
First, create Hadoop's data directories.
root@ubuntu1:/home/software/hadoop# mkdir data
root@ubuntu1:/home/software/hadoop# cd data/
root@ubuntu1:/home/software/hadoop/data# mkdir tmp
root@ubuntu1:/home/software/hadoop/data# mkdir data
root@ubuntu1:/home/software/hadoop/data# mkdir checkpoint
root@ubuntu1:/home/software/hadoop/data# mkdir name
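The five mkdir calls above can be collapsed into a single mkdir -p; a sketch, where HADOOP_DATA is a convenience variable defaulting to this tutorial's path:

```shell
# Base directory for Hadoop data; the tutorial uses /home/software/hadoop/data.
HADOOP_DATA="${HADOOP_DATA:-/home/software/hadoop/data}"
# -p creates the base directory and all four subdirectories in one command.
mkdir -p "$HADOOP_DATA"/tmp "$HADOOP_DATA"/data "$HADOOP_DATA"/checkpoint "$HADOOP_DATA"/name
ls "$HADOOP_DATA"
```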
Go into the /home/software/hadoop/etc/hadoop directory.
Edit hadoop-env.sh and set the Java path:
export JAVA_HOME=/home/software/jdk
Configure core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.17.0.2:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/software/hadoop/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>65536</value>
    </property>
</configuration>
Configure hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/software/hadoop/data/name</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>67108864</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/software/hadoop/data/data</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/home/software/hadoop/data/checkpoint</value>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>10</value>
    </property>
    <property>
        <name>dfs.datanode.handler.count</name>
        <value>10</value>
    </property>
    <!--<property>
        <name>dfs.namenode.rpc-address</name>
        <value>172.17.0.2:9000</value>
    </property>-->
</configuration>
Configure mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Configure yarn-site.xml:
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>172.17.0.2</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Configure the slaves file, one worker address per line:
172.17.0.2
172.17.0.3
172.17.0.4
6. Startup
On ubuntu1, go into /home/software/hadoop/bin and run hdfs namenode -format to initialize HDFS:
root@ubuntu1:/home/software/hadoop/bin# ./hdfs namenode -format
On ubuntu1, go into the /home/software/hadoop/sbin directory and run start-all.sh:
root@ubuntu1:/home/software/hadoop/sbin# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [ubuntu1]
The authenticity of host 'ubuntu1 (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu1: Warning: Permanently added 'ubuntu1,172.17.0.2' (ECDSA) to the list of known hosts.
ubuntu1: starting namenode, logging to /home/software/hadoop/logs/hadoop-root-namenode-ubuntu1.out
172.17.0.2: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu1.out
172.17.0.4: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu2.out
172.17.0.3: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu3.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/software/hadoop/logs/hadoop-root-secondarynamenode-ubuntu1.out
starting yarn daemons
starting resourcemanager, logging to /home/software/hadoop/logs/yarn--resourcemanager-ubuntu1.out
172.17.0.2: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu1.out
172.17.0.3: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu3.out
172.17.0.4: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu2.out
Check what is running on each node with jps:
ubuntu1
root@ubuntu1:/home/software/hadoop/sbin# jps
3827 SecondaryNameNode
3686 DataNode
4007 ResourceManager
4108 NodeManager
4158 Jps
ubuntu2
root@ubuntu2:/home/software/hadoop/sbin# jps
3586 Jps
3477 DataNode
3545 NodeManager
ubuntu3
root@ubuntu3:/home/software/hadoop/sbin# jps
3472 DataNode
3540 NodeManager
3582 Jps
Now open http://172.17.0.2:50070 (the HDFS NameNode web UI) and http://172.17.0.2:8088 (the YARN ResourceManager web UI) to see the cluster information.