Setting up a Hadoop pseudo-distributed cluster on CentOS 6.5

Setting up the pseudo-distributed cluster:
  1. Passwordless SSH login
    a. Create the .ssh directory under the home directory and set its permissions to 700
      $>mkdir ~/.ssh
      $>chmod 700 ~/.ssh
    b. Generate the key pair
      $>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    c. Append the public key to the authorized keys file and set the permissions of authorized_keys to 600
      $>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
      $>chmod 600 ~/.ssh/authorized_keys
    d. Verify that passwordless login to the local machine works
      $>ssh localhost
  2. Install the JDK
    a. Preparation: put the JDK **.tar.gz archive in the ~/soft directory.
      Make sure no JDK is already installed on the system (check with rpm -qa | grep jdk);
      if one is found, uninstall it first (rpm -e --nodeps jdk**).
    b. Unpack the JDK
      $>tar -zxf jdk-8u171-linux-x64.tar.gz
    c. Create a symbolic link for the JDK
      $>ln -s jdk1.8.0_171/ jdk
    d. Configure environment variables in ~/.bash_profile
      $>vim ~/.bash_profile
      Add:
      # jdk install
      export JAVA_HOME=/home/hadoop/soft/jdk
      export PATH=$PATH:$JAVA_HOME/bin
    e. Make the configuration take effect
      $>source ~/.bash_profile
    f. Verify that Java is configured correctly
      $>java -version
      $>javac -version
  3. Install Hadoop
    a. Preparation: put the Hadoop tar.gz archive in the ~/soft directory
    b. Unpack Hadoop and create a symbolic link
      $>tar -zxf hadoop-2.7.3.tar.gz
      $>ln -s hadoop-2.7.3/ hadoop
    c. Configure the Hadoop environment variables
      $>vim ~/.bash_profile
      Add:
      # hadoop install
      export HADOOP_HOME=/home/hadoop/soft/hadoop
      export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    d. Make the environment variables take effect
      $ source ~/.bash_profile
    e. Configure the files under hadoop/etc/hadoop
      1) hadoop-env.sh
        $ vim hadoop/etc/hadoop/hadoop-env.sh
        Comment out the existing JAVA_HOME line and export JAVA_HOME explicitly:
        #export JAVA_HOME=${JAVA_HOME}
        export JAVA_HOME=/home/hadoop/soft/jdk
      2) core-site.xml
        $ vim hadoop/etc/hadoop/core-site.xml
        Add:
        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
          <description>The name of the default file system.  A URI whose
          scheme and authority determine the FileSystem implementation.  The
          uri's scheme determines the config property (fs.SCHEME.impl) naming
          the FileSystem implementation class.  The uri's authority is used to
          determine the host, port, etc. for a filesystem.</description>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/hadoop/tmp/hadoop</value>
          <description>A base for other temporary directories.</description>
        </property>
      3) hdfs-site.xml
        $ vim hadoop/etc/hadoop/hdfs-site.xml
        Add:
        <property>
          <name>dfs.replication</name>
          <value>1</value>
          <description>Default block replication.
          The actual number of replications can be specified when the file is created.
          The default is used if replication is not specified in create time.
          </description>
        </property>
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:///home/hadoop/tmp/hadoop/dfs/name,file:///home/hadoop/tmp/hadoop/dfs/name1</value>
          <description>Determines where on the local filesystem the DFS name node
          should store the name table(fsimage).  If this is a comma-delimited list
          of directories then the name table is replicated in all of the
          directories, for redundancy.</description>
        </property>
        <property>
          <name>dfs.datanode.data.dir</name>
          <value>file:///home/hadoop/tmp/hadoop/dfs/data,file:///home/hadoop/tmp/hadoop/dfs/data1</value>
          <description>Determines where on the local filesystem an DFS data node
          should store its blocks.  If this is a comma-delimited
          list of directories, then data will be stored in all named
          directories, typically on different devices. The directories should be tagged
          with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS
          storage policies. The default storage type will be DISK if the directory does
          not have a storage type tagged explicitly. Directories that do not exist will
          be created if local filesystem permission allows.
          </description>
        </property>
      4) mapred-site.xml
        $ cp hadoop/etc/hadoop/mapred-site.xml.template hadoop/etc/hadoop/mapred-site.xml
        $ vim hadoop/etc/hadoop/mapred-site.xml
        Add:
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
      5) yarn-site.xml
        $ vim hadoop/etc/hadoop/yarn-site.xml
        Add:
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
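    Before formatting the NameNode in the next step, it can be worth confirming that Hadoop is actually picking up the values configured above. A minimal sanity check, assuming the PATH changes from step 3.c are already in effect:
      $>hdfs getconf -confKey fs.defaultFS        # should print hdfs://localhost:9000
      $>hdfs getconf -confKey dfs.replication     # should print 1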
    f. Format the NameNode (only needed the first time; subsequent starts of Hadoop do not require it)
      $>hdfs namenode -format
    g. Verify.
      Start Hadoop:
      $>start-dfs.sh
      Check the running processes:
      $>jps
      If the output contains NameNode, DataNode and SecondaryNameNode, the Hadoop pseudo-distributed setup was successful.
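      Once jps shows the three daemons, a minimal smoke test is to write a file into HDFS and list it back; if you also start the YARN daemons (configured in mapred-site.xml and yarn-site.xml above but not started by start-dfs.sh), you can run one of the MapReduce examples bundled with the release. A sketch, assuming the paths used earlier in this guide:
      $>hdfs dfs -mkdir -p /user/hadoop
      $>hdfs dfs -put ~/.bash_profile /user/hadoop/
      $>hdfs dfs -ls /user/hadoop
      $>start-yarn.sh
      $>jps        # should now also show ResourceManager and NodeManager
      $>hadoop jar ~/soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10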