Install the JDK and set up its environment variables:
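A minimal sketch on Windows (the JDK version and install path below are only assumed examples; a directory without spaces tends to cause fewer problems with Hadoop's Windows scripts):

JAVA_HOME = C:\Java\jdk1.8.0_144
Path      = %Path%;%JAVA_HOME%\bin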
Download hadoop-2.6.5 and unpack it.
Add the HADOOP_HOME environment variable.
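For example (the directory below matches the paths used in hdfs-site.xml later; adjust it to wherever you unpacked the archive):

HADOOP_HOME = E:\0_jly\hadoop-2.6.5
Path        = %Path%;%HADOOP_HOME%\bin;%HADOOP_HOME%\sbin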
Create the namenode and datanode directories that will hold the HDFS data.
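For example, matching the dfs.namenode.name.dir and dfs.datanode.data.dir values configured in hdfs-site.xml below:

mkdir E:\0_jly\hadoop-2.6.5\namenode
mkdir E:\0_jly\hadoop-2.6.5\datanode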
Edit the Hadoop configuration. Four main files are involved, all under %HADOOP_HOME%\etc\hadoop: core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/E:/0_jly/hadoop-2.6.5/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/E:/0_jly/hadoop-2.6.5/datanode</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml (note the shuffle class property key has to match the aux-service name mapreduce_shuffle):
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
  </property>
</configuration>
Format the namenode.
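Run this once from a command prompt before the first start (re-running it later wipes the HDFS metadata):

hdfs namenode -format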
Start or stop Hadoop.
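On Windows the start/stop scripts under %HADOOP_HOME%\sbin are .cmd files; a typical sequence is:

start-dfs.cmd    (starts NameNode and DataNode)
start-yarn.cmd   (starts ResourceManager and NodeManager)
stop-yarn.cmd
stop-dfs.cmd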
View MapReduce jobs:
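The YARN ResourceManager web UI listens on port 8088 by default:

http://localhost:8088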
View the HDFS file system:
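The NameNode web UI in Hadoop 2.x listens on port 50070 by default; the file system can be browsed from Utilities > Browse the file system:

http://localhost:50070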
Test the wordcount example that ships with Hadoop:
hdfs dfs -mkdir /input
If you write the target path without the leading /, it is not created under the root directory;
it ends up under /user/Administrator/ (the current user's HDFS home directory) instead.
hdfs dfs -put /E:/BaiduNetdiskDownload/1.txt /input
The figure below shows the file you uploaded.
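With the input in place, the example can be run roughly as follows (the examples jar ships with the 2.6.5 distribution; /output must not exist beforehand, and part-r-00000 is the usual name of the reducer output file):

hadoop jar %HADOOP_HOME%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.6.5.jar wordcount /input /output
hdfs dfs -cat /output/part-r-00000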
Check the processes you have started:
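jps (from the JDK) should list the HDFS and YARN daemons started above, roughly like this (the process IDs are only illustrative):

jps

4608 NameNode
5712 DataNode
6324 ResourceManager
7100 NodeManager
7960 Jps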