Set up the Hadoop cluster: VirtualBox + Ubuntu 14.04 + Hadoop 2.6.0
With the cluster in place, install Eclipse on the Mac and connect it to the Hadoop cluster.
Add the Master's IP to the Mac's hosts file:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost
192.168.56.101  Master          # add the Master's IP
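To confirm that the new name resolves, a quick check from the Mac's terminal (three pings is an arbitrary choice):

ping -c 3 Master    # should resolve to 192.168.56.101 and receive replies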
On the Master, start the cluster.
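A minimal sketch of the start-up commands using the stock sbin scripts; the install path /usr/local/hadoop is an assumption, substitute your own:

cd /usr/local/hadoop     # assumed install path
sbin/start-dfs.sh        # starts the NameNode, SecondaryNameNode and DataNodes
sbin/start-yarn.sh       # starts the ResourceManager and NodeManagers
jps                      # verify that the daemons are running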
On the Mac, open http://master:50070/
If the page loads and shows the cluster's information, everything is working.
Eclipse IDE for Java Developers
http://www.eclipse.org/downloads/package...
Download hadoop2x-eclipse-plugin from GitHub (mirror: http://pan.baidu.com/s/1i4ikIoP)
In Applications, find Eclipse, right-click it, and choose Show Package Contents.
Copy the plugin into the plugins directory, then restart Eclipse.
Unpack a Hadoop distribution to any directory (no configuration needed), then point Eclipse at that directory in the plugin's preferences.
Click the plus sign in the upper-right corner.
Add the Map/Reduce view.
Select the Map/Reduce Locations tab, then right-click and choose New Hadoop location.
You need to set Location name, Host, the Port under DFS Master, and User name (Master resolves through the hosts entry added on the Mac earlier). When done, click Finish.
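For reference, one plausible set of values; everything below is an example, and the port must match fs.defaultFS in your cluster's core-site.xml:

Location name : MyCluster   (any label)
Host          : Master      (resolved via the Mac's hosts file)
DFS Master    : Port 9000   (must match fs.defaultFS, e.g. hdfs://Master:9000)
User name     : hadoop      (the user the cluster runs as)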
Check whether HDFS can now be browsed directly (the DFS Locations tree should list the cluster's files).
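The same check can be done from the command line on the Master, and this is also a convenient moment to create some input for the job below; the paths are assumptions:

hdfs dfs -ls /                                       # list the HDFS root
hdfs dfs -mkdir -p /user/hadoop/input                # assumed input directory for WordCount
hdfs dfs -put etc/hadoop/*.xml /user/hadoop/input    # upload a few sample files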
File -> New -> Other -> Map/Reduce Project
Enter the project name WordCount, then click Finish.
Create a class with the package name org.apache.hadoop.examples and the class name WordCount.
Copy the code below into WordCount.java:
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
Copy all of the configuration files you modified, plus log4j.properties, into the src directory.
Here I copied slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
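If you have no log4j.properties at hand, a minimal sketch that logs INFO and above to the console is enough to see the job's progress (the conversion pattern is an arbitrary choice):

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2}: %m%n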
Hover over WordCount.java, right-click, Run As, Java Application.
At this point the program will not run correctly. Right-click again, Run As, and choose Run Configurations.
Fill in the input and output paths (separated by a space).
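For example, under Arguments -> Program arguments; the paths are assumptions and must match your HDFS, and the output directory must not exist yet:

hdfs://Master:9000/user/hadoop/input hdfs://Master:9000/user/hadoop/output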
After configuring, click Run. This time a Permission denied error appears:
the Mac user has no permission to access HDFS.
# Suppose the Mac user name is hadoop
groupadd supergroup               # create the supergroup group
useradd -g supergroup hadoop      # add a hadoop user belonging to supergroup
# Loosen the group permissions on the cluster's HDFS files so that
# every user in the supergroup group has read and write access
hadoop fs -chmod 777 /
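A quick check afterwards from the Master, assuming hadoop is on the PATH:

hadoop fs -ls /    # the entries should now show permissions rwxrwxrwx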
To read the Hadoop source inside Eclipse, download the source package: http://apache.claz.org/hadoop/common/had...
In the search box at the upper-right corner, search for Open Type.
Type NameNode and select it; you will find that the source cannot be viewed.
Click Attach Source -> External location -> External Folder, and point it at the unpacked source directory.