Importing the Hadoop source code is no easy task: all kinds of errors, red markers everywhere. It gives you a real taste of the satisfaction of turning a sea of red back to green!
Hadoop source: I am using hadoop-2.7.1-src.tar.gz here.
Download: https://archive.apache.org/dist/hadoop/common/
JDK: a 1.7 JDK is recommended for Hadoop 2.7. I am using jdk-7u80-windows-x64.exe.
Eclipse: Oxygen.2 Release (4.7.2)
Maven: apache-maven-3.3.1.zip
Download: http://mirrors.shu.edu.cn/apache/maven/
Archived releases: https://archive.apache.org/dist/maven/binaries/
libprotoc: protoc-2.5.0-win32.zip
Current releases page: https://github.com/protocolbuffers/protobuf/releases
2.5.0 download: https://github.com/protocolbuffers/protobuf/releases?after=v3.0.0-alpha-4.1
I installed jdk-7u80-windows-x64.exe; the installation steps are omitted here.
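To confirm the JDK is installed and visible on PATH, you can run the following quick check (the version reported should match the JDK you installed, 1.7.0_80 here):
:: both commands should report a 1.7 JDK
java -version
javac -version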
Eclipse just needs to be unpacked; no installation is required.
For Maven installation, see my separate post on installing Maven (Maven介绍及安装).
It is best to point Maven's remote repository at a mirror inside China, which makes downloads noticeably faster. A domestic mirror is given below.
Aliyun:
<mirror>
  <id>nexus-aliyun</id>
  <mirrorOf>*</mirrorOf>
  <name>Nexus aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
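This <mirror> element belongs inside the <mirrors> section of Maven's conf\settings.xml. A quick way to confirm Maven itself is set up (a minimal check, assuming Maven's bin directory has been added to Path):
:: prints the Maven version, the Maven home (whose conf\settings.xml is read), and the JDK in use
mvn -version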
There are many official protobuf releases nowadays, but Hadoop 2.7.1 can only be built with protoc 2.5.0.
This release is quite simple: after unpacking there are only two files, the executable protoc.exe and a readme.txt, as shown below:
There are two ways to make it available:
First: unpack it into a directory of your choice and add that path to the Path environment variable. Test the installation with the following command:
protoc --version
Output like the figure below means the installation succeeded:
Second: simply drop the executable protoc.exe into Maven's bin directory.
The executable has no extra dependencies; it only needs to be somewhere the system can find and run it.
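Whichever option you choose, you can confirm that Windows actually resolves the executable (a small check using the standard where command):
:: shows which protoc.exe will be run, then prints its version (it should report 2.5.0)
where protoc
protoc --version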
With the software above installed, you can start importing the source code.
Unpack the Hadoop source into the directory you have planned for it, preferably the root of a drive so the paths stay short.
Go into the hadoop-maven-plugins folder inside the Hadoop source tree, open a cmd window there, and run the following command:
mvn install
This step downloads a lot of artifacts and may fail because some of them cannot be fetched; just re-run the command until you see the screen below, which means the step has succeeded.
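A minimal sketch of this step (the path F:\bigdata\hadoop-2.7.1-src is only a placeholder for wherever you unpacked the source; substitute your own):
:: build and install the Hadoop Maven plugins into the local repository
cd /d F:\bigdata\hadoop-2.7.1-src\hadoop-maven-plugins
mvn install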
The error shown below is exactly why libprotoc 2.5.0 is required: I tried version 3.3.0 and it did not work, and the build explicitly states that 2.5.0 is needed.
Open a cmd window in the root of the Hadoop source tree and run the following command:
mvn eclipse:eclipse -DskipTests
The screen below means it succeeded; if it does not, simply re-run the command above.
To make things easier to manage, create a folder in Eclipse to hold the Hadoop-related projects. The steps are shown in the figure below:
Then click File -> Import, as shown below:
In the dialog that pops up, find Existing Maven Projects under Maven and click Next, as shown below:
On the screen shown below, follow the figure. When choosing the path, select the root of the source tree; otherwise, if other projects live nearby, ticking the right checkboxes during import becomes a hassle.
That hassle is what the next figure shows: if you selected the Hadoop source root, you can simply click Select All and then Finish.
After the import, my workspace looked like this:
A sea of red everywhere. Painful to look at, but perfectly fine for reading the source.
From the output of the command above that generated the Eclipse project files, the ordering of the Hadoop projects should be as follows:
[INFO] Apache Hadoop Main
[INFO] Apache Hadoop Project POM
[INFO] Apache Hadoop Annotations
[INFO] Apache Hadoop Project Dist POM
[INFO] Apache Hadoop Assemblies
[INFO] Apache Hadoop Maven Plugins
[INFO] Apache Hadoop MiniKDC
[INFO] Apache Hadoop Auth
[INFO] Apache Hadoop Auth Examples
[INFO] Apache Hadoop Common
[INFO] Apache Hadoop NFS
[INFO] Apache Hadoop KMS
[INFO] Apache Hadoop Common Project
[INFO] Apache Hadoop HDFS
[INFO] Apache Hadoop HttpFS
[INFO] Apache Hadoop HDFS BookKeeper Journal
[INFO] Apache Hadoop HDFS-NFS
[INFO] Apache Hadoop HDFS Project
[INFO] hadoop-yarn
[INFO] hadoop-yarn-api
[INFO] hadoop-yarn-common
[INFO] hadoop-yarn-server
[INFO] hadoop-yarn-server-common
[INFO] hadoop-yarn-server-nodemanager
[INFO] hadoop-yarn-server-web-proxy
[INFO] hadoop-yarn-server-applicationhistoryservice
[INFO] hadoop-yarn-server-resourcemanager
[INFO] hadoop-yarn-server-tests
[INFO] hadoop-yarn-client
[INFO] hadoop-yarn-server-sharedcachemanager
[INFO] hadoop-yarn-applications
[INFO] hadoop-yarn-applications-distributedshell
[INFO] hadoop-yarn-applications-unmanaged-am-launcher
[INFO] hadoop-yarn-site
[INFO] hadoop-yarn-registry
[INFO] hadoop-yarn-project
[INFO] hadoop-mapreduce-client
[INFO] hadoop-mapreduce-client-core
[INFO] hadoop-mapreduce-client-common
[INFO] hadoop-mapreduce-client-shuffle
[INFO] hadoop-mapreduce-client-app
[INFO] hadoop-mapreduce-client-hs
[INFO] hadoop-mapreduce-client-jobclient
[INFO] hadoop-mapreduce-client-hs-plugins
[INFO] Apache Hadoop MapReduce Examples
[INFO] hadoop-mapreduce
[INFO] Apache Hadoop MapReduce Streaming
[INFO] Apache Hadoop Distributed Copy
[INFO] Apache Hadoop Archives
[INFO] Apache Hadoop Rumen
[INFO] Apache Hadoop Gridmix
[INFO] Apache Hadoop Data Join
[INFO] Apache Hadoop Ant Tasks
[INFO] Apache Hadoop Extras
[INFO] Apache Hadoop Pipes
[INFO] Apache Hadoop OpenStack support
[INFO] Apache Hadoop Amazon Web Services support
[INFO] Apache Hadoop Azure support
[INFO] Apache Hadoop Client
[INFO] Apache Hadoop Mini-Cluster
[INFO] Apache Hadoop Scheduler Load Simulator
[INFO] Apache Hadoop Tools Dist
[INFO] Apache Hadoop Tools
[INFO] Apache Hadoop Distribution
The following two steps have to be carried out for every project.
For every project, re-assign the parent inheritance in pom.xml so that all projects share the same Group Id and version.
As shown below: open the pom file and simply re-select the parent.
In Java Build Path -> Libraries, change the JRE and tools.jar entries to your own version; mine is 1.7.0_80, as shown below:
After fixing the Java Build Path, change the Java Compiler to the matching level; mine is again 1.7.
As shown below: move the XML declaration (<?xml ...?>) of the offending file up to the very first line.
For details, see my post on the XML error where the processing instruction is not allowed to match (xml文件错误之指令不容许匹配).
The hadoop-common project has one more error: the .avsc files are Avro schema files, and the corresponding .java files have to be generated as follows.
Jar: avro-tools-1.7.4.jar
Download: https://archive.apache.org/dist/avro/ (avro-tools is also available from Maven Central)
Go into "hadoop-common-project\hadoop-common\src\test\avro" under the source root, open cmd there, and run the following command:
java -jar <directory containing the jar>\avro-tools-1.7.4.jar compile schema avroRecord.avsc ..\java
:: for example, I put the jar into F:\bigdata\hadoop, so the command becomes:
java -jar F:\bigdata\hadoop\avro-tools-1.7.4.jar compile schema avroRecord.avsc ..\java
Right-click the hadoop-common project in Eclipse and refresh it. If that does not clear the error, refresh the package containing the offending source file directly; if that still fails, restart Eclipse.
Go into "hadoop-common-project\hadoop-common\src\test\proto" under the source root, open a cmd window, and run the following command:
protoc --java_out=..\java *.proto
The protoc here is the protoc program downloaded above.
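If your shell does not expand the *.proto wildcard, the files can also be fed to protoc one at a time (a sketch for an interactive cmd session; inside a .bat file the % signs must be doubled):
for %f in (*.proto) do protoc --java_out=..\java %f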
Right-click hadoop-common in Eclipse and refresh. If that does not work, refresh the package containing the offending source file directly; if that still fails, restart Eclipse.
In Eclipse, right-click the hadoop-streaming project, choose "Properties", select Java Build Path on the left, open the Source tab on the right, and delete the path that is marked as erroneous.
Click the "Link Source" button and choose "<your source root>/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf" as the linked folder; the link name can stay as shown (or be anything you like);
add capacity-scheduler.xml to the inclusion patterns and **/*.java to the exclusion patterns, the same settings as the broken entry; when done, delete the broken entry and refresh the hadoop-streaming project.
After the fixes above there are still plenty of errors, and they are visible right in pom.xml, as shown below:
The same errors also show up in Maven's Lifecycle Mapping, at the location shown below:
The screenshot above was taken after I had already fixed everything, which is why it is all green.
In Eclipse open Window -> Preferences and find Maven -> Lifecycle Mappings, as shown below:
The path in the red box above does not actually contain a lifecycle-mapping-metadata.xml file; the file lives inside a jar in the Eclipse installation directory, at the following location:
eclipse\plugins\org.eclipse.m2e.lifecyclemapping.defaults_xxxxxxxxxxxx.jar, as shown below:
Extract the file from that jar, place it at the path shown under "Change mapping file location", and then add the missing plugins; the required format can be copied from the entries already in the file.
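A minimal sketch of the extraction using the JDK's jar tool (this assumes the file sits at the top level of the jar; any zip tool works just as well, and the Eclipse path and the jar's version suffix must be adjusted to your installation):
:: pull lifecycle-mapping-metadata.xml out of the m2e defaults jar
cd /d <your Eclipse directory>\plugins
jar xf org.eclipse.m2e.lifecyclemapping.defaults_xxxxxxxxxxxx.jar lifecycle-mapping-metadata.xml
:: then copy the extracted file to the path shown under "Change mapping file location"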
Below is the file I ended up with. Its content is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<lifecycleMappingMetadata>
  <lifecycleMappings>
    <lifecycleMapping>
      <packagingType>war</packagingType>
      <lifecycleMappingId>org.eclipse.m2e.jdt.JarLifecycleMapping</lifecycleMappingId>
    </lifecycleMapping>
  </lifecycleMappings>
  <pluginExecutions>
    <!-- standard maven plugins -->
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <goals><goal>resources</goal><goal>testResources</goal><goal>copy-resources</goal></goals>
        <versionRange>[2.4,)</versionRange>
      </pluginExecutionFilter>
      <action><execute><runOnIncremental>true</runOnIncremental></execute></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <goals><goal>resources</goal><goal>testResources</goal><goal>copy-resources</goal></goals>
        <versionRange>[0.0.1,2.4)</versionRange>
      </pluginExecutionFilter>
      <action><error><message>maven-resources-plugin prior to 2.4 is not supported by m2e. Use maven-resources-plugin version 2.4 or later.</message></error></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-enforcer-plugin</artifactId>
        <goals><goal>enforce</goal></goals>
        <versionRange>[1.0-alpha-1,)</versionRange>
      </pluginExecutionFilter>
      <action><ignore><message>maven-enforcer-plugin (goal "enforce") is ignored by m2e.</message></ignore></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-invoker-plugin</artifactId>
        <goals><goal>install</goal></goals>
        <versionRange>[1.6-SONATYPE-r940877,)</versionRange>
      </pluginExecutionFilter>
      <action><ignore><message>maven-invoker-plugin (goal "install") is ignored by m2e.</message></ignore></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-remote-resources-plugin</artifactId>
        <versionRange>[1.0,)</versionRange>
        <goals><goal>process</goal></goals>
      </pluginExecutionFilter>
      <action><ignore><message>maven-remote-resources-plugin (goal "process") is ignored by m2e.</message></ignore></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <versionRange>[0,)</versionRange>
        <goals><goal>configure-workspace</goal><goal>eclipse</goal><goal>clean</goal><goal>to-maven</goal><goal>install-plugins</goal><goal>make-artifacts</goal><goal>myeclipse</goal><goal>myeclipse-clean</goal><goal>rad</goal><goal>rad-clean</goal></goals>
      </pluginExecutionFilter>
      <action><error><message>maven-eclipse-plugin is not compatible with m2e</message></error></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-source-plugin</artifactId>
        <versionRange>[2.0,)</versionRange>
        <goals>
          <goal>jar-no-fork</goal>
          <goal>test-jar-no-fork</goal>
          <!-- theoretically, the following goals should not be bound to lifecycle, but ignore them just in case -->
          <goal>jar</goal>
          <goal>aggregate</goal>
          <goal>test-jar</goal>
        </goals>
      </pluginExecutionFilter>
      <action><ignore/></action>
    </pluginExecution>
    <!--our add start******************************************************-->
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-antrun-plugin</artifactId>
        <versionRange>[1.7,)</versionRange>
        <goals><goal>run</goal><goal>create-testdirs</goal><goal>validate</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <versionRange>[3.1,)</versionRange>
        <goals><goal>testCompile</goal><goal>compile</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <versionRange>[2.5,)</versionRange>
        <goals><goal>test-compile</goal><goal>test-jar</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-maven-plugins</artifactId>
        <versionRange>[2.7.1,)</versionRange>
        <goals><goal>protoc</goal><goal>compile-protoc</goal><goal>generate-sources</goal><goal>compile-test-protoc</goal><goal>generate-test-sources</goal><goal>version-info</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-plugin-plugin</artifactId>
        <versionRange>[3.4,)</versionRange>
        <goals><goal>descriptor</goal><goal>default-descriptor</goal><goal>process-classes</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <versionRange>[1.7.4,)</versionRange>
        <goals><goal>schema</goal><goal>generate-avro-test-sources</goal><goal>generate-test-sources</goal><goal>protocol</goal><goal>default</goal><goal>generate-sources</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <versionRange>[1.3.1,)</versionRange>
        <goals><goal>exec</goal><goal>compile-ms-native-dll</goal><goal>compile-ms-winutils</goal><goal>compile</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>native-maven-plugin</artifactId>
        <versionRange>[1.0-alpha-8,)</versionRange>
        <goals><goal>javah</goal><goal>compile</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <!--our add end****************************************-->
    <!-- commonly used codehaus plugins -->
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>animal-sniffer-maven-plugin</artifactId>
        <versionRange>[1.0,)</versionRange>
        <goals><goal>check</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
    <pluginExecution>
      <pluginExecutionFilter>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>buildnumber-maven-plugin</artifactId>
        <versionRange>[1.0-beta-1,)</versionRange>
        <goals><goal>create</goal></goals>
      </pluginExecutionFilter>
      <action><ignore /></action>
    </pluginExecution>
  </pluginExecutions>
</lifecycleMappingMetadata>
After the steps above, update every project (Maven -> Update Project), as shown below:
Once all of the above is done, every problem should be resolved.
That was my import process. These are basically all the errors I hit; apart from the three typical ones, a few extra ones showed up as well.
A few errors also appeared when running the source; I will update this post with those later.