Hadoop is an open-source distributed computing platform that stores big data and processes it with MapReduce. It is good at storing huge volumes of data in any format, even unstructured data. Its two core components are HDFS (storage) and MapReduce (computation).

With Hadoop you can easily organize machine resources into your own distributed computing platform and take full advantage of a cluster's compute and storage capacity to process massive datasets.
Since my system is 32-bit and the official prebuilt releases are 64-bit only, I couldn't use them and had to compile from source.
According to the build instructions in BUILDING.txt, the following tools are needed before building Hadoop:
Hadoop build requirements:

* Unix system
* JDK 1.8
* Maven 3.3 or later
* ProtocolBuffers 2.5.0
* CMake 3.1 or newer (if compiling native code)
* Zlib devel (if compiling native code)
* openssl devel (if compiling native hadoop-pipes and to get the best HDFS encryption performance)
* Linux FUSE (Filesystem in Userspace) 2.6 or above (if compiling fuse_dfs)
* Internet connection for the first build (to fetch all Maven and Hadoop dependencies)
* Python (for release docs)
* bats (for shell code testing)
* Node.js / bower / Ember-cli (for building the frontend UI)

The easiest way to get an environment with all of these tools is via the Docker config provided with the source. This requires a recent, working Docker (version 1.4.1 or higher). On Linux, install Docker and run:

```shell
$ ./start-build-env.sh
```

The prompt that appears next is located inside the source tree, with all required test and build tools installed and configured. Note that in this Docker environment you can only access the Hadoop source tree from where you started. So if you want to run

```shell
dev-support/bin/test-patch /path/to/my.patch
```

the patch file must be inside the Hadoop source tree.

Removing old packages and installing the required software on Ubuntu:

```shell
# Oracle JDK 1.8 (preferred)
$ sudo apt-get purge openjdk*
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer

# Maven
$ sudo apt-get -y install maven

# Native library dependencies
$ sudo apt-get -y install build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev

# ProtocolBuffer 2.5.0 (required)
$ sudo apt-get -y install protobuf-compiler
```
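Before kicking off the hour-plus Maven build, it's worth sanity-checking that the tools above are actually on the PATH. A minimal sketch (tool names are the standard binaries; adjust the list to the profiles you build, and note that `protoc --version` must additionally report libprotoc 2.5.0):

```shell
#!/bin/sh
# Quick sanity check of the BUILDING.txt prerequisites before starting
# the long Maven build.
for tool in java mvn cmake protoc gcc make; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```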
```shell
# 1. Download the source
wget https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.1.2/hadoop-3.1.2-src.tar.gz

# 2. Unpack it
tar -zxvf hadoop-3.1.2-src.tar.gz
cd hadoop-3.1.2-src

# 3. Build with Maven
mvn package -Pdist,native -DskipTests -Dtar
```
Building this thing took me three days on and off. Here is a record of the problems I ran into.
Problem 1:

`mvn package -Pdist,native -DskipTests -Dtar` fails with:

```
[ERROR] Failed to execute goal org.codehaus.mojo:native-maven-plugin:1.0-alpha-8:javah (default) on project hadoop-common: Error running javah command: Error executing command line. Exit code:2 -> [Help 1]
```
Fix:

```shell
vim hadoop-common-project/hadoop-common/pom.xml
```

Change the javah execution path to an absolute path:

```xml
<javahPath>${env.JAVA_HOME}/bin/javah</javahPath>
<!-- change to (the exact path must match your machine): -->
<javahPath>/usr/bin/javah</javahPath>
```
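To find the real path to paste in, you can resolve javah through Debian/Ubuntu's update-alternatives symlinks. A small sketch (javah ships with JDK 8 and was removed in JDK 10+, so this assumes a JDK 8 build environment):

```shell
# Resolve the real javah path behind Debian/Ubuntu's update-alternatives
# symlinks, then paste the result into <javahPath>.
JAVAH="$(command -v javah || true)"
if [ -n "$JAVAH" ]; then
    readlink -f "$JAVAH"
else
    echo "javah not on PATH - check \$JAVA_HOME/bin" >&2
fi
```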
Problem 2:

`mvn package -Pdist,native -DskipTests -Dtar` fails again:

```
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.1.2:cmake-compile (cmake-compile) on project hadoop-common: CMake failed with error code 1 -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.1.2:cmake-compile (cmake-compile) on project hadoop-common: CMake failed with error code 1
```
Fix:

The CMake version was wrong, so I installed CMake 3.0:

```shell
# download
wget https://cmake.org/files/v3.0/cmake-3.0.0.tar.gz
tar -zxvf cmake-3.0.0.tar.gz
cd cmake-3.0.0
./configure
make
sudo apt-get install checkinstall
sudo checkinstall
sudo make install
# create symlinks
sudo ln -s bin/* /usr/bin/
```
That still didn't fix it. Running `mvn package -Pdist,native -DskipTests -Dtar -e -X` to print the full debug log, I found:

```
[INFO] Running cmake /home/wangjun/software/hadoop-3.1.2-src/hadoop-common-project/hadoop-common/src -DGENERATED_JAVAH=/home/wangjun/software/hadoop-3.1.2-src/hadoop-common-project/hadoop-common/target/native/javah -DJVM_ARCH_DATA_MODEL=32 -DREQUIRE_BZIP2=false -DREQUIRE_ISAL=false -DREQUIRE_OPENSSL=false -DREQUIRE_SNAPPY=false -DREQUIRE_ZSTD=false -G Unix Makefiles
[INFO] with extra environment variables {}
[WARNING] Soft-float JVM detected
[WARNING] CMake Error at /home/wangjun/software/hadoop-3.1.2-src/hadoop-common-project/hadoop-common/HadoopCommon.cmake:182 (message):
[WARNING]   Soft-float dev libraries required (e.g. 'apt-get install libc6-dev-armel'
[WARNING]   on Debian/Ubuntu)
[WARNING] Call Stack (most recent call first):
[WARNING]   CMakeLists.txt:26 (include)
[WARNING]
[WARNING]
[WARNING] -- Configuring incomplete, errors occurred!
[WARNING] See also "/home/wangjun/software/hadoop-3.1.2-src/hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeOutput.log".
[WARNING] See also "/home/wangjun/software/hadoop-3.1.2-src/hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeError.log".
```
Checking the hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeError.log log, I saw the error:

```
gnu/stubs-soft.h: No such file or directory
```
Fix: edit hadoop-common-project/hadoop-common/HadoopCommon.cmake and change both occurrences of `-mfloat-abi=softfp` to `-mfloat-abi=hard` (reference: https://blog.csdn.net/wuyushe...). It's best to re-extract the original tarball, make the change, and rebuild from scratch, otherwise the build may still fail.
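Both edits can be applied in one pass with sed; a sketch, assuming you run it from the root of a freshly unpacked hadoop-3.1.2-src tree:

```shell
# Replace every softfp occurrence with the hard-float ABI in one pass,
# then list the changed lines to verify. Run from the hadoop-3.1.2-src root.
CMAKE_FILE=hadoop-common-project/hadoop-common/HadoopCommon.cmake
if [ -f "$CMAKE_FILE" ]; then
    sed -i 's/-mfloat-abi=softfp/-mfloat-abi=hard/g' "$CMAKE_FILE"
    grep -n 'mfloat-abi' "$CMAKE_FILE"
else
    echo "run this from the hadoop-3.1.2-src root" >&2
fi
```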
With that fixed, a new problem appeared: building Apache Hadoop MapReduce NativeTask failed with:
```
[WARNING] /usr/bin/ranlib libgtest.a
[WARNING] make[2]: Leaving directory '/home/wangjun/software/hadoop-3.1.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
[WARNING] /usr/local/bin/cmake -E cmake_progress_report /home/wangjun/software/hadoop-3.1.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native/CMakeFiles 1
[WARNING] [  7%] Built target gtest
[WARNING] make[1]: Leaving directory '/home/wangjun/software/hadoop-3.1.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
[WARNING] /tmp/ccpXG9td.s: Assembler messages:
[WARNING] /tmp/ccpXG9td.s:2040: Error: bad instruction `bswap r5'
[WARNING] /tmp/ccpXG9td.s:2063: Error: bad instruction `bswap r1'
[WARNING] make[2]: *** [CMakeFiles/nativetask.dir/build.make:79: CMakeFiles/nativetask.dir/main/native/src/codec/BlockCodec.cc.o] Error 1
[WARNING] make[2]: *** Waiting for unfinished jobs....
[WARNING] make[1]: *** [CMakeFiles/Makefile2:96: CMakeFiles/nativetask.dir/all] Error 2
[WARNING] make[1]: *** Waiting for unfinished jobs....
[WARNING] /tmp/ccBbS5rL.s: Assembler messages:
[WARNING] /tmp/ccBbS5rL.s:1959: Error: bad instruction `bswap r5'
[WARNING] /tmp/ccBbS5rL.s:1982: Error: bad instruction `bswap r1'
[WARNING] make[2]: *** [CMakeFiles/nativetask_static.dir/build.make:79: CMakeFiles/nativetask_static.dir/main/native/src/codec/BlockCodec.cc.o] Error 1
[WARNING] make[2]: *** Waiting for unfinished jobs....
[WARNING] /tmp/cc6DHbGO.s: Assembler messages:
[WARNING] /tmp/cc6DHbGO.s:979: Error: bad instruction `bswap r2'
[WARNING] /tmp/cc6DHbGO.s:1003: Error: bad instruction `bswap r3'
[WARNING] make[2]: *** [CMakeFiles/nativetask_static.dir/build.make:125: CMakeFiles/nativetask_static.dir/main/native/src/codec/Lz4Codec.cc.o] Error 1
[WARNING] make[1]: *** [CMakeFiles/Makefile2:131: CMakeFiles/nativetask_static.dir/all] Error 2
[WARNING] make: *** [Makefile:77: all] Error 2
```
The errors point to an instruction problem. After some googling I found the fix: https://issues.apache.org/jir.... Edit the primitives.h file following the git log in that issue, then rebuild.
After three days of torment, it finally worked! Here's what success looks like:
```
[INFO] No site descriptor found: nothing to attach.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Hadoop Main 3.1.2:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [  3.532 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  6.274 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  3.668 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  5.743 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  1.739 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  4.782 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 10.777 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  5.156 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 18.468 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  8.293 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [03:15 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 14.700 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 15.340 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.876 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 46.540 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [02:34 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 12.125 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 20.005 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  8.934 s]
[INFO] Apache Hadoop HDFS-RBF ............................. SUCCESS [01:08 min]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.892 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.879 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 25.531 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [01:57 min]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 14.521 s]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.920 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 23.432 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 28.782 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  9.515 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 14.077 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [ 12.728 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 51.338 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  8.675 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 13.937 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 10.853 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 12.546 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [  1.069 s]
[INFO] Apache Hadoop YARN TimelineService HBase Common .... SUCCESS [ 17.176 s]
[INFO] Apache Hadoop YARN TimelineService HBase Client .... SUCCESS [ 15.662 s]
[INFO] Apache Hadoop YARN TimelineService HBase Servers ... SUCCESS [  0.901 s]
[INFO] Apache Hadoop YARN TimelineService HBase Server 1.2  SUCCESS [ 17.512 s]
[INFO] Apache Hadoop YARN TimelineService HBase tests ..... SUCCESS [ 17.327 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [ 14.430 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  1.990 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 10.400 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  7.210 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  2.549 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 38.022 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 35.908 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 15.180 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 18.915 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 15.852 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 12.987 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 12.106 s]
[INFO] Apache Hadoop YARN Services ........................ SUCCESS [  1.812 s]
[INFO] Apache Hadoop YARN Services Core ................... SUCCESS [  8.685 s]
[INFO] Apache Hadoop YARN Services API .................... SUCCESS [  9.236 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.859 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [  0.840 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 34.971 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  7.376 s]
[INFO] Apache Hadoop MapReduce NativeTask ................. SUCCESS [02:07 min]
[INFO] Apache Hadoop MapReduce Uploader ................... SUCCESS [  9.915 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 14.651 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 15.959 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 11.747 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 16.314 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  7.115 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  8.686 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 12.413 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 10.490 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  7.894 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  7.098 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 19.457 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 12.452 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [04:55 min]
[INFO] Apache Hadoop Kafka Library support ................ SUCCESS [ 36.248 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 43.752 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [ 34.905 s]
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [ 17.099 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 18.819 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [ 29.363 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 30.145 s]
[INFO] Apache Hadoop Image Generation Tool ................ SUCCESS [  8.970 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 46.265 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.883 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [08:41 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [06:39 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... SUCCESS [  4.040 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [13:29 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . SUCCESS [  1.937 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [  1.865 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:56 min]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [  5.050 s]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [  6.457 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [  0.829 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:06 h
[INFO] Finished at: 2019-09-03T14:14:45+08:00
[INFO] ------------------------------------------------------------------------
```
The build output lands in hadoop-dist. To give a sense of how many versions I went through for this build:

```
cmake-3.0.0          cmake-3.3.0             hadoop-2.7.7-src.tar.gz  hadoop-2.9.2-src         hadoop-3.1.2-src
protobuf-2.5.0       cmake-3.1.0             hadoop-2.8.5-src         hadoop-2.9.2-src.tar.gz  hadoop-3.1.2-src.tar.gz
cmake-3.1.0.tar.gz   hadoop-2.7.7-src        hadoop-2.8.5-src.tar.gz  hadoop-3.1.2             hadoop-3.1.2.tar.gz
```
Copy hadoop-dist/target/hadoop-3.1.2.tar.gz to wherever you want to install it and unpack it.

```shell
# Enter the bin directory; format HDFS before the first start
cd hadoop-3.1.2/bin
./hdfs namenode -format
......
......
2019-09-03 14:35:53,356 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at raspberrypi/127.0.1.1
************************************************************/

# Start all services
cd ../sbin/
./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as wangjun in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [raspberrypi]
Starting datanodes
Starting secondary namenodes [raspberrypi]
Starting resourcemanager
Starting nodemanagers
```
Visit port 8088 (http://localhost:8088) to see the Hadoop management interface!

Hadoop's web interfaces:

```
# All Applications
http://localhost:8088
# DataNode Information
http://localhost:9864
# NameNode Information
http://localhost:9870
# NodeManager
http://localhost:8042
# SecondaryNameNode Information
http://localhost:9868
```
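To confirm the daemons actually came up, you can probe each UI port from the shell; a quick sketch (ports are the Hadoop 3.x defaults listed above; assumes curl is installed, and adjust the host if not running locally):

```shell
# Probe each Hadoop web UI port and report whether it answers.
for port in 8088 9864 9870 8042 9868; do
    if curl -s -o /dev/null "http://localhost:$port"; then
        echo "up:   $port"
    else
        echo "down: $port"
    fi
done
```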
Problem 1: errors on startup:

```
$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as wangjun in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [raspberrypi]
raspberrypi: ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
localhost: ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [raspberrypi]
raspberrypi: ERROR: JAVA_HOME is not set and could not be found.
Starting resourcemanager
Starting nodemanagers
localhost: ERROR: JAVA_HOME is not set and could not be found.
```
Fix:

```shell
vim ./etc/hadoop/hadoop-env.sh
# change
# export JAVA_HOME=
# to your actual Java installation path, for example
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-armhf
```
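Rather than guessing, you can derive the JAVA_HOME value from the java binary itself; a sketch (the armhf path in the comment is from my Raspberry Pi and will differ on your machine):

```shell
# Derive the JAVA_HOME line to paste into hadoop-env.sh.
# readlink -f resolves Debian's update-alternatives symlinks to the real JDK.
JAVA_BIN="$(command -v java || true)"
if [ -n "$JAVA_BIN" ]; then
    JAVA_BIN="$(readlink -f "$JAVA_BIN")"
    echo "export JAVA_HOME=${JAVA_BIN%/bin/java}"
    # on my Raspberry Pi: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-armhf
else
    echo "java not on PATH" >&2
fi
```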