Hadoop Study Notes 1

Hadoop:
Broad sense: the ecosystem built around the Hadoop software
Narrow sense: the Hadoop software itself

hadoop.apache.org
hive.apache.org
spark.apache.org
flink.apache.org

Hadoop versions:
1.x
2.x  used in production (2.6)
3.x

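Hadoop 2.x components:
hdfs: storage; the distributed file system at the bottom layer; used in production
     hive/hbase store their data on it
mapreduce: distributed computation --> hard to develop for and slow (shuffle goes through disk)
     in practice replaced by hive sql/spark
yarn: resource (memory + cores) and job scheduling/management system; used in production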


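However:
Plain Apache Hadoop is usually not what gets deployed.
Companies typically deploy with CDH, Ambari, or HDP.
CDH:
Cloudera takes the Apache hadoop-2.6.0 source code,
fixes bugs, adds new features, and compiles it as its own release, cdh5.7.0.

Apache hadoop-2.6.0 --> hadoop-2.6.0-cdh5.7.0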
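The artifact name carries both versions. As a small sketch, assuming the "hadoop-<apache version>-cdh<cdh version>" pattern shown above holds (the function names are hypothetical helpers, not part of Hadoop or CDH), the two parts can be split with plain parameter expansion:

```shell
# Split a CDH artifact name into the upstream Hadoop version and the CDH release.
# Assumption: the name follows "hadoop-<apache version>-cdh<cdh version>[.tar.gz]".

# Print the Apache Hadoop version, e.g. 2.6.0
apache_version() {
    local name="${1%.tar.gz}"   # drop a trailing .tar.gz if present
    name="${name#hadoop-}"      # drop the leading "hadoop-"
    printf '%s\n' "${name%%-cdh*}"
}

# Print the CDH release, e.g. 5.7.0
cdh_version() {
    local name="${1%.tar.gz}"
    printf '%s\n' "${name##*-cdh}"
}
```

Usage: `apache_version hadoop-2.6.0-cdh5.7.0.tar.gz` and `cdh_version hadoop-2.6.0-cdh5.7.0.tar.gz`.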

Deployment:

1. Add a hadoop user with passwordless sudo access
[root@hadoop002 ~]# useradd hadoop
[root@hadoop002 ~]# cat /etc/sudoers |grep hadoop
hadoop  ALL=(ALL)       NOPASSWD: ALL
[root@hadoop002 ~]# 
[root@hadoop002 ~]# su - hadoop
[hadoop@hadoop002 ~]$ 
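The grep above only shows that some line mentions hadoop. A slightly stricter check, sketched here under the assumption that the entry has the same shape as the line shown above (has_nopasswd_all is a hypothetical helper name), matches the whole rule:

```shell
# Return 0 if FILE (sudoers format) grants USER passwordless sudo for all commands.
# Assumption: the rule is written on one line as "USER ALL=(ALL) NOPASSWD: ALL".
has_nopasswd_all() {
    local user="$1" file="$2"
    grep -Eq "^${user}[[:space:]]+ALL=\(ALL(:ALL)?\)[[:space:]]+NOPASSWD:[[:space:]]*ALL[[:space:]]*$" "$file"
}
```

Run as root it could be used as `has_nopasswd_all hadoop /etc/sudoers && echo ok`. Editing the file itself is normally done through visudo, which checks the syntax before saving.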

2. Download
[hadoop@hadoop002 ~]$ mkdir app 
[hadoop@hadoop002 ~]$ cd app
[hadoop@hadoop002 app]$ wget http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz

[hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 


Required software for Linux include:
Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.

3. Deploy Java 1.7
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ll /usr/java/
total 319160
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.7.0_80
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r-- 1 root root 153530841 Jul  8  2015 jdk-7u80-linux-x64.tar.gz
-rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ echo $JAVA_HOME
/usr/java/jdk1.7.0_80
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 


[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ which java
/usr/java/jdk1.7.0_80/bin/java
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
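These notes rely on Java 1.7. If the check is to be scripted, the major version can be pulled out of the `java -version` text. This is a minimal sketch; java_major_version is a hypothetical helper, and it assumes the output contains a `version "..."` string like the one above:

```shell
# Print the Java major version: 7 for "1.7.0_80", 11 for "11.0.2".
# $1: the text printed by `java -version` (it goes to stderr, so capture it with 2>&1).
java_major_version() {
    local ver
    # Take the quoted string after the word "version" on the first matching line.
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    case "$ver" in
        1.*) ver="${ver#1.}" ;;   # legacy scheme: 1.7.0_80 -> 7.0_80
    esac
    # Keep everything before the first "." or "_".
    printf '%s\n' "${ver%%[._]*}"
}
```

Usage: `java_major_version "$(java -version 2>&1)"`.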


4. Preparation
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[hadoop@hadoop002 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hadoop
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
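The two exports in hadoop-env.sh were added by hand in vi. A repeatable alternative could look like the sketch below; set_export is a hypothetical helper, and it assumes the values contain no `|` or `&` characters, since they pass through sed:

```shell
# Set "export NAME=VALUE" in FILE: rewrite an existing export line, or append one.
set_export() {
    local file="$1" name="$2" value="$3"
    if grep -q "^export ${name}=" "$file" 2>/dev/null; then
        # Rewrite through a temp file; "|" is the sed delimiter because values are paths.
        sed "s|^export ${name}=.*|export ${name}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf 'export %s=%s\n' "$name" "$value" >> "$file"
    fi
}
```

Usage, from etc/hadoop: `set_export hadoop-env.sh JAVA_HOME /usr/java/jdk1.7.0_80`. Running it twice leaves a single export line.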

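Three startup modes:
Local (Standalone) Mode: single machine, no daemon processes; not used
Pseudo-Distributed Mode: pseudo-distributed, daemons on one machine; for learning
Fully-Distributed Mode: distributed, daemons across multiple machines; for production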


5. Configuration files
[hadoop@hadoop002 hadoop]$ vi core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop002:9000</value>
    </property>
</configuration>
"core-site.xml" 24L, 884C written                                  
[hadoop@hadoop002 hadoop]$ vi hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
"hdfs-site.xml" 23L, 866C written                                  
[hadoop@hadoop002 hadoop]$ cd
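Both site files have the same shape: one property with a name and a value. The sketch below writes such a file and reads a value back, which can help confirm an edit took effect. write_site_xml and get_property are hypothetical helpers; the writer overwrites the file, handles a single property, and does not escape XML special characters. The reader assumes <value> appears on a line after its <name>.

```shell
# Write a minimal Hadoop *-site.xml containing one property (overwrites OUT).
write_site_xml() {
    local out="$1" name="$2" value="$3"
    cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>${name}</name>
        <value>${value}</value>
    </property>
</configuration>
EOF
}

# Print the first <value> found after <name>NAME</name> in FILE.
get_property() {
    local file="$1" name="$2"
    awk -v n="$name" '
        found && match($0, /<value>[^<]*<\/value>/) {
            # Skip the 7 chars of "<value>" and drop the 8 chars of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
        index($0, "<name>" n "</name>") { found = 1 }
    ' "$file"
}
```

Usage: `get_property hdfs-site.xml dfs.replication`. On a working installation, `hdfs getconf -confKey dfs.replication` reports the value Hadoop actually resolved.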

6. Passwordless SSH
[hadoop@hadoop002 hadoop]$ cd
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ rm -rf .ssh
[hadoop@hadoop002 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
a3:c7:ba:e9:2e:77:ff:6f:50:bd:bc:f7:1b:1d:a6:e1 hadoop@hadoop002
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
|                 |
|               . |
|              . .|
|        S    o.o.|
|       o .  o +oo|
|      . o    E .o|
|    . .+.     ..o|
|     =*o ....o..=|
+-----------------+
[hadoop@hadoop002 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop002 ~]$ cd .ssh
[hadoop@hadoop002 .ssh]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop 606 Sep 19 23:16 authorized_keys
-rw------- 1 hadoop hadoop 668 Sep 19 23:16 id_dsa
-rw-r--r-- 1 hadoop hadoop 606 Sep 19 23:16 id_dsa.pub

[hadoop@hadoop002 .ssh]$ chmod 600 authorized_keys
[hadoop@hadoop002 .ssh]$ 

[hadoop@hadoop002 .ssh]$ ssh hadoop002
The authenticity of host 'hadoop002 (172.31.236.240)' can't be established.
RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop002,172.31.236.240' (RSA) to the list of known hosts.
Last login: Wed Sep 19 18:21:09 2018 from 172.31.236.240

Welcome to Alibaba Cloud Elastic Compute Service !

[hadoop@hadoop002 ~]$ 
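The append-and-chmod steps above can be made safe to repeat. This is a sketch only: authorize_key is a hypothetical helper that adds the public key if it is not already present and then tightens the permissions, since sshd commonly rejects an authorized_keys file that is group- or world-writable. Note also that newer OpenSSH releases disable DSA keys by default, so on a current system `-t rsa` or `-t ed25519` would be the likelier choice.

```shell
# Append PUBKEY_FILE to AUTH_FILE unless that exact key line is already there,
# then restrict permissions on the directory and the file.
authorize_key() {
    local pubkey_file="$1" auth_file="$2"
    local dir
    dir=$(dirname "$auth_file")
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$auth_file"
    # -x whole line, -F fixed string, -f read the pattern from the key file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

Usage: `authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys`.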


7. Environment variables
[hadoop@hadoop002 ~]$ vi .bash_profile 
export MVN_HOME=/home/hadoop/app/apache-maven-3.3.9
export PROTOC_HOME=/home/hadoop/app/protobuf
export FINDBUGS_HOME=/home/hadoop/app/findbugs-1.3.9
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

export PATH=$HADOOP_PREFIX/bin:$JAVA_HOME/bin:$PATH
~
~
".bash_profile" 12L, 293C written                                  
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ ssh hadoop002
Last login: Wed Sep 19 23:18:35 2018 from 172.31.236.240

Welcome to Alibaba Cloud Elastic Compute Service !

[hadoop@hadoop002 ~]$ which hdfs
~/app/hadoop-2.6.0-cdh5.7.0/bin/hdfs
[hadoop@hadoop002 ~]$ 
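`which hdfs` confirms the result interactively. For a script, the same condition can be tested directly, as in this sketch (hadoop_on_path is a hypothetical helper; the second argument defaults to the current PATH):

```shell
# Return 0 if PREFIX/bin/hdfs is executable and PREFIX/bin is listed in PATH_VALUE.
hadoop_on_path() {
    local prefix="$1" path_value="${2:-$PATH}"
    [ -x "$prefix/bin/hdfs" ] || return 1
    # Wrap in colons so the first and last PATH entries match the same way.
    case ":$path_value:" in
        *":$prefix/bin:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: `hadoop_on_path "$HADOOP_PREFIX" || echo "re-login or source ~/.bash_profile"`.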
[hadoop@hadoop002 ~]$ cd ~/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
27707 SecondaryNameNode
27820 Jps
27432 NameNode


The DataNode (DN) process is missing from the jps output, so redeploy:
[root@hadoop002 tmp]# rm -rf /tmp/hadoop-hadoop
[root@hadoop002 tmp]# 
[hadoop@hadoop002 hadoop]$ vi slaves 
hadoop002


[hadoop@hadoop002 hadoop]$ cd ../../
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
18/09/19 23:29:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop002]
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
18/09/19 23:29:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
28288 NameNode
28686 Jps
28410 DataNode
28575 SecondaryNameNode
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
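The missing DataNode was spotted by reading the jps output by eye. A small check can do the same comparison; this is a sketch, with missing_daemons as a hypothetical helper and the three daemon names taken from the pseudo-distributed output above:

```shell
# Print each expected HDFS daemon that does not appear in the given jps output.
# $1: the text printed by jps
missing_daemons() {
    local jps_output="$1" d
    for d in NameNode DataNode SecondaryNameNode; do
        # -w matches whole words, so "NameNode" is not found inside "SecondaryNameNode"
        printf '%s\n' "$jps_output" | grep -qw "$d" || printf '%s\n' "$d"
    done
}
```

Usage: `missing_daemons "$(jps)"`. Empty output means all three daemons are running; for the first jps listing in these notes it would print DataNode.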

On a cloud host, open the port in the firewall to reach the NameNode web UI:
http://47.75.249.8:50070

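Mid-Autumn Festival homework: 1. practice join syntax 2. deploy hdfs 3. write an original blog post covering everything up to the hdfs deployment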
