hadoop-2.6.0-src.tar.gz is the source archive.
You can import it into Eclipse to study the source, or build, compile, and package it with Maven. hadoop-2.6.0.tar.gz is the official binary release and can be used directly.
Note, however, that the release downloaded from the official site only suits x86 environments; for x64 you need to rebuild it with Maven.
*.mds files are descriptors that record the MD5, SHA1, and other checksums of the archives.
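For example, you can check a download against the recorded digests (a minimal sketch using the standard coreutils tool):
cat hadoop-2.6.0.tar.gz.mds # show the recorded MD5/SHA1 digests
md5sum hadoop-2.6.0.tar.gz # compute the MD5 of the download and compare by eye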
[jiangzl@master hadoop]$ cat /etc/hostname
master
[jiangzl@master hadoop]$ cat /etc/hosts
192.168.1.114 master
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
[jiangzl@master hadoop]$
(Mac)
bogon:~ jiangzl$ cat /etc/hosts
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost bogon
255.255.255.255 broadcasthost
::1 localhost
192.168.1.114 master
(Windows 7)
On Windows, map the hostname to its IP in:
C:\Windows\System32\drivers\etc\hosts
192.168.1.114 master
CentOS 7.0 uses firewalld as its default firewall; here we disable it and switch to iptables.
firewall:
systemctl status firewalld.service # check firewalld status
[jiangzl@localhost ~]$ systemctl status firewalld.service
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled)
Active: active (running) since 四 2015-09-24 13:30:25 CST; 2 weeks 6 days ago
Main PID: 879 (firewalld)
CGroup: /system.slice/firewalld.service
└─879 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
systemctl start firewalld.service # start firewalld
systemctl stop firewalld.service # stop firewalld
[jiangzl@localhost ~]$ systemctl status firewalld.service
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled)
Active: inactive (dead) since 三 2015-10-14 22:05:20 CST; 4min 12s ago
Process: 879 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 879 (code=exited, status=0/SUCCESS)
systemctl disable firewalld.service # prevent firewalld from starting at boot
[jiangzl@localhost ~]$ systemctl status firewalld.service
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled)
Active: inactive (dead)
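The listing above only stops and disables firewalld; to actually switch to iptables as stated at the top, the usual CentOS 7 steps are roughly these (a sketch, assuming the iptables-services package from the base repository):
sudo yum install -y iptables-services # provides iptables.service and /etc/sysconfig/iptables
sudo systemctl enable iptables.service # start the classic iptables firewall at boot
sudo systemctl start iptables.service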
Passwordless SSH login
(1) Run ssh-keygen -t rsa (then press Enter through every prompt); the key pair is written to ~/.ssh/
(2) Run cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys to create the authorization file
(3) Verify: ssh master (ssh <hostname>); see the sketch below
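Put together, the whole sequence looks like this (the chmod line is my addition, not in the original; sshd's StrictModes typically rejects key files with loose permissions):
ssh-keygen -t rsa # generate an RSA key pair under ~/.ssh/ (press Enter through all prompts)
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys # authorize this machine's own public key
chmod 600 ~/.ssh/authorized_keys # assumption: tighten permissions so sshd accepts the file
ssh master # should now log in without asking for a password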
[jiangzl@localhost ~]$ vim .bash_profile
#hadoop
export HADOOP_PREFIX=/home/jiangzl/work/hadoop
export HADOOP_HOME=/home/jiangzl/work/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
# others
export JAVA_HOME=/home/jiangzl/work/jdk
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
[jiangzl@localhost ~]$ source .bash_profile
[jiangzl@master work]$ java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
[jiangzl@master work]$
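A matching sanity check for the Hadoop side of .bash_profile (my addition; hadoop version is a standard subcommand):
hadoop version # prints Hadoop 2.6.0 when PATH and HADOOP_HOME point at the install above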
If, after installing the JDK on Linux, you find the system already ships with OpenJDK, just uninstall OpenJDK.
Steps (see the sketch below):
http://jingyan.baidu.com/article/73c3ce28f0f68fe50343d9e1.html
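The linked walkthrough is not reproduced here; on a yum-based CentOS system the removal usually looks roughly like this (package names are illustrative, so match them against the rpm -qa output):
rpm -qa | grep openjdk # list the OpenJDK packages actually installed
sudo yum remove java-1.7.0-openjdk java-1.7.0-openjdk-headless # illustrative names; remove what the query listed
java -version # should now report the JDK configured in .bash_profile above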
Official documentation:
http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html
Example test:
bin/hdfs dfs -mkdir -p input
bin/hdfs dfs -put LICENSE.txt input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount input output
hadoop fs -cat output/p*
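To copy the results out of HDFS instead of cat-ing them in place (the same steps the official single-cluster guide uses):
bin/hdfs dfs -get output output # fetch the output directory from HDFS to the local filesystem
cat output/* # the part-r-* files hold the word counts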
6) JAVA_HOME is not set
[jiangzl@master hadoop]$ sbin/start-dfs.sh
Starting namenodes on [master]
master: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Error: JAVA_HOME is not set and could not be found.
[jiangzl@master hadoop]$
After Hadoop is installed, startup fails with Error: JAVA_HOME is not set and could not be found.
Fix:
Set JAVA_HOME in etc/hadoop/hadoop-env.sh.
Use an absolute path: the start scripts ssh into each node without sourcing your login environment, so $JAVA_HOME expands to nothing there.
export JAVA_HOME=$JAVA_HOME # wrong, do not set it this way
export JAVA_HOME=/usr/java/jdk # correct, use the absolute path
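If you prefer to script the edit, a one-liner sketch (the JDK path is an example and should match your actual install):
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk|' etc/hadoop/hadoop-env.sh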
Fixing the fs port connection failure after a reboot in Hadoop pseudo-distributed mode
It keeps reporting exceptions like the following:
13/07/24 09:14:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/07/24 09:14:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Fix:
1. First delete everything under /tmp/hadoop-username/dfs/ and reformat the NameNode (see the sketch after this list).
2. The hadoop.tmp.dir property is the base path the Hadoop filesystem depends on; many other paths derive from it. It defaults to /tmp/hadoop-${user.name}, and matching directories are created both locally and in HDFS. Keeping it under /tmp is unsafe, because Linux may wipe /tmp on reboot, after which the NameNode cannot start. So add the following to core-site.xml:
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/jiangzl/work/hadoop/tmp</value>
</property>
3. Remember to run stop-all.sh before shutting down.
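A minimal sketch of the recovery in step 1, assuming the jiangzl user and install layout shown in the prompts above:
rm -rf /tmp/hadoop-jiangzl/dfs/* # clear the stale HDFS data left under /tmp
bin/hdfs namenode -format # reformat the NameNode (destroys any existing HDFS data)
sbin/start-dfs.sh # restart HDFS; port 9000 should accept connections again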
Free Hive video tutorials:
Link: http://pan.baidu.com/s/1jGKZKSe  Password: 1zty
From the Hive course: hive视频-HIVE(1)安装mysql部分.avi (it also covers installing Hadoop following the official site)