Setting up a Hadoop development environment -- Windows XP + Cygwin

1. Install Cygwin

Reference post: http://hi.baidu.com/%BD%AB%D6%AE%B7%E7_%BE%B2%D6%AE%D4%A8/blog/item/8832551c7598551f314e15c2.html

Q1. My experience differed a bit from step 9 of that post ("open Cygwin to configure it; first type ssh-host-config and press Enter; when it asks yes/no, enter no; when you see Have fun! it worked"). Here is what I actually saw:

Administrator@03ad6b3ba2f34fe ~
$ ssh-host-config

*** Info: Generating /etc/ssh_host_key
*** Info: Generating /etc/ssh_host_rsa_key
*** Info: Generating /etc/ssh_host_dsa_key
*** Info: Generating /etc/ssh_host_ecdsa_key
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Info: Added ssh to C:\WINDOWS\system32\drivers\etc\services

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: []              -- just press Enter here

*** Info: The sshd service has been installed under the LocalSystem
*** Info: account (also known as SYSTEM). To start the service now, call
*** Info: `net start sshd' or `cygrunsrv -S sshd'.  Otherwise, it
*** Info: will start automatically after the next reboot.

*** Info: Host configuration finished. Have fun!
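
Not part of the referenced post, but for completeness, here is the follow-up I'd expect after the config succeeds (a minimal sketch; the DSA key type and paths are stock OpenSSH defaults, and Hadoop's start scripts rely on passwordless ssh to localhost):

$ net start sshd                                    # start the freshly installed service
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa          # generate a passphrase-less key pair
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys   # allow it to log in to localhost
$ ssh localhost                                     # should now work without a password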

Q2. During my first install the machine froze right at the "create icons" step. Cygwin could actually already run, but I still wanted to reinstall from scratch, so I went looking for a way to uninstall. Some people said to use the setup program and mark everything as Uninstall. I believed them, and it ended badly: it doesn't remove things cleanly. I then looked for a way to uninstall completely and tried "delete every Cygwin folder, then clean out every registry entry containing cygwin", and that did the trick (sketched below). Whatever you do, don't uninstall with setup!!
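
Spelled out, the manual removal amounts to roughly this (a sketch based on a default layout; the C:\cygwin path and the sshd service name are assumptions, adjust to your install):

$ cygrunsrv -E sshd        # from Cygwin, stop the sshd service before deleting anything
$ cygrunsrv -R sshd        # remove the service registration
Then from a Windows command prompt:
rd /s /q C:\cygwin         # delete the install folder
Finally open regedit and delete the leftover keys containing "cygwin" (e.g. under HKLM\SOFTWARE).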

2. Install the JDK and Eclipse. No problems here; I've been writing Java for over a year since graduating.

3. Configure Hadoop for Eclipse

Reference post: http://hi.baidu.com/%BD%AB%D6%AE%B7%E7_%BE%B2%D6%AE%D4%A8/blog/item/a0ebb1db953a772033fa1c9a.html

Q1. Following the blogger's step 4, running ./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout started throwing errors:

INFO input.FileInputFormat: Total input paths to process : 2
INFO mapred.JobClient: Running job: job_201202131412_0007
INFO mapred.JobClient:  map 0% reduce 0%
INFO mapred.JobClient: Task Id : attempt_201202131412_0007_m_000003_0, Status : FAILED
java.io.FileNotFoundException: File D:/hadoop/temp/taskTracker/jobcache/job_201202131412_0007/attempt_201202131412_0007_m_000003_0/work/tmp does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
        at

That's right, the person who left the comment under that post is me. However you read it, this error means a file can't be found. I found a fix online: edit mapred-site.xml and add

<property>
  <name>mapred.child.tmp</name>
  <value>/hadoop/tmp</value>
</property>

Everything worked fine after that.
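
One thing that fix glosses over: the directory named in mapred.child.tmp has to exist, and the daemons have to be restarted to pick up the change. A minimal sketch, assuming the same D:\ layout as in the error above:

$ mkdir -p /hadoop/tmp                  # create the temp dir under the Cygwin root
$ cd /cygdrive/d/hadoop-0.20.2/bin
$ ./stop-all.sh                         # restart so mapred-site.xml is re-read
$ ./start-all.sh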

4. Commonly used commands
ssh localhost  log in
cd /cygdrive/d/hadoop-0.20.2  change into the Hadoop directory
ls  list all files in the current directory
In the /cygdrive/d/hadoop-0.20.2/bin directory:
./hadoop namenode -format  format a new HDFS
./start-all.sh  start HDFS and Map/Reduce together
./hadoop dfs -mkdir testin  create the directory testin
./hadoop dfs -put /test/*.java testin  copy all the java files under /test into testin
./hadoop dfs -ls testin  list all files in testin
./hadoop dfs -rmr testout  delete the testout directory
./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout  run the wordcount example
./hadoop dfs -cat testout/part-r-00000  view the part-r-00000 file under testout
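
Strung together, a full session looks like this (the input path /cygdrive/d/test is a placeholder of mine, not from the posts above):

$ cd /cygdrive/d/hadoop-0.20.2/bin
$ ./hadoop namenode -format
$ ./start-all.sh
$ ./hadoop dfs -mkdir testin
$ ./hadoop dfs -put /cygdrive/d/test/*.java testin
$ ./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout
$ ./hadoop dfs -cat testout/part-r-00000

While a job runs you can also watch it in the web UIs, which in 0.20.2 default to http://localhost:50070 (NameNode) and http://localhost:50030 (JobTracker).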

================================

Remaining issues

1. Lots of blogs say the hadoop 0.20.2 release runs into many problems, to the point of "when setting up Hadoop with Cygwin on Windows, be sure to pick version 0.19.2". I haven't hit this yet. Here is a download link for 0.19.2 for anyone who needs it: http://archive.apache.org/dist/hadoop/core/hadoop-0.19.2/  I've also uploaded it to CSDN, or leave your email address and I'll send it to you.
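
If you'd rather pull it straight from the archive inside Cygwin (assuming the wget package was selected during Cygwin setup):

$ cd /cygdrive/d
$ wget http://archive.apache.org/dist/hadoop/core/hadoop-0.19.2/hadoop-0.19.2.tar.gz
$ tar -xzf hadoop-0.19.2.tar.gz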

2. The WordCount that runs fine under Cygwin keeps failing when run from Eclipse, with the same problem as at the start: it can't find a file. This still needs to be sorted out.

Note: also referenced: http://wildrain.iteye.com/blog/1164608

 

--- Keep your head down to pull the cart, but look up to watch the road.
