Exploring HDFS data blocks in Hadoop

1. Finding the nodes that store a file

Example:

./bin/hadoop fsck /data/bb/bb.txt -files -blocks -racks -locations

[Image: fsck output for /data/bb/bb.txt]

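blk_1076386829_2649976 is the block name, and the matching metadata file on disk is blk_1076386829_2649976.meta. The meta file can be located with the find command. The figure shows that the file is stored on two machines, 117 and 229, so as an example we log in to machine 117.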

First go to the path configured in dfs.datanode.data.dir (if you have forgotten it, look it up in $HADOOP_HOME/etc/hadoop/hdfs-site.xml).

My machine is configured as follows:

[Image: dfs.datanode.data.dir configuration]

Run the find command in each of the 3 directories. An example command:

find /data1/hdfs1/data/current/BP-236683338-10.207.0.217-1403487328282/current -name blk_1076386829_2649976.meta

This eventually finds the meta file. Screenshot:

[Image: output of the find command]
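The per-directory search can also be wrapped in a small helper. This is only a minimal sketch and not part of the original procedure: the function name find_block_meta is made up for illustration, and the directories passed to it must come from your own dfs.datanode.data.dir setting.

```shell
# find_block_meta BLOCK_NAME DIR...
# Hypothetical helper: looks for BLOCK_NAME.meta under each given data directory.
find_block_meta() {
    meta_name=$1.meta
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        find "$dir" -type f -name "$meta_name"
    done
}

# Example call (add the remaining data directories from hdfs-site.xml):
# find_block_meta blk_1076386829_2649976 /data1/hdfs1/data
```

Passing all data directories in one call avoids repeating the same find command by hand for each of them.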

This also locates the file's data: the block file sits next to the meta file, and you can view it with cat blk_1076386829.
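The relationship between these names can be sketched in shell. The sketch assumes that the block name printed by fsck has the form blk_<blockId>_<generationStamp>, so that the data file is named blk_<blockId> and the metadata file is the full name plus .meta. The helper name block_files is hypothetical.

```shell
# block_files BLOCK_NAME
# Hypothetical helper: prints the data file name, then the meta file name.
# Assumes BLOCK_NAME looks like blk_<blockId>_<generationStamp>.
block_files() {
    name=$1
    data_file=${name%_*}      # drop the trailing _<generationStamp>
    meta_file=$name.meta
    printf '%s\n%s\n' "$data_file" "$meta_file"
}

block_files blk_1076386829_2649976
# prints:
# blk_1076386829
# blk_1076386829_2649976.meta
```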

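This simply simulates the case where one of the data blocks is damaged. After a block is damaged, the damage goes unnoticed until the node runs its directory scan (controlled by dfs.datanode.directoryscan.interval), and the block is not recovered until the DataNode reports its block information to the NameNode (controlled by dfs.blockreport.intervalMsec). Only after the NameNode receives the block report does it take recovery action.

Real situations are certainly more complicated, but this simple process helps in understanding the two parameters mentioned above.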

Parameter configuration

The two main parameters in hdfs-site.xml are configured as follows:

<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
</property>
<property>
    <name>dfs.blockreport.intervalMsec</name>
    <value>600000</value>
    <description>Determines block reporting interval in milliseconds.</description>
</property>
<property>
    <name>dfs.datanode.directoryscan.interval</name>
    <value>600</value>
</property>

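Both intervals are 10 minutes: dfs.blockreport.intervalMsec is given in milliseconds and dfs.datanode.directoryscan.interval in seconds.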
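Because the two settings use different units, a quick conversion makes the comparison explicit. This is a small sketch using the values from the configuration above; the variable names are chosen here for illustration.

```shell
# Values copied from hdfs-site.xml above.
blockreport_ms=600000     # dfs.blockreport.intervalMsec, in milliseconds
directoryscan_s=600       # dfs.datanode.directoryscan.interval, in seconds

# Convert both to minutes.
blockreport_min=$((blockreport_ms / 1000 / 60))
directoryscan_min=$((directoryscan_s / 60))

echo "block report every $blockreport_min min"        # block report every 10 min
echo "directory scan every $directoryscan_min min"    # directory scan every 10 min
```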

Log details

2016-06-14 21:48:51,083 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-660628275-192.168.1.100-1464787466998 Total blocks: 1, missing metadata files:1, missing block files:1, missing blocks in memory:0, mismatched blocks:0
2016-06-14 21:48:51,084 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removed block 1073741825 from memory with missing block file on the disk
2016-06-14 21:49:17,168 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 1 blocks took 0 msec to generate and 1 msecs for RPC and NN processing
2016-06-14 21:49:17,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: sent block report, processed command:org.apache.hadoop.hdfs.server.protocol.FinalizeCommand@8a2db2
2016-06-14 21:49:20,977 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010

2016-06-14 21:49:20,984 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010 of size 1366
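To see how the events in this log follow one another, the gaps between them can be computed from the timestamps. This is a rough sketch: the helper to_seconds is hypothetical, and the three timestamps are copied from the log lines above with the milliseconds dropped.

```shell
# to_seconds HH:MM:SS
# Hypothetical helper: converts a time of day to seconds since midnight.
# awk is used so that fields such as "08" are read as decimal numbers.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

scan=$(to_seconds 21:48:51)        # DirectoryScanner notices the missing block file
report=$(to_seconds 21:49:17)      # block report sent to the NameNode
received=$(to_seconds 21:49:20)    # replacement replica received from 192.168.1.101

echo "scan to block report: $((report - scan)) s"               # 26 s
echo "block report to new replica: $((received - report)) s"    # 3 s
```

In this run the replica was restored a few seconds after the block report, which matches the description that recovery starts only once the NameNode has received the report.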
