HFTP Guide

Introduction

HFTP is a Hadoop filesystem implementation that lets you read data from a remote Hadoop HDFS cluster. Reads are performed over HTTP, and the data is served by the DataNodes. HFTP is a read-only filesystem and throws an exception if you try to use it to write data or modify the filesystem state.

HFTP is primarily useful if you have multiple HDFS clusters running different versions and you need to move data from one to another. HFTP is wire-compatible even between different versions of HDFS. For example, you can run:

    hadoop distcp -i hftp://sourceFS:50070/src hdfs://destFS:8020/dest

Note that because HFTP is read-only, the destination must be an HDFS filesystem. (Also, in this example, distcp should be run using the configuration of the new filesystem.)
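As a minimal sketch of the rule that the destination cannot be read-only, the distcp command line can be assembled and checked in plain Java. The class and method names here (HftpDistcpSketch, buildDistcpCommand, isReadOnlyScheme) are hypothetical helpers for illustration and are not part of Hadoop. The sketch only builds the argument list and does not launch anything.

```java
import java.net.URI;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the distcp argument list
// and rejects destinations that use a read-only scheme.
final class HftpDistcpSketch {

    // HFTP is read-only; HSFTP is an extension of HFTP, so it is treated the same here.
    static boolean isReadOnlyScheme(String scheme) {
        return "hftp".equalsIgnoreCase(scheme) || "hsftp".equalsIgnoreCase(scheme);
    }

    static List<String> buildDistcpCommand(String source, String destination) {
        URI src = URI.create(source);
        URI dst = URI.create(destination);
        if (isReadOnlyScheme(dst.getScheme())) {
            throw new IllegalArgumentException(
                    "destination must be writable (for example hdfs://): " + destination);
        }
        // -i is the flag used in the example command in this guide.
        return List.of("hadoop", "distcp", "-i", src.toString(), dst.toString());
    }

    public static void main(String[] args) {
        List<String> cmd = buildDistcpCommand(
                "hftp://sourceFS:50070/src", "hdfs://destFS:8020/dest");
        // prints: hadoop distcp -i hftp://sourceFS:50070/src hdfs://destFS:8020/dest
        System.out.println(String.join(" ", cmd));
    }
}
```

Passing an hftp:// or hsftp:// URI as the destination makes buildDistcpCommand throw an IllegalArgumentException, mirroring the failure you would otherwise hit only when the copy starts writing.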

An extension, HSFTP, uses HTTPS by default. This means that data is encrypted in transit.

Implementation

The code for HFTP lives in the Java class org.apache.hadoop.hdfs.HftpFileSystem. Likewise, HSFTP is implemented in org.apache.hadoop.hdfs.HsftpFileSystem.

Configuration Options

Name                         Description
dfs.hftp.https.port          The HTTPS port on the remote cluster. If not set, HFTP will fall back on dfs.https.port.
hdfs.service.host_ip:port    Specifies the service name (for the security subsystem) associated with the HFTP filesystem running at ip:port.
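The fallback order for the HTTPS port and the shape of the per-host service-name key can be sketched as follows. This is an illustration over a plain Map, not Hadoop's Configuration class. The names HftpConfigSketch, resolveHttpsPort, and serviceNameKey are hypothetical. The lastResortPort parameter is an assumption added for the case where neither key is set, and the port numbers and IP address in main are arbitrary example values.

```java
import java.util.Map;

// Hypothetical sketch, not Hadoop code: models the option lookup described above.
final class HftpConfigSketch {
    static final String HFTP_HTTPS_PORT_KEY = "dfs.hftp.https.port";
    static final String HTTPS_PORT_KEY = "dfs.https.port";
    static final String SERVICE_NAME_KEY_PREFIX = "hdfs.service.host_";

    // Prefer dfs.hftp.https.port, then fall back on dfs.https.port.
    // lastResortPort is supplied by the caller (an assumption of this sketch).
    static int resolveHttpsPort(Map<String, String> conf, int lastResortPort) {
        String value = conf.get(HFTP_HTTPS_PORT_KEY);
        if (value == null) {
            value = conf.get(HTTPS_PORT_KEY);
        }
        return value == null ? lastResortPort : Integer.parseInt(value.trim());
    }

    // Builds the key hdfs.service.host_ip:port for a given address.
    static String serviceNameKey(String ip, int port) {
        return SERVICE_NAME_KEY_PREFIX + ip + ":" + port;
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(HTTPS_PORT_KEY, "8443");
        int port = resolveHttpsPort(conf, 9999);
        System.out.println(port);                             // prints: 8443
        System.out.println(serviceNameKey("10.0.0.1", port)); // prints: hdfs.service.host_10.0.0.1:8443
    }
}
```

Taking the last-resort port as a parameter keeps the sketch from asserting any particular built-in default, which the table above does not state.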
