Setting Up a Hadoop 3.1.1 Project with IDEA

The Hadoop version used here is 3.1.1.

1. Start the Hadoop services

$ start-all.sh

2. Create a new Maven project in IDEA

2.1 Select Maven, set the Project SDK to 1.8, then click Next

Click Next.

2.2 Fill in the GroupId and ArtifactId, then click Next

2.3 Click Finish


3. Change the Target bytecode version

Open Settings, go to Build, Execution, Deployment -> Compiler -> Java, and change the Target bytecode version to 1.8 (shown as 8 in some IDEA versions).


Confirm that the JDK version in each of the following settings is also 1.8:

  • Project SDK
  • Module SDK

4. Import the required JAR packages

4.1 Select Dependencies, click the + button below, and choose "JARs or directories"


4.2 Go to the share/hadoop/ directory under your Hadoop installation and import the JAR packages there


Click OK, then click OK again to confirm.

4.3 Add the following dependencies to pom.xml:

<dependencies>
        <!-- https://mvnrepository.com/artifact/junit/junit -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>

        <!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.1.1</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.2.1</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.1</version>
        </dependency>
        
    </dependencies>

5. Write the Java code for the Hadoop project

5.1 Create a new Java class, Test.java


5.2 Write the code

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class Test {

    // Create a /test directory in HDFS
    public static void main(String[] args) {

        FileSystem fileSystem = null;
        try {
            // Connect to HDFS as user "binguner"; the URI must match fs.defaultFS in core-site.xml
            fileSystem = FileSystem.get(new URI("hdfs://localhost:9000/"), new Configuration(), "binguner");
            fileSystem.mkdirs(new Path("/test"));
            fileSystem.close();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (URISyntaxException e) {
            e.printStackTrace();
        }
    }

}

5.3 Run the Java program


6. Results

6.1 Before running, there is no test directory under the HDFS root

6.2 After running, a test directory appears under the HDFS root

7. Commonly used FileSystem methods

  • 7.1 mkdirs
public boolean mkdirs(Path f) throws IOException {
    return this.mkdirs(f, FsPermission.getDirDefault());
}

The parameter is the path of the new directory; nested directories can be created in a single call.
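For example, the following sketch (hypothetical class name; the FileSystem instance is assumed to be obtained as in section 5.2, and the path is a placeholder) creates nested directories in one call:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class MkdirsExample {
    // Create /demo/a/b in one call; missing parent directories are created automatically
    static boolean createNested(FileSystem fs) throws IOException {
        return fs.mkdirs(new Path("/demo/a/b"));
    }
}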

  • 7.2 create
public FSDataOutputStream create(Path f) throws IOException {
        return this.create(f, true);
    }

    public FSDataOutputStream create(Path f, boolean overwrite) throws IOException {
        return this.create(f, overwrite, this.getConf().getInt("io.file.buffer.size", 4096), this.getDefaultReplication(f), this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, Progressable progress) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), this.getDefaultReplication(f), this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, short replication) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), replication, this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, short replication, Progressable progress) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), replication, this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize) throws IOException {
        return this.create(f, overwrite, bufferSize, this.getDefaultReplication(f), this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, Progressable progress) throws IOException {
        return this.create(f, overwrite, bufferSize, this.getDefaultReplication(f), this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize) throws IOException {
        return this.create(f, overwrite, bufferSize, replication, blockSize, (Progressable)null);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException {
        return this.create(f, FsCreateModes.applyUMask(FsPermission.getFileDefault(), FsPermission.getUMask(this.getConf())), overwrite, bufferSize, replication, blockSize, progress);
    }

    public abstract FSDataOutputStream create(Path var1, FsPermission var2, boolean var3, int var4, short var5, long var6, Progressable var8) throws IOException;

    public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException {
        return this.create(f, permission, flags, bufferSize, replication, blockSize, progress, (ChecksumOpt)null);
    }

    public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress, ChecksumOpt checksumOpt) throws IOException {
        return this.create(f, permission, flags.contains(CreateFlag.OVERWRITE), bufferSize, replication, blockSize, progress);
    }

create has multiple overloads. Its parameters let you specify whether to overwrite an existing file, the replication factor, the write buffer size, the block size, and the file permission. It returns an FSDataOutputStream, through which you can write data into the file.
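As an illustration, here is a minimal sketch (hypothetical class name; fs is assumed to be a FileSystem obtained as in section 5.2, and the path and content are placeholders) that creates a file and writes to it through the returned FSDataOutputStream:

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class CreateExample {
    // Create (or overwrite) /demo/hello.txt and write a line of text into it
    static void writeFile(FileSystem fs) throws IOException {
        try (FSDataOutputStream out = fs.create(new Path("/demo/hello.txt"), true)) {
            out.write("hello hadoop\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}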

  • 7.3 copyFromLocalFile
public void copyFromLocalFile(Path src, Path dst) throws IOException {
        this.copyFromLocalFile(false, src, dst);
    }

    public void copyFromLocalFile(boolean delSrc, Path src, Path dst) throws IOException {
        this.copyFromLocalFile(delSrc, true, src, dst);
    }

    public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path[] srcs, Path dst) throws IOException {
        Configuration conf = this.getConf();
        FileUtil.copy(getLocal(conf), srcs, this, dst, delSrc, overwrite, conf);
    }

Copies local files to the file system. The parameters specify the local source path (or a Path array of multiple sources) and the destination path, and can also control whether the local source files are deleted and whether existing files on HDFS are overwritten.
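A minimal sketch of an upload (hypothetical class name; fs is assumed to be obtained as in section 5.2, and both paths are placeholders):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class UploadExample {
    // Copy a local file to HDFS, keeping the local source (delSrc = false)
    // and overwriting any existing target file (overwrite = true)
    static void upload(FileSystem fs) throws IOException {
        fs.copyFromLocalFile(false, true,
                new Path("/tmp/local-data.txt"),  // local source
                new Path("/demo/data.txt"));      // HDFS destination
    }
}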

  • 7.4 copyToLocalFile
public void copyToLocalFile(Path src, Path dst) throws IOException {
        this.copyToLocalFile(false, src, dst);
    }

    public void copyToLocalFile(boolean delSrc, Path src, Path dst) throws IOException {
        this.copyToLocalFile(delSrc, src, dst, false);
    }

Copies the target file from HDFS to the specified local path; the delSrc parameter controls whether the source file is deleted after the copy.
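A minimal sketch of a download (hypothetical class name; fs is assumed to be obtained as in section 5.2, and both paths are placeholders):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class DownloadExample {
    // Copy /demo/data.txt from HDFS to the local file system, keeping the HDFS copy (delSrc = false)
    static void download(FileSystem fs) throws IOException {
        fs.copyToLocalFile(false, new Path("/demo/data.txt"), new Path("/tmp/data.txt"));
    }
}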

  • 7.5 moveToLocalFile
public void moveToLocalFile(Path src, Path dst) throws IOException {
        this.copyToLocalFile(true, src, dst);
    }

Moves the target file to the specified local path; internally it simply calls copyToLocalFile with delSrc set to true.
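Usage looks the same as copyToLocalFile, except that the HDFS copy is removed afterwards (sketch with a hypothetical class name and placeholder paths; fs is assumed to be obtained as in section 5.2):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class MoveToLocalExample {
    // Move /demo/data.txt to the local file system; the file is deleted from HDFS
    static void move(FileSystem fs) throws IOException {
        fs.moveToLocalFile(new Path("/demo/data.txt"), new Path("/tmp/data.txt"));
    }
}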

  • 7.6 exists
public boolean exists(Path f) throws IOException {
        try {
            return this.getFileStatus(f) != null;
        } catch (FileNotFoundException var3) {
            return false;
        }
    }

Takes a path and checks whether it exists on HDFS, returning true if it does and false otherwise.
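A common pattern is to check for existence before creating a path (sketch with a hypothetical class name; fs is assumed to be obtained as in section 5.2):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class ExistsExample {
    // Only create /test if it is not already there
    static void ensureTestDir(FileSystem fs) throws IOException {
        Path test = new Path("/test");
        if (!fs.exists(test)) {
            fs.mkdirs(test);
        }
    }
}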

  • 7.7 delete
public abstract boolean delete(Path var1, boolean var2) throws IOException;

The first parameter is the path to delete; when the second parameter is true, a non-empty directory is deleted recursively along with its contents.
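For example (sketch with a hypothetical class name; fs is assumed to be obtained as in section 5.2), deleting the /test directory created earlier together with everything inside it:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class DeleteExample {
    // Remove /test recursively; returns true if the path was deleted
    static boolean deleteTestDir(FileSystem fs) throws IOException {
        return fs.delete(new Path("/test"), true);
    }
}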
