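Protobuf and Avro

Use cases

  • Serialized transfer of large file blocks in Hadoop
  • Serialized transfer in Spark's Netty-based shuffle mechanism
  • Serialized transfer of large file blocks in general, and RPC serialization with large payloads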

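What is serialization: converting an object or a file into a byte array, that is, a sequence of 0s and 1s such as [1,0,1,1,1,0], which can then be sent over the network. Different serialization tools use different algorithms and data structures, so their performance and output size differ. Which one to use depends on the scenario.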
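
To make the definition concrete, here is a minimal sketch of a round trip using Java's built-in serialization (`ObjectOutputStream` and `ObjectInputStream`). The `User` class and the `SerializationDemo` name are made up for this illustration and are not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    // Hypothetical value class used only for this illustration.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int id;

        User(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Object -> byte array (this is what would be sent over the network).
    static byte[] serialize(User user) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        return bytes.toByteArray();
    }

    // Byte array -> object (what the receiving side does).
    static User deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (User) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new User("Alice", 1));
        User copy = deserialize(data);
        System.out.println(data.length + " bytes -> " + copy.name + ", id=" + copy.id);
    }
}
```

The receiving side here must also be a JVM that has the same `User` class on its classpath, which is the limitation that cross-language formats such as Protobuf and Avro are meant to remove.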

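Why use Protobuf and Avro

  • Compared with JSON and Java native serialization, they serialize faster and produce smaller output. This suits big data scenarios such as shuffle, RPC, and redundant copies of data for disaster recovery.
  • They are cross-language. If you use Java native serialization, programs written in other languages cannot deserialize the data.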
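
The size claim can be illustrated without any third-party library. The sketch below compares Java native serialization with a hand-rolled encoding in which both sides already agree on the field layout, so only the values are written. This is not Protobuf's actual format, only the same idea of keeping the schema out of the payload; the `User` class and the method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SizeComparison {
    // Hypothetical record type, used only for this comparison.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int id;

        User(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Java native serialization: the stream carries class metadata
    // (class name, field names, types) in addition to the field values.
    static byte[] nativeBytes(User user) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        return bytes.toByteArray();
    }

    // Schema-based encoding: reader and writer agree in advance on
    // "int id, then UTF name", so only the values are written.
    static byte[] compactBytes(User user) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(user.id);    // 4 bytes
            out.writeUTF(user.name);  // 2-byte length prefix + encoded characters
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        User user = new User("Alice", 1);
        System.out.println("native:  " + nativeBytes(user).length + " bytes");
        System.out.println("compact: " + compactBytes(user).length + " bytes");
    }
}
```

For this input the compact form is 11 bytes (4 for the int, 2 for the length prefix, 5 for the name), while the native form is several times larger because it also describes the class.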

Protobuf

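Since it is just a tool, using it is actually quite simple. The steps are:

  1. Define a schema file ending in .proto. This file declares the format of the data to be serialized; the deserializing side only needs a .proto file with the same content to deserialize the data.
  2. Compile the .proto file into Java source code.
  3. Use the Protobuf Java SDK to transfer data.

Downloading the Protobuf compiler

Download the release that matches your system from GitHub. For example, I am on Win64, so I download protoc-3.11.2-win64.zip, unzip it, and add its bin directory to the PATH environment variable.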

Maven pom dependencies

<!-- Core SDK -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>3.10.0</version>
</dependency>
<!-- JSON utilities -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java-util</artifactId>
  <version>3.10.0</version>
</dependency>

Create addressbook.proto

To create .proto files in IntelliJ IDEA you need to install the ProtoBuf Support plugin.

// Declare the proto syntax version as 2; Hadoop 3.x uses proto2
syntax = "proto2";

// Declare the package this proto file belongs to, similar to Java package management.
package tutorial;

// Declare the Java package and the outer class name of the Java file generated from this proto file
option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";

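// Declare the message format
message Person {
// required: the field must be set; building a message without it raises an error
// optional: the field may be omitted
// repeated: the field may appear any number of times, from zero upward
// The numbers 1, 2, ... are unique field numbers that identify each field in the binary encoding, not priorities.
// Numbers 1 to 15 take one byte to encode, so use them for frequently used fields and give rarely used fields larger numbers.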
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}
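
The field numbers in the schema matter because of how Protobuf writes each field on the wire: a key, computed as (field number << 3) | wire type and stored as a base-128 varint, followed by the value. The sketch below reimplements that rule with the standard library only, to show why field numbers 1 to 15 fit in a single key byte. The class and method names are hypothetical and this is not the Protobuf SDK's own code; it handles non-negative int32 values only.

```java
import java.io.ByteArrayOutputStream;

public class WireFormatSketch {
    static final int WIRE_TYPE_VARINT = 0;

    // Base-128 varint: 7 payload bits per byte, with the high bit set
    // on every byte except the last one.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // A field key is the varint encoding of (fieldNumber << 3) | wireType.
    static byte[] encodeKey(int fieldNumber, int wireType) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | wireType);
        return out.toByteArray();
    }

    // Encodes one int32 field holding a non-negative value,
    // such as `required int32 id = 2;` in Person.
    static byte[] encodeInt32Field(int fieldNumber, int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | WIRE_TYPE_VARINT);
        writeVarint(out, value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encodeKey(15, WIRE_TYPE_VARINT).length); // 1
        System.out.println(encodeKey(16, WIRE_TYPE_VARINT).length); // 2
        // Person.id = 150 is encoded as the bytes 10 96 01.
        for (byte b : encodeInt32Field(2, 150)) {
            System.out.printf("%02x ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Field 15 gives the key value 120, which fits in seven bits, while field 16 gives 128 and spills into a second byte.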

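Compile to generate the Java source files

protoc -I={directory containing your .proto file, e.g. E:\ideaproject\test\src\main\proto\} --java_out={Java output directory, e.g. E:\ideaproject\test\src\main\java}  {path to the .proto file, e.g. E:\ideaproject\test\src\main\proto\addressbook.proto}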

API usage

Writing data is fairly simple: just call the API.

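import com.example.tutorial.AddressBookProtos;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;

class Writer {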
    // This function fills in a Person message based on user input.
    static AddressBookProtos.Person PromptForAddress(BufferedReader stdin, PrintStream stdout) throws IOException {
        AddressBookProtos.Person.Builder person = AddressBookProtos.Person.newBuilder();

        stdout.print("Enter person ID: ");
        person.setId(Integer.parseInt(stdin.readLine()));

        stdout.print("Enter name: ");
        person.setName(stdin.readLine());

        stdout.print("Enter email address (blank for none): ");
        String email = stdin.readLine();
        if (email.length() > 0) {
            person.setEmail(email);
        }

        while (true) {
            stdout.print("Enter a phone number (or leave blank to finish): ");
            String number = stdin.readLine();
            if (number.length() == 0) {
                break;
            }

            AddressBookProtos.Person.PhoneNumber.Builder phoneNumber =
                    AddressBookProtos.Person.PhoneNumber.newBuilder().setNumber(number);

            stdout.print("Is this a mobile, home, or work phone? ");
            String type = stdin.readLine();
            if (type.equals("mobile")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.MOBILE);
            } else if (type.equals("home")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.HOME);
            } else if (type.equals("work")) {
                phoneNumber.setType(AddressBookProtos.Person.PhoneType.WORK);
            } else {
                stdout.println("Unknown phone type. Using default.");
            }

            person.addPhones(phoneNumber);
        }

        return person.build();
    }

    // Main function: Reads the entire address book from a file,
    // adds one person based on user input, then writes it back out to the same
    // file.
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: Writer ADDRESS_BOOK_FILE");
            System.exit(-1);
        }

        AddressBookProtos.AddressBook.Builder addressBook = AddressBookProtos.AddressBook.newBuilder();

        // Read the existing address book; try-with-resources closes the stream.
        try (FileInputStream input = new FileInputStream(args[0])) {
            addressBook.mergeFrom(input);
        } catch (FileNotFoundException e) {
            System.out.println(args[0] + ": File not found. Creating a new file.");
        }

        // Add an address.
        addressBook.addPeople(
                PromptForAddress(new BufferedReader(new InputStreamReader(System.in)),
                        System.out));

        // Write the new address book back to disk; the stream is closed even if writing fails.
        try (FileOutputStream output = new FileOutputStream(args[0])) {
            addressBook.build().writeTo(output);
        }
    }
}

Reading data is also fairly simple.

import com.example.tutorial.AddressBookProtos.AddressBook;
import com.example.tutorial.AddressBookProtos.Person;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintStream;

class ListPeople {
  // Iterates through all people in the AddressBook and prints info about them.
  static void Print(AddressBook addressBook) {
    for (Person person: addressBook.getPeopleList()) {
      System.out.println("Person ID: " + person.getId());
      System.out.println(" Name: " + person.getName());
      if (person.hasEmail()) {
        System.out.println(" E-mail address: " + person.getEmail());
      }

      for (Person.PhoneNumber phoneNumber : person.getPhonesList()) {
        switch (phoneNumber.getType()) {
          case MOBILE:
            System.out.print(" Mobile phone #: ");
            break;
          case HOME:
            System.out.print(" Home phone #: ");
            break;
          case WORK:
            System.out.print(" Work phone #: ");
            break;
        }
        System.out.println(phoneNumber.getNumber());
      }
    }
  }

  // Main function:  Reads the entire address book from a file and prints all
  //   the information inside.
  public static void main(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println("Usage: ListPeople ADDRESS_BOOK_FILE");
      System.exit(-1);
    }

    // Read the existing address book; try-with-resources closes the stream.
    AddressBook addressBook;
    try (FileInputStream input = new FileInputStream(args[0])) {
      addressBook = AddressBook.parseFrom(input);
    }

    Print(addressBook);
  }
}

Protobuf is very easy to use; it only requires an extra .proto file that describes the schema of the data.

Avro

Basically the same as Protobuf: you define a schema file and then work with the API.
