[Original] Kryo Serialization

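Kryo is a fast and efficient object graph serialization library for Java. Its goals are speed, efficiency, and ease of use. It is a good fit whenever objects need to be persisted to a file or database, or sent over the network. Kryo can also perform deep and shallow copying/cloning automatically. This copies directly from one object to another, rather than converting the object to bytes and then back into an object.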

It is currently used in the following projects:
KryoNet (NIO networking)
Twitter's Scalding (Scala API for Cascading)
Twitter's Chill (Kryo serializers for Scala)
Apache Fluo (Kryo is default serialization for Fluo Recipes)
Apache Hive (query plan serialization)
Apache Spark (shuffled/cached data serialization)
DataNucleus (JDO/JPA persistence framework)
CloudPelican
Yahoo's S4 (distributed stream computing)
Storm (distributed realtime computation system, in turn used by many others)
Cascalog (Clojure/Java data processing and querying details)
memcached-session-manager (Tomcat high-availability sessions)
Mobility-RPC (RPC enabling distributed applications)
akka-kryo-serialization (Kryo serializers for Akka)
Groupon
Jive
DestroyAllHumans (controls a robot!)
kryo-serializers (additional serializers)

How

A Kryo serialization example:

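import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import java.io.FileInputStream;
import java.io.FileOutputStream;

Kryo kryo = new Kryo();
// ...
// Serialize: write the object to file.bin.
Output output = new Output(new FileOutputStream("file.bin"));
SomeClass someObject = ...
kryo.writeObject(output, someObject);
output.close();
// ...
// Deserialize: read the object back as a new instance.
Input input = new Input(new FileInputStream("file.bin"));
SomeClass restored = kryo.readObject(input, SomeClass.class);
input.close();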

By default, Kryo defines a dedicated serializer for each of the following data types:

boolean        Boolean    byte                       Byte                      char
Character      short      Short                      int                       Integer
long           Long       float                      Float                     double
Double         byte[]     String                     BigInteger                BigDecimal
Collection     Date       Collections.emptyList      Collections.singleton     Map
StringBuilder  TreeMap    Collections.emptyMap       Collections.emptySet      KryoSerializable
StringBuffer   Class      Collections.singletonList  Collections.singletonMap  Currency
Calendar       TimeZone   Enum                       EnumSet

The Kryo constructor sets up the corresponding serializer for each type in the table above.

All other classes use the default serializer, FieldSerializer. You can change the default serializer with the following code:

Kryo kryo = new Kryo();
kryo.setDefaultSerializer(AnotherGenericSerializer.class);

You can also assign a serializer to an individual class, as follows:

Kryo kryo = new Kryo();
kryo.register(SomeClass.class, new SomeSerializer());
kryo.register(AnotherClass.class, new AnotherSerializer());
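
To make the three levels above concrete (a serializer registered for one class, a built-in default for a type, and the overall fallback), here is a minimal sketch of how such a lookup could work. It is a toy model and not Kryo's actual code: ToySerializerLookup, Entry, serializerFor and demo are hypothetical names, serializers are represented only by their names, and the rule "first default entry whose type is assignable from the class wins" is an assumption about the general approach.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of serializer selection (hypothetical, not Kryo's implementation).
// Serializers are represented by name only.
final class ToySerializerLookup {
    // Hypothetical stand-in for one default-serializer entry.
    private static final class Entry {
        final Class<?> type;
        final String serializerName;

        Entry(Class<?> type, String serializerName) {
            this.type = type;
            this.serializerName = serializerName;
        }
    }

    private final List<Entry> defaults = new ArrayList<>();
    private final Map<Class<?>, String> registered = new HashMap<>();
    private String fallback = "FieldSerializer";

    // Adds a default for a type and all of its subtypes; insertion order matters.
    void addDefaultSerializer(Class<?> type, String serializerName) {
        defaults.add(new Entry(type, serializerName));
    }

    // Pins a serializer to exactly one class.
    void register(Class<?> type, String serializerName) {
        registered.put(type, serializerName);
    }

    // Replaces the fallback used when nothing else matches.
    void setDefaultSerializer(String serializerName) {
        fallback = serializerName;
    }

    String serializerFor(Class<?> type) {
        String explicit = registered.get(type);
        if (explicit != null) return explicit;
        // Assumption: the first entry whose type is a supertype of the class wins.
        for (Entry entry : defaults) {
            if (entry.type.isAssignableFrom(type)) return entry.serializerName;
        }
        return fallback;
    }

    // Builds a lookup with a few illustrative entries, specific types first.
    static ToySerializerLookup demo() {
        ToySerializerLookup lookup = new ToySerializerLookup();
        lookup.addDefaultSerializer(TreeMap.class, "TreeMapSerializer");
        lookup.addDefaultSerializer(Map.class, "MapSerializer");
        lookup.addDefaultSerializer(Collection.class, "CollectionSerializer");
        return lookup;
    }
}
```

In this sketch, TreeMap has to be added before Map. If the order were reversed, a TreeMap would match the Map entry first and never reach its more specific serializer.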

Why

1. Let's look at the Kryo 4.0 source code.
The constructor of the Kryo class is:

public Kryo (ClassResolver classResolver, ReferenceResolver referenceResolver, StreamFactory streamFactory) {
        if (classResolver == null) throw new IllegalArgumentException("classResolver cannot be null.");

        this.classResolver = classResolver;
        classResolver.setKryo(this);

        this.streamFactory = streamFactory;
        streamFactory.setKryo(this);

        this.referenceResolver = referenceResolver;
        if (referenceResolver != null) {
            referenceResolver.setKryo(this);
            references = true;
        }
        // private final ArrayList<DefaultSerializerEntry> defaultSerializers = new ArrayList(33);
        // An ArrayList with an initial capacity of 33 holds the (class, serializer) entries.
        addDefaultSerializer(byte[].class, ByteArraySerializer.class);
        addDefaultSerializer(char[].class, CharArraySerializer.class);
        addDefaultSerializer(short[].class, ShortArraySerializer.class);
        addDefaultSerializer(int[].class, IntArraySerializer.class);
        addDefaultSerializer(long[].class, LongArraySerializer.class);
        addDefaultSerializer(float[].class, FloatArraySerializer.class);
        addDefaultSerializer(double[].class, DoubleArraySerializer.class);
        addDefaultSerializer(boolean[].class, BooleanArraySerializer.class);
        addDefaultSerializer(String[].class, StringArraySerializer.class);
        addDefaultSerializer(Object[].class, ObjectArraySerializer.class);
        addDefaultSerializer(KryoSerializable.class, KryoSerializableSerializer.class);
        addDefaultSerializer(BigInteger.class, BigIntegerSerializer.class);
        addDefaultSerializer(BigDecimal.class, BigDecimalSerializer.class);
        addDefaultSerializer(Class.class, ClassSerializer.class);
        addDefaultSerializer(Date.class, DateSerializer.class);
        addDefaultSerializer(Enum.class, EnumSerializer.class);
        addDefaultSerializer(EnumSet.class, EnumSetSerializer.class);
        addDefaultSerializer(Currency.class, CurrencySerializer.class);
        addDefaultSerializer(StringBuffer.class, StringBufferSerializer.class);
        addDefaultSerializer(StringBuilder.class, StringBuilderSerializer.class);
        addDefaultSerializer(Collections.EMPTY_LIST.getClass(), CollectionsEmptyListSerializer.class);
        addDefaultSerializer(Collections.EMPTY_MAP.getClass(), CollectionsEmptyMapSerializer.class);
        addDefaultSerializer(Collections.EMPTY_SET.getClass(), CollectionsEmptySetSerializer.class);
        addDefaultSerializer(Collections.singletonList(null).getClass(), CollectionsSingletonListSerializer.class);
        addDefaultSerializer(Collections.singletonMap(null, null).getClass(), CollectionsSingletonMapSerializer.class);
        addDefaultSerializer(Collections.singleton(null).getClass(), CollectionsSingletonSetSerializer.class);
        addDefaultSerializer(TreeSet.class, TreeSetSerializer.class);
        addDefaultSerializer(Collection.class, CollectionSerializer.class);
        addDefaultSerializer(TreeMap.class, TreeMapSerializer.class);
        addDefaultSerializer(Map.class, MapSerializer.class);
        addDefaultSerializer(TimeZone.class, TimeZoneSerializer.class);
        addDefaultSerializer(Calendar.class, CalendarSerializer.class);
        addDefaultSerializer(Locale.class, LocaleSerializer.class);
        addDefaultSerializer(Charset.class, CharsetSerializer.class);
        addDefaultSerializer(URL.class, URLSerializer.class);
        OptionalSerializers.addDefaultSerializers(this);
        TimeSerializers.addDefaultSerializers(this);
        lowPriorityDefaultSerializerCount = defaultSerializers.size();

        // Primitives and string. Primitive wrappers automatically use the same registration as primitives.
        register(int.class, new IntSerializer());
        register(String.class, new StringSerializer());
        register(float.class, new FloatSerializer());
        register(boolean.class, new BooleanSerializer());
        register(byte.class, new ByteSerializer());
        register(char.class, new CharSerializer());
        register(short.class, new ShortSerializer());
        register(long.class, new LongSerializer());
        register(double.class, new DoubleSerializer());
        register(void.class, new VoidSerializer());
    }
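  • Kryo: can be viewed as a Map from classes to their serializers.
  • ClassResolver: handles class registration, writes class identifiers to bytes during serialization, and reads class identifiers back from bytes.
  • ReferenceResolver interface: when references are enabled, it tracks objects that have already been read or written, provides an ID for each written object, and looks up read objects by ID.
    It has two implementations: ListReferenceResolver and MapReferenceResolver.
  • StreamFactory interface: provides input and output streams based on system configuration.
    It has two implementations: DefaultStreamFactory and FastestStreamFactory.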
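
The reference tracking job can be sketched as follows. This is only an illustration under my own assumptions, not Kryo's implementation: ToyReferenceTracker and all of its method names are hypothetical, and an IdentityHashMap is used so that objects are matched by identity rather than by equals().

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy reference tracker (hypothetical): hands out IDs for written objects
// and resolves IDs back to objects on the reading side.
final class ToyReferenceTracker {
    // Identity-based: two equal but distinct objects get different IDs.
    private final Map<Object, Integer> writtenIds = new IdentityHashMap<>();
    private final List<Object> readObjects = new ArrayList<>();

    // Returns the ID of an already written object, or -1 if it has not been seen.
    int writtenId(Object object) {
        Integer id = writtenIds.get(object);
        return id == null ? -1 : id;
    }

    // Records a newly written object and returns its ID.
    int addWritten(Object object) {
        int id = writtenIds.size();
        writtenIds.put(object, id);
        return id;
    }

    // Records an object as it is read; its ID is its position in read order.
    int addRead(Object object) {
        readObjects.add(object);
        return readObjects.size() - 1;
    }

    // Looks up a previously read object by ID.
    Object readObject(int id) {
        return readObjects.get(id);
    }

    // Simulates writing a sequence: "new:<id>" on first sight, "ref:<id>" afterwards.
    static List<String> describeWrites(Object... objects) {
        ToyReferenceTracker tracker = new ToyReferenceTracker();
        List<String> out = new ArrayList<>();
        for (Object o : objects) {
            int id = tracker.writtenId(o);
            if (id == -1) out.add("new:" + tracker.addWritten(o));
            else out.add("ref:" + id);
        }
        return out;
    }
}
```

Writing the same instance twice produces a full write followed by a short back-reference, which is also what lets an object graph with cycles be serialized without looping forever.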

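2. Why does a class have to be registered with Kryo before it is used?
Writing the class name during serialization is inefficient, so an ID is written in place of the class name. The class only needs to be registered before it is used. The code is as follows: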

// Registers the class using the next available integer ID. If the class is already registered, the existing Registration is returned.
public Registration register (Class type) {
        Registration registration = classResolver.getRegistration(type);
        if (registration != null) return registration;
        return register(type, getDefaultSerializer(type));
    }

// Essentially the same as above, except that the caller supplies the ID.
public Registration register (Class type, int id) {
        Registration registration = classResolver.getRegistration(type);
        if (registration != null) return registration;
        return register(type, getDefaultSerializer(type), id);
    }
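
To get a feel for the saving, here is a rough sketch that compares the bytes needed to identify a class by name with the bytes needed for a small integer ID. It is a toy under stated assumptions: ToyClassRegistry, bytesForName and bytesForId are hypothetical names, the name is measured as plain UTF-8, and the ID is measured as a 7-bits-per-byte variable-length integer. It makes no claim about Kryo's exact wire format.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy registry (hypothetical): assigns the next free integer ID to each class.
final class ToyClassRegistry {
    private final Map<Class<?>, Integer> ids = new HashMap<>();
    private int nextId = 0;

    // Returns the existing ID if the class is already registered,
    // otherwise assigns the next available one.
    int register(Class<?> type) {
        Integer existing = ids.get(type);
        if (existing != null) return existing;
        int id = nextId++;
        ids.put(type, id);
        return id;
    }

    // Bytes needed to identify the class by its fully qualified name (UTF-8).
    static int bytesForName(Class<?> type) {
        return type.getName().getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed for the ID as a variable-length integer (7 payload bits per byte).
    static int bytesForId(int id) {
        int count = 1;
        while ((id >>>= 7) != 0) count++;
        return count;
    }
}
```

For java.util.ArrayList the name alone takes 19 bytes, while any ID below 128 fits in one byte under this encoding. Because IDs are handed out in registration order, the writing side and the reading side have to register the same classes in the same order.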


QA

The official documentation contains the following passage:

Using Unsafe-based IO may result in a quite significant performance boost (sometimes up-to an order of magnitude), depending on your application. In particular, it helps a lot when serializing large primitive arrays as part of your object graphs.

The answer given is:

Unsafe-based IO uses methods from sun.misc.Unsafe for reading and writing from/to memory. Many of those methods map almost 1:1 to processor instructions. Moreover, reading/writing of arrays of native types is executed in bulk instead of doing it element-by-element as it is usually done by Kryo. Together, these features often provide a significant performance boost.
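
The contrast between element-by-element and bulk array writes can be illustrated without touching sun.misc.Unsafe. The sketch below swaps in java.nio.ByteBuffer as a stand-in: one method writes an int[] with one call per element, the other with a single bulk put. BulkWriteSketch and its method names are hypothetical, the fixed-width native-order layout is my own choice for the illustration, and the sketch does not measure performance or reproduce Kryo's byte format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Contrasts element-by-element and bulk copying of an int[] into bytes
// (ByteBuffer is used here as a stand-in for Unsafe-based IO).
final class BulkWriteSketch {
    // Writes each int separately: one call per element.
    static byte[] writeElementwise(int[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.nativeOrder());
        for (int value : values) buffer.putInt(value);
        return buffer.array();
    }

    // Writes the whole array with a single bulk put on an int view of the buffer.
    static byte[] writeBulk(int[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.nativeOrder());
        buffer.asIntBuffer().put(values);
        return buffer.array();
    }

    // Reads the whole array back with a single bulk get.
    static int[] readBulk(byte[] bytes) {
        int[] values = new int[bytes.length / Integer.BYTES];
        ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder()).asIntBuffer().get(values);
        return values;
    }
}
```

Both write methods produce identical bytes. The difference is that the bulk version hands the whole array over in one operation, which is the property the quoted answer credits for the speedup on large primitive arrays.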
