An Analysis of ArrayList Serialization in Java

Date: 2019-11-09


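1. Introduction

Java serialization means saving a Java object in some form, for example by storing it on disk or sending it over a network. Deserialization is the reverse process: rebuilding the object from that saved form.

Java requires that any object to be serialized implement the java.io.Serializable interface, and ArrayList, the class analyzed here, does implement it.

Reading the ArrayList source shows that all of its data is stored in the elementData array, which is declared as: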

transient Object[] elementData;

Note that the elementData array is marked transient.

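(1) What the transient keyword does

As we know, an object can be serialized as soon as its class implements the Serializable interface. This model is convenient for developers: we do not have to care about the details of the serialization process, because once the class implements Serializable, all of its non-static, non-transient fields are serialized automatically. (Methods belong to the class definition, not to the object's state, so they are never written to the stream.)

In practice, however, we often need some fields of a class serialized and others not. For example, a user object may hold sensitive data (a password, a bank card number, and so on) that for security reasons should not be transmitted during network operations (which mostly involve serialization; the same applies to a local serialized cache). The variables holding such data can be marked transient. In other words, such a field lives only in the caller's memory and is never persisted to disk.

In short, transient is a convenience: implement Serializable, put transient in front of the fields that should not be serialized, and those fields will not be written to the destination when the object is serialized.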
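
The effect of transient can be checked with a small round trip. The sketch below is only an illustration: the User class, its fields, and the roundTrip helper are hypothetical names made up for this example, and the object is serialized to an in-memory byte array rather than to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Hypothetical class used only for this illustration.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;                // written by default serialization
        transient String password;  // skipped by default serialization

        User(String name, String password) {
            this.name = name;
            this.password = password;
        }
    }

    // Serializes the object to a byte array and reads it straight back.
    static User roundTrip(User user) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (User) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = roundTrip(new User("alice", "secret"));
        System.out.println(copy.name);      // alice
        System.out.println(copy.password);  // null
    }
}
```

The transient field comes back as null because it was never written to the stream, so deserialization leaves it at its default value.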

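For more detail, see the article "Notes on Using the Java transient Keyword" (Java transient关键字使用小记).

Since elementData is marked transient, it should not, in principle, be serialized. So how does ArrayList manage to be serialized at all?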

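2. The serialization workflow

A class enables serialization by implementing the java.io.Serializable interface. To serialize an object you need an object output stream, and to read it back you need an object input stream: the output stream saves the object's state, and the input stream restores it.

Classes that need special handling during serialization and deserialization must implement special methods with exactly these signatures: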

private void writeObject(java.io.ObjectOutputStream out) throws IOException

private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException

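(1) Steps for serializing an object

a) Writing

First, create an OutputStream;
then create an ObjectOutputStream, passing it the OutputStream;
finally, call the ObjectOutputStream's writeObject() method to write the object's state to the OutputStream.

b) Reading

First, create an InputStream;
then create an ObjectInputStream, passing it the InputStream;
finally, call the ObjectInputStream's readObject() method to read the object's state from the InputStream.

For example: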

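import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Box implements Serializable {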
    private static final long serialVersionUID = -3450064362986273896L;
    
    private int width;
    private int height;
    
    public static void main(String[] args) {
        Box myBox=new Box();
        myBox.setWidth(50);
        myBox.setHeight(30);
        try {
            // Write: try-with-resources closes the stream even if writing fails
            try (ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream("F:\\foo.ser"))) {
                os.writeObject(myBox);
            }
            // Read the object back from the same file
            try (ObjectInputStream oi = new ObjectInputStream(new FileInputStream("F:\\foo.ser"))) {
                Box box = (Box) oi.readObject();
                System.out.println(box.height + "," + box.width);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    
    public int getWidth() {
        return width;
    }
    public void setWidth(int width) {
        this.width = width;
    }
    public int getHeight() {
        return height;
    }
    public void setHeight(int height) {
        this.height = height;
    }
}

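3. How ArrayList handles serialization

(1) Serialization

From the workflow above, an object is serialized by calling writeObject() on an ObjectOutputStream to write its state, after which the state can be read back with readObject().

So could ArrayList itself call writeObject() on the ObjectOutputStream to write the contents of elementData to the output stream?

Here is the source: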

private void writeObject(java.io.ObjectOutputStream s) throws java.io.IOException
{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();
    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);
    // Write out all elements in the proper order.
    for (int i = 0; i < size; i++)
    {
        s.writeObject(elementData[i]);
    }
    if (modCount != expectedModCount)
    {
        throw new ConcurrentModificationException();
    }
}

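Although elementData is marked transient and is therefore skipped by default serialization, we can still take each element out of it and write that element to the output stream. The transient modifier applies to the field, not to the objects the array refers to.

// Snippet 1; it is equivalent to snippet 2
s.writeObject(elementData[i]);  // the argument elementData[i] is passed by value to the parameter of s.writeObject()
// Snippet 2
Object temp = elementData[i];   // temp is a local variable, not a transient field
s.writeObject(temp);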
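
To see that this works in practice, here is a minimal sketch that serializes an ArrayList to memory and reads it back. The class name ArrayListRoundTrip and the roundTrip helper are made up for this example; the initial capacity of 100 is an arbitrary choice that leaves most of the backing array unused.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class ArrayListRoundTrip {
    // Serializes the list to a byte array and deserializes it again.
    @SuppressWarnings("unchecked")
    static List<String> roundTrip(ArrayList<String> list)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (List<String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> list = new ArrayList<>(100);  // capacity far above size
        list.add("a");
        list.add("b");
        list.add("c");

        List<String> copy = roundTrip(list);
        System.out.println(copy);               // [a, b, c]
        System.out.println(copy.equals(list));  // true
    }
}
```

The elements survive the round trip even though the field that holds them is transient, because ArrayList writes them itself.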

(2) Deserialization

ArrayList handles deserialization on the same principle. Here is the source:

private void readObject(java.io.ObjectInputStream s) throws java.io.IOException, ClassNotFoundException
{
    elementData = EMPTY_ELEMENTDATA;
    // Read in size, and any hidden stuff
    s.defaultReadObject();
    // Read in capacity
    s.readInt(); // ignored
    if (size > 0)
    {
        // be like clone(), allocate array based upon size not capacity
        ensureCapacityInternal(size);
        Object[] a = elementData;
        // Read in all elements in the proper order.
        for (int i = 0; i < size; i++)
        {
            a[i] = s.readObject();
        }
    }
}

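This source raises another question: these methods are declared private, so when do they get called?

(3) Invocation

If a class not only implements the Serializable interface but also defines readObject(ObjectInputStream in) and writeObject(ObjectOutputStream out), serialization and deserialization proceed as follows:

ObjectOutputStream calls the class's writeObject method to serialize it, and ObjectInputStream calls the corresponding readObject method to deserialize it.

Is that really what happens? Let's run a small experiment to check.
Experiment 1: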

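import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TestSerialization implements Serializable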
{
    private transient int    num;

    public int getNum()
    {
        return num;
    }

    public void setNum(int num)
    {
        this.num = num;
    }

    private void writeObject(java.io.ObjectOutputStream s)
            throws java.io.IOException
    {
        s.defaultWriteObject();
        s.writeObject(num);
        System.out.println("writeObject of "+this.getClass().getName());
    }

    private void readObject(java.io.ObjectInputStream s)
            throws java.io.IOException, ClassNotFoundException
    {
        s.defaultReadObject();
        num = (Integer) s.readObject();
        System.out.println("readObject of "+this.getClass().getName());
    }

    public static void main(String[] args)
    {
        TestSerialization test = new TestSerialization();
        test.setNum(10);
        System.out.println("Value before serialization: " + test.getNum());
        // Write; try-with-resources flushes and closes the stream
        try (ObjectOutputStream outputStream = new ObjectOutputStream(
                new FileOutputStream("D:\\test.tmp")))
        {
            outputStream.writeObject(test);
        } catch (IOException e)
        {
            e.printStackTrace();
        }
        // Read
        try (ObjectInputStream oInputStream = new ObjectInputStream(
                new FileInputStream("D:\\test.tmp")))
        {
            TestSerialization aTest = (TestSerialization) oInputStream.readObject();
            System.out.println("Value read after deserialization: " + aTest.getNum());
        } catch (IOException | ClassNotFoundException e)
        {
            e.printStackTrace();
        }
    }
}

Output:

Value before serialization: 10
writeObject of TestSerialization
readObject of TestSerialization
Value read after deserialization: 10

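The experiment confirms it:

ObjectOutputStream calls the class's writeObject method to serialize it, and ObjectInputStream calls the corresponding readObject method to deserialize it.
So how does ObjectOutputStream know whether a class defines a writeObject method, and how does it call that method automatically?

The answer: through reflection.

A partial explanation:

What does ObjectOutputStream's writeObject do? From the ArrayList object passed in, it obtains the Class and wraps it in an ObjectStreamClass. Then, in the writeSerialData method, it calls ObjectStreamClass's invokeWriteObject method, whose key line is: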

writeObjectMethod.invoke(obj, new Object[]{ out });

The instance field writeObjectMethod is assigned as follows:

writeObjectMethod = getPrivateMethod(cl, "writeObject", 
                new Class[] { ObjectOutputStream.class }, 
                Void.TYPE);

private static Method getPrivateMethod(Class cl, String name,
        Class[] argTypes, Class returnType)
{
    try
    {
        Method meth = cl.getDeclaredMethod(name, argTypes);
        // ***** access the object's private method via reflection
        meth.setAccessible(true);
        int mods = meth.getModifiers();
        return ((meth.getReturnType() == returnType)
                && ((mods & Modifier.STATIC) == 0) && ((mods & Modifier.PRIVATE) != 0)) ? meth
                : null;
    } catch (NoSuchMethodException ex)
    {
        return null;
    }
}
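
The same lookup can be reproduced outside the JDK. The following is a simplified sketch of the idea, not the JDK's actual code: WithHook, WithoutHook, and findWriteObject are hypothetical names introduced for this example, and the method only looks the hook up and checks its modifiers without invoking it.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;

public class HookLookupDemo {
    // Hypothetical class that defines the private serialization hook.
    static class WithHook implements Serializable {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
        }
    }

    // Hypothetical class that relies on default serialization only.
    static class WithoutHook implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Returns the hook if cl declares a private, non-static, void
    // writeObject(ObjectOutputStream); otherwise returns null.
    static Method findWriteObject(Class<?> cl) {
        try {
            Method m = cl.getDeclaredMethod("writeObject", ObjectOutputStream.class);
            int mods = m.getModifiers();
            boolean matches = m.getReturnType() == Void.TYPE
                    && !Modifier.isStatic(mods)
                    && Modifier.isPrivate(mods);
            return matches ? m : null;
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(findWriteObject(WithHook.class) != null);     // true
        System.out.println(findWriteObject(WithoutHook.class) != null);  // false
        // ArrayList declares the private hook shown earlier in this article.
        System.out.println(findWriteObject(ArrayList.class) != null);
    }
}
```

Because getDeclaredMethod also returns private methods, the stream classes can discover the hook even though no ordinary caller could invoke it.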

While running the experiment we ran into a question: why are the statements s.defaultWriteObject(); and s.defaultReadObject(); needed at the start of writeObject(ObjectOutputStream o) and readObject(ObjectInputStream o)?

They serve two purposes:

(1) They write and read, respectively, all non-static, non-transient fields of the class.

(2) They also help with backward and forward compatibility. If you later add a non-transient field to the class and an older version of the class deserializes the new data, defaultReadObject() ignores the newly added field. Likewise, if a newer version of the class deserializes an object serialized by the old version, the new non-transient field takes its default value (0, false, or null). Both cases assume the two versions share the same serialVersionUID; otherwise deserialization fails with an InvalidClassException.
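
The first point is easier to see in a class that has both kinds of fields. In this sketch the Account class and its fields are hypothetical; the non-transient owner field is handled by the default calls, while the transient balance field is written and read by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DefaultFieldsDemo {
    // Hypothetical class used only for this illustration.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;           // non-transient: handled by the default calls
        transient int balance;  // transient: must be handled by hand

        Account(String owner, int balance) {
            this.owner = owner;
            this.balance = balance;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();  // writes owner
            out.writeInt(balance);     // writes the transient field manually
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();    // restores owner
            balance = in.readInt();    // must mirror the order used when writing
        }
    }

    // Serializes the account to a byte array and reads it straight back.
    static Account roundTrip(Account account) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("bob", 42));
        System.out.println(copy.owner + " " + copy.balance);  // bob 42
    }
}
```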

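4. Why is elementData marked transient?

If ArrayList's contents have to be serialized anyway (that is, elementData has to be serialized), why mark elementData transient?

Recall ArrayList's automatic growth. The elementData array acts as a container that is enlarged when it runs out of room, so its capacity is usually greater than or equal to the number of elements the ArrayList actually holds.

For example, with 8 elements actually stored, the capacity of elementData may be 8 x 1.5 = 12. Serializing the whole elementData array would then waste space on 4 unused slots, and with a very large number of elements this waste becomes costly.

So the designers of ArrayList made elementData transient and serialize it by hand in the writeObject method, writing only the elements actually stored rather than the whole array.

Here is the source: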

// Write out all elements in the proper order.
for (int i=0; i<size; i++) 
{
    s.writeObject(elementData[i]);
}

The source shows that the loop condition is i<size rather than i<elementData.length, which means only the elements actually stored are serialized, not the whole array.
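
The saving can be measured with a rough comparison. The sketch below is an illustration under assumed numbers: the capacity of 1000 and the count of 8 are arbitrary, and a plain Object[] of the same capacity stands in for what serializing the whole backing array would cost. SerializedSizeDemo and serializedSize are names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class SerializedSizeDemo {
    // Returns the number of bytes the object occupies once serialized.
    static int serializedSize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        int capacity = 1000;  // arbitrary capacity for the comparison
        int count = 8;        // elements actually stored

        ArrayList<Integer> list = new ArrayList<>(capacity);
        Object[] wholeArray = new Object[capacity];  // stand-in for a full backing array
        for (int i = 0; i < count; i++) {
            list.add(i);
            wholeArray[i] = i;
        }

        int listBytes = serializedSize(list);
        int arrayBytes = serializedSize(wholeArray);
        System.out.println("ArrayList: " + listBytes + " bytes");
        System.out.println("Object[" + capacity + "]: " + arrayBytes + " bytes");
        System.out.println(listBytes < arrayBytes);  // true
    }
}
```

The exact byte counts depend on the JDK, but the full array has to record every unused slot, while the ArrayList writes only its 8 elements.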
