Java Collections Source Code Analysis (Part 1): ArrayList

Date: 2019-11-13


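Preface

  Earlier posts on collections only covered how to use them. To understand collections more deeply we need to read their source code, and that is what this series sets out to do.

  How should we read the source of a class? The approach I recommend is:

    1) Look at the inheritance structure

      See where the class sits in the hierarchy so that you have a rough mental picture of it.

    2) Look at the constructors

      See what each constructor does, and follow the methods it calls.

    3) Look at the commonly used methods

      As with the constructors, trace how each method implements its behaviour.

  Note: when reading source code, questions such as why a class is designed a certain way, or why it has a particular inheritance relationship, come down to design patterns. Knowing the common design patterns makes a class much easier to understand.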

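1. Introduction to ArrayList

1.1 ArrayList overview

  1) ArrayList is an indexed sequence whose size can grow and shrink dynamically. It is a List implementation backed by an array.

  2) The class wraps an Object[] array that is reallocated as needed. Every ArrayList has a capacity, which is the length of the wrapped Object[] array. As elements are added, the capacity grows automatically.

    If you are going to add a large number of elements, call ensureCapacity to raise the capacity once up front. This reduces the number of reallocations and improves performance.

  3) ArrayList is used much like Vector, but Vector is an older collection with a number of drawbacks and is not recommended.

    The other difference is thread safety: ArrayList is not thread-safe, so when several threads access the same ArrayList the program has to synchronize access itself, whereas Vector is thread-safe.

  4) Relationship between ArrayList and Collection: ArrayList implements List<E>, and List<E> extends Collection<E>.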
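  Items 2) and 4) can be tried out directly. The sketch below is only an illustration: the class name EnsureCapacityDemo and the helper fill are made up for this example, while ensureCapacity, add and size are the real java.util.ArrayList methods.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class EnsureCapacityDemo {
    // Fills a list with n integers after reserving room for all of them up front.
    static List<Integer> fill(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n);          // reserve once instead of regrowing repeatedly
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = fill(100_000);
        // An ArrayList is a List, and every List is a Collection.
        Collection<Integer> asCollection = list;
        System.out.println(asCollection.size());   // 100000
    }
}
```

  ensureCapacity is declared on ArrayList itself, not on List, which is why the local variable inside fill uses the concrete type.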

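1.2 The data structure of ArrayList

  When analysing a class, its data structure is usually the heart of it. Once you understand the underlying data structure you understand the idea behind the implementation; the details can then be examined one by one.

  The data structure of ArrayList is an array.

  Explanation: the underlying data structure is an array whose element type is Object, so it can hold objects of any type. Every operation on an ArrayList instance is ultimately an operation on this array.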
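  To make the idea of "an array plus a size counter" concrete, here is a deliberately tiny sketch. It is not the JDK code: TinyArrayList, its starting capacity of 2 and its doubling growth policy are all assumptions chosen to keep the example short.

```java
import java.util.Arrays;

class TinyArrayList {
    private Object[] elementData = new Object[2]; // backing array (tiny on purpose)
    private int size;                             // number of elements actually stored

    void add(Object e) {
        if (size == elementData.length) {
            // Out of room: copy into a larger array.
            // Doubling is an assumption of this sketch, not the JDK policy.
            elementData = Arrays.copyOf(elementData, elementData.length * 2);
        }
        elementData[size++] = e;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return elementData[index];
    }

    int size() { return size; }

    int capacity() { return elementData.length; }
}
```

  The distinction between size (elements stored) and capacity (length of the backing array) is the one the rest of the analysis relies on.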

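2. ArrayList source code analysis

2.1 Inheritance structure and hierarchy

  Let's look at the inheritance structure of ArrayList:

                ArrayList extends AbstractList

                AbstractList extends AbstractCollection

  Every class inherits from Object, so the full chain is Object, AbstractCollection, AbstractList, ArrayList.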
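  What a subclass gains from this chain can be seen with a small experiment. In the sketch below, RangeList is a hypothetical read-only list invented for illustration; it implements only get and size, and everything else it uses is inherited from AbstractList and AbstractCollection.

```java
import java.util.AbstractList;
import java.util.List;

class RangeList extends AbstractList<Integer> {
    private final int start;
    private final int count;

    RangeList(int start, int count) {
        this.start = start;
        this.count = count;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return start + index;
    }

    @Override
    public int size() {
        return count;
    }

    public static void main(String[] args) {
        List<Integer> r = new RangeList(5, 3);
        System.out.println(r);              // [5, 6, 7]  toString() comes from AbstractCollection
        System.out.println(r.indexOf(6));   // 1          indexOf() comes from AbstractList
        System.out.println(r.contains(7));  // true       contains() comes from AbstractCollection
    }
}
```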

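  Analysis:

    1) Why does ArrayList extend AbstractList, with AbstractList implementing List<E>, instead of ArrayList implementing List<E> directly?

      The idea is this. Before Java 8 an interface could contain only abstract methods, while an abstract class can contain both abstract methods and concrete implementations. AbstractList takes advantage of that by implementing the general-purpose methods of the interface.

      A concrete class such as ArrayList extends AbstractList, inherits those general-purpose methods, and implements only what is specific to itself. The methods shared by the concrete classes at the bottom of the hierarchy are pulled up and implemented once,

      which keeps the code concise and reduces duplication. So when a class has an abstract class above it, this is usually the reason.

    2) Which interfaces does ArrayList implement?

      List<E>: this raises a question. ArrayList's parent AbstractList already implements List<E>, so why does ArrayList declare it again?

            I could not work this out, so I looked around. Some people say it is for readability, so that readers can see it at a glance. Opinions differ and none of them convinced me, but I found an answer on Stack Overflow that is rather interesting.

            The link is http://stackoverflow.com/questions/2165204/why-does-linkedhashsete-extend-hashsete-and-implement-sete where Josh, the author of the collections framework, is quoted.

            It was actually a mistake: when he wrote the code he thought it would be useful, but it turned out not to be. Because it is harmless, it has stayed to this day.

      RandomAccess: a marker interface. According to the API documentation it signals support for fast random access, and it matters for efficiency: for a class that implements it, such as ArrayList, an ordinary indexed for loop is the faster way to iterate.

                For a class that does not implement it, such as LinkedList, iterating with an Iterator is faster. The marker exists only to tell us which way of reading the data performs better.

      Cloneable: a class must implement this interface for Object.clone() to work on its instances.

      Serializable: implementing this interface marks the class as serializable. What is serialization? Put simply, an object can be turned into a byte stream for transmission and later rebuilt from that byte stream.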
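  Two of the points above can be checked in code. This is a sketch for illustration only: InterfaceDemo, sum and declaresList are names invented here; Class.getInterfaces() and the RandomAccess marker are the real JDK APIs.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class InterfaceDemo {
    // Sums a list, choosing the traversal style from the RandomAccess marker.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {   // indexed access is cheap here
                total += list.get(i);
            }
        } else {
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                total += it.next();                   // sequential access avoids repeated walks
            }
        }
        return total;
    }

    // True if the class itself lists List in its implements clause.
    static boolean declaresList(Class<?> type) {
        return Arrays.asList(type.getInterfaces()).contains(List.class);
    }

    public static void main(String[] args) {
        System.out.println(sum(new ArrayList<>(Arrays.asList(1, 2, 3))));   // 6
        System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3))));  // 6
        System.out.println(declaresList(ArrayList.class));     // true: declared again on ArrayList
        System.out.println(declaresList(AbstractList.class));  // true: already declared on the parent
    }
}
```

  Both branches of sum return the same result; the marker only influences which traversal is likely to be cheaper.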

2.2 Fields of the class

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // serialization version ID
    private static final long serialVersionUID = 8683452581122892189L;
    // default initial capacity
    private static final int DEFAULT_CAPACITY = 10;
    // shared empty array
    private static final Object[] EMPTY_ELEMENTDATA = {};
    // shared empty array for default-sized instances
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    // backing array that stores the elements
    transient Object[] elementData;
    // number of elements actually stored; defaults to 0
    private int size;
    // maximum array size to allocate
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
}

2.3 Constructors

  ArrayList has three constructors:

  1) No-argument constructor

/**
    * Constructs an empty list with an initial capacity of ten.
    */
    // The capacity of ten is applied lazily: the backing array starts out empty and is sized on the first add.
    // ArrayList stores its data in an array named elementData, declared as: transient Object[] elementData;
    public ArrayList() {
        super();        // calls the no-arg constructor of the parent class, which is empty
        this.elementData = EMPTY_ELEMENTDATA; // EMPTY_ELEMENTDATA is an empty Object[]; elementData starts out pointing at it. The default capacity of 10 is assigned later, as explained under add().
    }

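   Note: the constructor shown here matches the JDK 7 source. In JDK 8 the no-argument constructor assigns DEFAULTCAPACITY_EMPTY_ELEMENTDATA instead, and the capacity check tests for that array.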

  2) Constructor with an initial capacity

/**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        super(); // the parent's no-arg constructor, which is empty
        if (initialCapacity < 0)    // a negative capacity is rejected with the exception below
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity]; // allocate the backing array with the requested capacity
    }

  3) Constructor taking a collection (less commonly used)

// This constructor is used less often; an example makes its purpose clear.
    /*
        Student extends Person
        ArrayList<Person>: Person is the type argument here.
        Suppose I also have a Collection<Student>. Because Student extends Person, this constructor lets me build an ArrayList<Person> from that Collection<Student>.
    */
     public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();    // copy the collection into an array
        size = elementData.length;   // number of elements in the array
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class) // toArray() implementations differ between collections, so if the result is not exactly an Object[], copy it into a real Object[]
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

  Summary: each ArrayList constructor does one thing: it initializes the container that stores the data, which is simply an array named elementData.
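  The three constructors can be exercised side by side. The sketch below is illustrative: ConstructorDemo, Person, Student and toPeople are hypothetical names that follow the Student/Person example in the comment above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ConstructorDemo {
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Student extends Person {
        Student(String name) { super(name); }
    }

    // Copies a collection of students into a list typed by the parent class.
    static List<Person> toPeople(Collection<Student> students) {
        // Accepted because the parameter type is Collection<? extends E> with E = Person.
        return new ArrayList<Person>(students);
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();     // no-arg constructor
        List<String> sized = new ArrayList<>(32);   // initial capacity 32, size still 0
        List<Person> people = toPeople(Arrays.asList(new Student("Ann"), new Student("Bob")));
        System.out.println(empty.size() + " " + sized.size() + " " + people.size());  // 0 0 2
    }
}
```

  Note that the capacity passed to ArrayList(int) does not change size(); it only sizes the backing array.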

2.4 Core methods

  2.4.1 add() methods (there are four)

    1) boolean add(E): appends the element to the end of the list

/**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {    
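    // Make sure the backing array can hold one more element. size is the number of stored elements, so size + 1 is the capacity needed; the call below checks whether elementData.length is large enough.
        ensureCapacityInternal(size + 1);  // Increments modCount!!
     // Store e in the first free slot, then increment size.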
        elementData[size++] = e;
        return true;
    }

    Analysis:

      ensureCapacityInternal(xxx): makes sure the internal capacity is sufficient

private void ensureCapacityInternal(int minCapacity) {
        if (elementData == EMPTY_ELEMENTDATA) { // is elementData still the shared empty array, i.e. of length 0?
    // If so, minCapacity = size + 1, which is 1. An empty array cannot store anything, so minCapacity is raised to the default of 10. Note that elementData itself has not been resized yet.
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
    // The step above only settled on a value for minCapacity; this call actually checks whether elementData is large enough.
        ensureExplicitCapacity(minCapacity);
    }

      ensureExplicitCapacity(xxx):

private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
// If minCapacity is greater than the current length of elementData, the array is too small and must grow. What exactly is minCapacity? There are two cases.

/* Case 1: elementData is still the empty array. On the first add, minCapacity = size + 1 = 1; the previous method (ensureCapacityInternal) detects the empty array and raises minCapacity to 10. Up to this point elementData has not been resized.
   Case 2: elementData is no longer empty. On add, minCapacity = size + 1, the number of elements the list will hold after the insertion. It is compared with elementData.length; if the array is too short it must be enlarged, otherwise the new element would not fit. */


        if (minCapacity - elementData.length > 0)
    // this is the call that lets ArrayList grow automatically
            grow(minCapacity);
    }

      grow(xxx): the core method of ArrayList, where the array is actually enlarged.

private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;  // capacity before growing
        int newCapacity = oldCapacity + (oldCapacity >> 1); // newCapacity is 1.5 times oldCapacity
        if (newCapacity - minCapacity < 0) // true when 1.5x is still too small, for example for the empty array: length = 0, so oldCapacity = 0 and newCapacity = 0. This is where elementData really gets its initial capacity of 10; everything before was preparation.
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0) // if newCapacity exceeds the maximum array size, let hugeCapacity pick the largest permissible value
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
    // the new capacity is settled, so copy the elements into an array of that length
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

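     hugeCapacity():

// The helper used above; it simply picks the upper bound.
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
// If minCapacity itself exceeds MAX_ARRAY_SIZE, return Integer.MAX_VALUE; otherwise return MAX_ARRAY_SIZE. newCapacity is 1.5 times the old capacity and may overshoot, so the decision is based on minCapacity instead.
// Integer.MAX_VALUE: 2147483647   MAX_ARRAY_SIZE: 2147483639. The capacity can therefore never exceed the first value; a request beyond it overflows to a negative number and is rejected above. In effect ArrayList has two layers of protection.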
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

    2) void add(int, E): adds an element at a given position, i.e. inserts it

public void add(int index, E element) {
        rangeCheckForAdd(index); // check that the insertion position is valid

// same as in add(E); see the analysis above
        ensureCapacityInternal(size + 1);  // Increments modCount!!
// shift the elements from index onward one position to the right to make room
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
// store the element at the target position
        elementData[index] = element;
        size++; // size grows by 1
    }

    Analysis:

      rangeCheckForAdd(index)

    private void rangeCheckForAdd(int index) {
        if (index > size || index < 0)   // the insertion position must lie between 0 and size, inclusive
// otherwise throw an out-of-bounds exception
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

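      System.arraycopy(...): moves every element of elementData from the insertion position onward one slot to the right. From the API documentation:

public static void arraycopy(Object src,
int srcPos,
Object dest,
int destPos,
int length)
src: the source array
srcPos: the starting position in the source array
dest: the destination array
destPos: the starting position in the destination array
length: the number of elements to copy, counted from the starting position

// This paragraph explains the basic usage: elements of src from srcPos through srcPos+length-1 are copied into dest at positions destPos through destPos+length-1.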
Copies an array from the specified source array, beginning at the specified position, to the specified position of the destination array. A subsequence of array components are copied from
the source array referenced by src to the destination array referenced by dest. The number of components copied is equal to the length argument. The components at positions srcPos through srcPos+length-1
in the source array are copied into positions destPos through destPos+length-1, respectively, of the destination array.

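// One special case: if src and dest are the same array, the copy behaves as if the elements were first copied to a temporary array and then copied from there to the destination, so overlapping ranges are safe.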
If the src and dest arguments refer to the same array object, then the copying is performed as if the components at positions srcPos through srcPos+length-1 were first copied to
a temporary array with length components and then the contents of the temporary array were copied into positions destPos through destPos+length-1 of the destination array.

// This long passage lists what can go wrong: the causes of NullPointerException, ArrayStoreException and IndexOutOfBoundsException.
If dest is null, then a NullPointerException is thrown.

If src is null, then a NullPointerException is thrown and the destination array is not modified.

Otherwise, if any of the following is true, an ArrayStoreException is thrown and the destination is not modified:

The src argument refers to an object that is not an array.
The dest argument refers to an object that is not an array.
The src argument and dest argument refer to arrays whose component types are different primitive types.
The src argument refers to an array with a primitive component type and the dest argument refers to an array with a reference component type.
The src argument refers to an array with a reference component type and the dest argument refers to an array with a primitive component type.
Otherwise, if any of the following is true, an IndexOutOfBoundsException is thrown and the destination is not modified:

The srcPos argument is negative.
The destPos argument is negative.
The length argument is negative.
srcPos+length is greater than src.length, the length of the source array.
destPos+length is greater than dest.length, the length of the destination array.

// This describes a partial-copy case: if some source element cannot be converted to the destination's component type, the elements before it have already been copied when the exception is thrown, so the copy does not fail atomically.
Otherwise, if any actual component of the source array from position srcPos through srcPos+length-1 cannot be converted to the component type of the destination array by assignment conversion, an ArrayStoreException is thrown.
In this case, let k be the smallest nonnegative integer less than length such that src[srcPos+k] cannot be converted to the component type of the destination array; when the exception is thrown, source array components from positions
srcPos through srcPos+k-1 will already have been copied to destination array positions destPos through destPos+k-1 and no other positions of the destination array will have been modified. (Because of the restrictions already itemized,
this paragraph effectively applies only to the situation where both arrays have component types that are reference types.)

// The parameter list, as already explained at the top.
Parameters:
src - the source array.
srcPos - starting position in the source array.
dest - the destination array.
destPos - starting position in the destination data.
length - the number of array elements to be copied.
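
The overlapping copy that add(int, E) relies on can be tried on a plain array. This is a sketch under stated assumptions: ArrayCopyDemo, insert and remove are names made up here, and insert assumes the array still has at least one free slot (size < array.length).

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Inserts value at index in an array whose first `size` slots are in use.
    // Assumes size < array.length, so there is room for one more element.
    static void insert(Object[] array, int size, int index, Object value) {
        // Source and destination are the same array; arraycopy handles the overlap.
        System.arraycopy(array, index, array, index + 1, size - index);
        array[index] = value;
    }

    // Removes the element at index: shifts the tail left and clears the vacated slot.
    static void remove(Object[] array, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(array, index + 1, array, index, numMoved);
        }
        array[size - 1] = null;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null};
        insert(data, 3, 2, "c");
        System.out.println(Arrays.toString(data));  // [a, b, c, d]
        remove(data, 4, 0);
        System.out.println(Arrays.toString(data));  // [b, c, d, null]
    }
}
```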


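  Summary:

    Normally the capacity grows by a factor of 1.5. In the special case where the new capacity would exceed the maximum array size, the maximum is used instead.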
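
    The growth rule can be simulated without touching the JDK internals. The sketch below only mirrors the arithmetic of grow() as quoted in this article; GrowthSketch, newCapacity and capacitiesAfter are illustrative names, and the MAX_ARRAY_SIZE clamp is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Mirrors the arithmetic of grow(): 1.5x the old capacity, or minCapacity if that is larger.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;   // the MAX_ARRAY_SIZE clamp is omitted in this sketch
    }

    // Capacities reached while appending `adds` elements one by one to an empty default list.
    static List<Integer> capacitiesAfter(int adds) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0;
        for (int size = 0; size < adds; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // stands in for the "still the shared empty array" check
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = newCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesAfter(50));  // [10, 15, 22, 33, 49, 73]
    }
}
```

    Because of the integer shift, the factor is only approximately 1.5: 15 grows to 22, not 22.5.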

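    When we call add, the actual call chain is: add, ensureCapacityInternal, ensureExplicitCapacity, grow, hugeCapacity.

    Explanation: calling add triggers a series of further calls. It may reach grow, and grow may in turn call hugeCapacity.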

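  Example 1:

  List<Integer> lists = new ArrayList<Integer>();
  lists.add(8);

    Explanation: lists starts with a capacity of 0 because the ArrayList() constructor is used. What happens when lists.add(8) is called? The steps, together with the size of elementData before and after, are described next.

    Explanation: before add is called, elementData = {}. The add call descends through the capacity checks to grow, which leaves elementData with a length of 10. Control then returns to add, which stores 8 in elementData[0].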

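  Example 2:

  List<Integer> lists = new ArrayList<Integer>(6);
  lists.add(8);

    Explanation: the ArrayList(int) constructor is used, so elementData is initialized as an Object array of length 6. The add(8) call then proceeds as follows:

    Explanation: elementData already has a length of 6 before add is called, so the capacity check passes and no growth takes place.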

  2.4.2 Removal methods

    These removal methods are all similar, so we cover only a few of them. fastRemove(int) is private and exists to serve remove(Object).

    1) remove(int): removes the element at the given position

public E remove(int index) {
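        rangeCheck(index); // validate index

        modCount++; // structural modification counter; among other things, iterators use it for fail-fast detection
        E oldValue = elementData(index); // fetch the element directly by index

        int numMoved = size - index - 1; // number of elements that must shift left
        if (numMoved > 0)
// as explained above, arraycopy is what moves the elements
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
// set the slot at --size to null so the garbage collector can reclaim the object
        elementData[--size] = null; // clear to let GC do its work
// return the removed element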
        return oldValue;
    }

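    2) remove(Object): this method shows that ArrayList can store null values.

// Not much to analyse here: the method scans the array in order and, at the first matching element, passes its index to fastRemove(index), which performs the removal. fastRemove(index) is almost identical to remove(index) internally. The main point is that ArrayList can store null.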
     public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }

    3) clear(): sets every element of elementData to null so the garbage collector can reclaim the objects, then resets size to 0; hence the name clear

public void clear() {
        modCount++;

        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;

        size = 0;
    }

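    4) removeAll(Collection c):

     public boolean removeAll(Collection<?> c) {
         return batchRemove(c, false); // bulk removal
     }

    5) batchRemove(xx, xx): shared by two methods. removeAll() removes only the elements that are contained in the given collection; retainAll() keeps only the elements that are contained in the given collection, i.e. the intersection.

// Used in two places: complement is false for removeAll() and true for retainAll(). retainAll() keeps the elements that are also in c.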
   private boolean batchRemove(Collection<?> c, boolean complement) {
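        final Object[] elementData = this.elementData; // the backing array of this list; call it A
        int r = 0, w = 0;   // r is the read index that drives the loop; w is the write index, i.e. the number of elements kept so far
        boolean modified = false;  
        try {
            for (; r < size; r++)
// check each element of A against the argument collection c
                if (c.contains(elementData[r]) == complement)
// if the membership result matches complement, keep the element by writing it back into A at position w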
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
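// r != size only if c.contains() threw an exception part-way through
            if (r != size) {
// copy the elements that were not examined, so that they are kept in A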
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
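// Slots from index w onward are leftovers and are cleared below. For removeAll(), w is the number of elements that are not in c; it is 0, and the result is the same as clear(), only when every element was in c.
// retainAll() returns true whenever the list was changed: both when nothing is shared (the list becomes empty) and when only some elements are shared. It returns false when every element of the list is also in c. The return value therefore does not tell you whether the two collections intersect; check whether the list still has elements afterwards: non-empty means they intersect, empty means they do not.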
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }
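
    The two-index compaction and the return values of removeAll() and retainAll() can be checked with a short program. This is an illustrative sketch: BatchRemoveDemo and compact are hypothetical names, and compact leaves out the exception handling and modCount bookkeeping of the real batchRemove.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchRemoveDemo {
    // Two-index compaction in the spirit of batchRemove: keep the elements whose
    // membership in c equals complement, and return how many were kept.
    static int compact(Object[] data, int size, List<?> c, boolean complement) {
        int w = 0;                               // write index
        for (int r = 0; r < size; r++) {         // read index
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null;                      // clear the leftover tail
        }
        return w;
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(a.removeAll(Arrays.asList(2, 4)) + " " + a);     // true [1, 3]

        List<Integer> b = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(b.retainAll(Arrays.asList(2, 4, 6)) + " " + b);  // true [2, 4]

        List<Integer> c = new ArrayList<>(Arrays.asList(1, 2));
        System.out.println(c.retainAll(Arrays.asList(1, 2, 3)) + " " + c);  // false [1, 2]
    }
}
```

    The last line shows why the boolean result only means "the list changed": the two collections clearly intersect, yet retainAll returns false.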

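  Summary: remove(int) removes the element at the given index. The elements after that index are shifted forward by one position, and the last used slot of the array is set to null.

      Clearing the slot drops the stale reference, so the removed object can be garbage-collected instead of being kept alive by the array. This is a handy technique in general.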
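
  The two remove overloads, the handling of null, and the fail-fast role of modCount can all be observed from the outside. The sketch below is illustrative: RemoveDemo and its helpers are made-up names, and the fail-fast check is documented by the JDK as best-effort, so it should not be relied on for correctness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class RemoveDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, null, 30));
    }

    // Removes inside a for-each loop; returns true if the iterator noticed the change.
    static boolean removeDuringForEach(List<Integer> list, Integer target) {
        try {
            for (Integer value : list) {
                if (target.equals(value)) {
                    list.remove(value);   // remove(Object): changes modCount behind the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // The safe variant: remove through the iterator itself.
    static void removeViaIterator(List<Integer> list, Integer target) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (target.equals(it.next())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> list = sample();
        System.out.println(list.remove(1));                    // 20    remove(int) returns the element
        System.out.println(list.remove(Integer.valueOf(30)));  // true  remove(Object) returns a flag
        System.out.println(list.remove(null));                 // true  null is a legal element
        System.out.println(list);                              // [10]
        System.out.println(removeDuringForEach(sample(), 10)); // true
    }
}
```

  On a List<Integer>, remove(1) picks the index overload; wrapping the value with Integer.valueOf selects remove(Object) instead.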

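  2.4.3 set() method

public E set(int index, E element) {
        // check that the index is valid
        rangeCheck(index);
        // old value
        E oldValue = elementData(index);
        // store the new value
        elementData[index] = element;
        // return the old value
        return oldValue;
    }

  Explanation: replaces the element at the given index and returns the element that was there before.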
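
  A short usage sketch, with SetDemo and replace as illustrative names. It also shows that set() is bounded by size, not by capacity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SetDemo {
    // Replaces the element at index and reports what was there before.
    static String replace(List<String> list, int index, String value) {
        return list.set(index, value);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        String old = replace(list, 1, "B");
        System.out.println(old + " " + list + " " + list.size());  // b [a, B, c] 3

        List<String> empty = new ArrayList<>(10);   // capacity 10, size 0
        try {
            empty.set(0, "x");                      // rejected: index must be < size
        } catch (IndexOutOfBoundsException e) {
            System.out.println("set(0) on an empty list is out of bounds");
        }
    }
}
```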

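  2.4.4 indexOf() method

// search from the start of the array for the given element
    public int indexOf(Object o) {
        if (o == null) { // the element being searched for is null
            for (int i = 0; i < size; i++) // scan the array and return the index of the first null element
                if (elementData[i]==null)
                    return i;
        } else { // the element being searched for is not null
            for (int i = 0; i < size; i++) // scan the array and return the index of the first element equal to o
                if (o.equals(elementData[i]))
                    return i;
        } 
        // not found: return -1
        return -1;
    }

  Explanation: searches from the start for an element equal to the given one. Note that null can be searched for, which again means ArrayList can store null elements. The counterpart lastIndexOf searches from the end.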
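
  A quick check of both search directions, including null. IndexOfDemo and sample are names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", null, "b", "a", null));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("a"));       // 0
        System.out.println(list.lastIndexOf("a"));   // 3
        System.out.println(list.indexOf(null));      // 1
        System.out.println(list.lastIndexOf(null));  // 4
        System.out.println(list.indexOf("z"));       // -1
    }
}
```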

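  2.4.5 get() method

public E get(int index) {
        // check that the index is valid
        rangeCheck(index);

        return elementData(index);
    }

  Explanation: get checks the index first. In this version rangeCheck only tests whether index >= size; a negative index is not tested there and instead fails on the array access itself with an ArrayIndexOutOfBoundsException. Note that get delegates to the elementData function, which returns the actual element:

E elementData(int index) {
        return (E) elementData[index];
    }

  Explanation: the returned value is cast down from Object to E. This is a small detail that is hidden from application code.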
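
  From the caller's side both kinds of bad index surface as an IndexOutOfBoundsException or a subclass of it, so one catch covers them. GetDemo and outOfBounds below are illustrative names, not JDK APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GetDemo {
    // Returns true if list.get(index) is rejected as out of bounds.
    static boolean outOfBounds(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass, so it is caught here too
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        String first = list.get(0);                 // the caller needs no cast
        System.out.println(first);                  // a
        System.out.println(outOfBounds(list, 2));   // true: index == size
        System.out.println(outOfBounds(list, -1));  // true: negative index
    }
}
```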

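3. Summary

1) ArrayList can store null.
2) ArrayList is essentially the elementData array.
3) What distinguishes ArrayList from a plain array is that it grows automatically; the key method is grow().
4) The difference between removeAll(Collection c) and clear() is that removeAll removes only the elements that are also in c, whereas clear() removes every element.
5) Because ArrayList is backed by an array, lookup by index is fast, but insertion and removal are much slower because many elements may have to be moved.
6) ArrayList implements RandomAccess, so an indexed for loop is the recommended way to iterate over it.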
