Java Collections (2): ArrayList and LinkedList

Date: 2019-11-16

Tags: java, collections, arraylist, linklist. Category: Java


Introduction

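ArrayList<E> and LinkedList<E> both implement the List<E> interface. The previous article covered the characteristics of List<E>: it is ordered, it allows duplicates, and its elements can be accessed by integer index. The two classes share these traits but differ sharply in implementation: ArrayList<E> is backed by an array, while LinkedList<E> uses a doubly linked list. Studying them side by side makes the differences easier to see.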

Class Hierarchy

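In the class hierarchy, ArrayList<E> and LinkedList<E> look nearly identical. The notable difference is that LinkedList<E> implements Deque<E> in addition to List<E>. Deque<E> is the interface for a double-ended queue, and because LinkedList<E> is built on a doubly linked list, it naturally supports insertion and removal at both the head and the tail.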
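
The double-ended behaviour can be seen with a short sketch. The class name DequeSketch and the sample values are made up for illustration; addFirst, addLast and pollFirst are standard Deque methods.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

// Illustrative only: DequeSketch is a hypothetical name, not part of the JDK.
final class DequeSketch {
    // Fills a LinkedList through the Deque interface, then drains it from the head.
    static List<String> drainFromHead() {
        Deque<String> deque = new LinkedList<>();
        deque.addLast("b");   // insert at the tail
        deque.addFirst("a");  // insert at the head
        deque.addLast("c");   // insert at the tail again

        List<String> order = new ArrayList<>();
        while (!deque.isEmpty()) {
            order.add(deque.pollFirst()); // remove from the head
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(drainFromHead());
    }
}
```

Declaring the variable as Deque<String> rather than LinkedList<String> keeps the calling code independent of the concrete class.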

Data Structures

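The previous article asked why a single Collection<E> interface has so many implementations. The main reason is that each implementation uses a different data structure, and those differences determine where each collection fits best. The fundamental difference between ArrayList<E> and LinkedList<E> lies in their data structures, and understanding those structures is the key to understanding their use cases. The ArrayList<E> source contains a field named elementData, which is the data structure behind ArrayList<E>: an array.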

//The array buffer into which the elements of the ArrayList are stored.
transient Object[] elementData;

elementData is the foundation of ArrayList<E>: every operation is implemented on top of this Object[] array. Arrays have the following characteristics:

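Once an array is created, its length is fixed.
The elements of an array occupy contiguous memory addresses.
An array can only hold elements of one declared type.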

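The transient keyword here deserves attention: it marks a field to be skipped by default serialization. Readers already familiar with serialization can skip the following introduction.

What is serialization? Put simply, serialization converts an object into a byte stream so that it can be persisted or transmitted; an object must be converted to bytes before it can travel across a network. Deserialization is the reverse: rebuilding an object from a byte stream. In Java, a class is marked as serializable simply by implementing the Serializable interface, which is an empty marker interface.

With that in mind, back to transient. Java specifies that fields declared transient are ignored during serialization. But why skip a field, and how is it restored on deserialization? Consider which fields are worth skipping. Serialization costs both time and space, and besides the state that must be persisted, a class often holds intermediate or temporary fields that exist only to make the class convenient to use. For example: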

import java.io.IOException;
import java.io.ObjectInputStream;

public class SerializableDateTime implements java.io.Serializable {

	private static final long serialVersionUID = -8291235042612920489L;

	private String date = "2011-11-11";

	private String time = "11:11";

    // Field that does not need to be serialized
	private transient String dateTime;

	public void initDateTime() {
		dateTime = date + time;
	}

    // Called during deserialization to recompute dateTime
	private void readObject(ObjectInputStream inputStream) throws IOException, ClassNotFoundException {
		inputStream.defaultReadObject();
		initDateTime();
	}
}

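The dateTime field of SerializableDateTime is assigned when callers invoke initDateTime, but it is derived data rather than base data, so it does not need to be serialized: on deserialization its value can be recomputed by calling initDateTime. Only date and time have to be serialized, and marking dateTime as transient achieves this selective serialization.

Why, then, does ArrayList<E> skip elementData? Mainly because elementData holds not only the stored elements but usually also many unused slots, and those slots do not need to be serialized. To save space, ArrayList<E> writes only the occupied part of elementData and reallocates the array on deserialization, which reduces both the serialized size and the time spent.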

// Called during serialization
private void writeObject(java.io.ObjectOutputStream s) throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    // Write only the first `size` elements; size is the number of stored elements, not elementData.length
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

// Called during deserialization
private void readObject(java.io.ObjectInputStream s) throws java.io.IOException, ClassNotFoundException {
    elementData = EMPTY_ELEMENTDATA;

    // Read in size, and any hidden stuff
    s.defaultReadObject();

    // Read in capacity
    s.readInt(); // ignored

    if (size > 0) {
        // be like clone(), allocate array based upon size not capacity
        // Reallocate space for elementData
        ensureCapacityInternal(size);

        Object[] a = elementData;
        // Read in all elements in the proper order.
        for (int i=0; i<size; i++) {
            // Fill elementData with the deserialized elements
            a[i] = s.readObject();
        }
    }
}

This approach to serialization and deserialization is neat, and the same technique can be borrowed in our own code to save space and time.
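
One way to check this behaviour, as a rough sketch: serialize two lists that hold the same three elements but were created with very different initial capacities, and compare the byte counts. If only the occupied slots are written, the two lengths should match. The class and method names below are hypothetical; the stream classes are the standard java.io ones.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Illustrative only: ArrayListSerializationSketch is a hypothetical name.
final class ArrayListSerializationSketch {
    // Serializes any object to a byte array using standard Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Builds a three-element list with the given initial capacity
    // and returns the length of its serialized form.
    static int serializedLength(int initialCapacity) {
        ArrayList<Integer> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < 3; i++) {
            list.add(i);
        }
        return serialize(list).length;
    }

    public static void main(String[] args) {
        System.out.println("capacity 3:      " + serializedLength(3));
        System.out.println("capacity 100000: " + serializedLength(100_000));
    }
}
```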

LinkedList<E> is implemented with a linked list, specifically a doubly linked list: every node holds references to both its previous node and its next node:

// First node of the list
transient Node<E> first;

// Last node of the list
transient Node<E> last;

// Inner class representing a list node
private static class Node<E> {
    // Element stored in this node
    E item;
    // Next node
    Node<E> next;
    // Previous node
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

Characteristics of a linked list:

The length is not fixed; nodes can be added or removed at any time.
Nodes may or may not be contiguous in memory; in most cases they are not.
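
To make the prev and next references concrete, here is a stripped-down doubly linked list. It is a teaching sketch under my own naming (TinyLinkedList), not the JDK implementation: it supports only appending at the tail and walking backward through the prev links.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a minimal doubly linked list, not java.util.LinkedList.
final class TinyLinkedList<E> {
    private static final class Node<T> {
        final T item;
        Node<T> next;
        Node<T> prev;

        Node(Node<T> prev, T item, Node<T> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    // Appends a node at the tail; no array, so nothing ever needs resizing.
    void addLast(E element) {
        Node<E> oldLast = last;
        Node<E> node = new Node<>(oldLast, element, null);
        last = node;
        if (oldLast == null) {
            first = node;        // list was empty
        } else {
            oldLast.next = node; // link the old tail forward
        }
        size++;
    }

    // Collects the elements from tail to head by following prev references.
    List<E> backward() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = last; x != null; x = x.prev) {
            out.add(x.item);
        }
        return out;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.addLast("a");
        list.addLast("b");
        list.addLast("c");
        System.out.println(list.backward());
    }
}
```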

Constructors

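ArrayList<E> provides three constructors. The default constructor starts with an empty array, which is then resized as elements are added, and each resize costs some performance. If the final size can be estimated in advance, ArrayList(int initialCapacity) allocates the array up front and reduces the cost of resizing.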

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        // Allocate the backing array with the requested capacity
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                            initialCapacity);
    }
}

public ArrayList() {
    // Start with a shared empty array
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
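
As a small usage sketch of the sizing advice: when the element count is known, it can be passed to the constructor, and an existing list can be enlarged ahead of a bulk insert with ensureCapacity. PresizeSketch and its method names are invented for this example; ArrayList(int) and ensureCapacity(int) are real ArrayList members.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: PresizeSketch is a hypothetical name.
final class PresizeSketch {
    // Allocates the backing array once, then fills it without any resize.
    static List<Integer> presized(int expected) {
        List<Integer> list = new ArrayList<>(expected);
        for (int i = 0; i < expected; i++) {
            list.add(i);
        }
        return list;
    }

    // Starts from the default constructor and reserves room before a bulk insert.
    static List<Integer> reservedLater(int expected) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(expected); // one growth step instead of many
        for (int i = 0; i < expected; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(presized(1_000).size());
        System.out.println(reservedLater(1_000).size());
    }
}
```

Both variants produce the same list; the capacity only affects how often the backing array is reallocated.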

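LinkedList<E> provides only two constructors, and the default one has an empty body: a linked list needs no pre-allocated space and never resizes, because each new element is simply linked in. In that sense a linked list never holds unused slots. That does not mean a linked list uses less memory. On the contrary, every node must also store a reference to the next node (and, in a doubly linked list, to the previous node as well), so for the same elements it uses considerably more memory than an array.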

public LinkedList() {
}

public LinkedList(Collection<? extends E> c) {
    this();
    addAll(c);
}

Adding Elements

When ArrayList<E> adds an element, it must check whether the array has enough room and grow it if not.

//ArrayList<E>: append an element to the end
public boolean add(E e) {
    // Check capacity and grow if needed; growth is done by grow(int minCapacity)
    ensureCapacityInternal(size + 1);  
    elementData[size++] = e;
    return true;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

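// Grow the backing array
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // A right shift by one bit halves the value, so each resize adds half of
    // the old capacity: the new capacity is about 1.5 times the old one.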
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    // Copy all elements into the new array; Arrays.copyOf uses System.arraycopy internally
    elementData = Arrays.copyOf(elementData, newCapacity);
}
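
The growth rule in grow can be traced with a few lines of arithmetic. The sketch below applies only the oldCapacity + (oldCapacity >> 1) step and ignores the minCapacity and MAX_ARRAY_SIZE branches; the starting capacity of 10 is an assumption chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: GrowthSketch is a hypothetical name.
final class GrowthSketch {
    // Returns the first `steps` capacities produced by repeated 1.5x growth.
    static List<Integer> capacities(int start, int steps) {
        List<Integer> out = new ArrayList<>();
        int capacity = start;
        for (int i = 0; i < steps; i++) {
            out.add(capacity);
            capacity = capacity + (capacity >> 1); // same formula as grow()
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [10, 15, 22, 33, 49, 73]
        System.out.println(capacities(10, 6));
    }
}
```

Because the shift truncates, odd capacities grow by slightly less than half (15 becomes 22, not 22.5).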

(Figures: appending without resizing; appending with resizing.)

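As shown, appending without a resize is cheap, with O(1) time complexity. When a resize is triggered, all elements must be copied to the new array, which costs O(n) and carries some performance overhead.
Because of how a linked list works, LinkedList<E> never needs to resize when adding an element, but it must allocate a new Node for every element it stores.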

//LinkedList<E>: append an element to the end
public boolean add(E e) {
    linkLast(e);
    return true;
}

void linkLast(E e) {
    final Node<E> l = last;
    // Create a new node and link it at the tail
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

When ArrayList<E> inserts an element at a given index, it must check capacity and, because an array stores its elements contiguously, also shift every element from the insertion point onward by one position.

//ArrayList<E>: insert an element at the given index
public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
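
The shifting step can be tried on a plain int array. This is a simplified sketch with hypothetical names (ArrayInsertSketch, insertAt): it assumes the caller has already made sure there is a spare slot, which is the job ensureCapacityInternal does in the real class.

```java
import java.util.Arrays;

// Illustrative only: ArrayInsertSketch is a hypothetical name.
final class ArrayInsertSketch {
    // Inserts `value` at `index` among the first `size` slots of `data`.
    // Assumes data.length > size, i.e. there is room for one more element.
    static void insertAt(int[] data, int size, int index, int value) {
        // Move elements [index, size) one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 4, 5, 0}; // four elements in use, one spare slot
        insertAt(data, 4, 2, 3);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3, 4, 5]
    }
}
```

The number of elements moved is size - index, which is why inserting at index 0 is the most expensive case and inserting at the end moves nothing.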

<img src="http://images2017.cnblogs.com/blog/368583/201711/368583-20171130181801667-175597278.png" style="max-width: 770px">

To insert at a given position, LinkedList<E> must first locate the node at that position and then link the new element in.

//LinkedList<E>: insert an element at the given index
public void add(int index, E element) {
    checkPositionIndex(index);
    
    if (index == size)
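        // Append directly to the tail
        linkLast(element);
    else
        // Find the node currently at this position, then insert before it
        linkBefore(element, node(index));
}

// Find the node at the given index
Node<E> node(int index) {
    // assert isElementIndex(index);
    // If the index is in the first half, walk forward from first; otherwise walk backward from last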
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}
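
The cost of node(index) can be estimated by counting how many next or prev references it follows. The helper below is a hypothetical sketch that mirrors the two branches of node(index) without building a list.

```java
// Illustrative only: NodeHopsSketch is a hypothetical name.
final class NodeHopsSketch {
    // Number of links followed to reach `index` in a list of `size` nodes.
    static int hops(int size, int index) {
        if (index < (size >> 1)) {
            return index;            // walking forward from first
        }
        return size - 1 - index;     // walking backward from last
    }

    public static void main(String[] args) {
        int size = 1000;
        System.out.println("head:   " + hops(size, 0));
        System.out.println("middle: " + hops(size, size >> 1));
        System.out.println("tail:   " + hops(size, size - 1));
    }
}
```

The ends cost nothing and the middle costs about size / 2 hops, so the lookup is cheapest exactly where array shifting is most expensive, and the other way round.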

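Does this lookup hurt the performance of LinkedList<E>? How does it compare with ArrayList<E>, which has to resize and shift every element after the insertion point? A simple test of three special cases, inserting at the head, at the tail, and in the middle, gives a rough answer.

Inserting at the tail: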

private static void addTailElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addTailElementArrayList time: " + (endTime - startTime));
}

private static void addTailElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addTailElementLinkedList time: " + (endTime - startTime));
}

Elements inserted	100	1000	10000	100000
ArrayList (ms)	0	0	1	160
LinkedList (ms)	0	0	1	110

Inserting at the head:

private static void addHeadElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(0, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addHeadElementArrayList time: " + (endTime - startTime));
}

private static void addHeadElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(0, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addHeadElementLinkedList time: " + (endTime - startTime));
}

Elements inserted	100	1000	10000	100000
ArrayList (ms)	0	1	10	900
LinkedList (ms)	0	1	1	6

Inserting in the middle:

private static void addCenterIndexElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(list.size()>>1, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addCenterIndexElementArrayList time: " + (endTime - startTime));
}

private static void addCenterIndexElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(list.size()>>1, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addCenterIndexElementLinkedList time: " + (endTime - startTime));
}

Elements inserted	100	1000	10000	100000
ArrayList (ms)	0	1	6	400
LinkedList (ms)	0	3	80	10000

A few simple conclusions follow:

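When appending to the tail, ArrayList<E> and LinkedList<E> perform about the same: ArrayList<E> has to resize now and then, but LinkedList<E> has to allocate a Node object for every element.
When inserting at the head, LinkedList<E> is clearly faster than ArrayList<E>, because ArrayList<E> must shift every element back by one position each time, whereas LinkedList<E>, being a doubly linked list, only has to update first.
When inserting in the middle, ArrayList<E> is clearly faster than LinkedList<E>. ArrayList<E> only has to shift half of the elements, while LinkedList<E> can only reach a node by walking from the head or the tail, so every insertion traverses about half of the list. That traversal dominates the running time, and the cost of resizing and shifting in ArrayList<E> turns out to be smaller than one might expect.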
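
Single-run timings like these vary with the JVM, the hardware, and JIT warm-up. A slightly more repeatable variant, sketched below, runs each case several times and keeps the best time. The names (InsertBenchSketch, Position, bestMillis) and the round count are my own choices for illustration, and the sketch is still a rough measurement rather than a rigorous benchmark.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative only: InsertBenchSketch is a hypothetical name.
final class InsertBenchSketch {
    enum Position { HEAD, MIDDLE, TAIL }

    // Inserts 0..count-1 into `list` at the chosen position and returns the list.
    static List<Integer> fill(List<Integer> list, int count, Position position) {
        for (int i = 0; i < count; i++) {
            switch (position) {
                case HEAD:
                    list.add(0, i);
                    break;
                case MIDDLE:
                    list.add(list.size() >> 1, i);
                    break;
                default:
                    list.add(i);
                    break;
            }
        }
        return list;
    }

    // Runs the fill several times on fresh lists and returns the best time in ms.
    static long bestMillis(boolean linked, int count, Position position, int rounds) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            List<Integer> list;
            if (linked) {
                list = new LinkedList<Integer>();
            } else {
                list = new ArrayList<Integer>();
            }
            long start = System.nanoTime();
            fill(list, count, position);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best / 1_000_000;
    }

    public static void main(String[] args) {
        for (Position p : Position.values()) {
            System.out.println(p
                    + " ArrayList=" + bestMillis(false, 20_000, p, 5) + "ms"
                    + " LinkedList=" + bestMillis(true, 20_000, p, 5) + "ms");
        }
    }
}
```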

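Choosing between ArrayList<E> and LinkedList<E> therefore requires a close look at the actual usage. LinkedList<E> is not always faster at insertion; in many cases ArrayList<E> is considerably faster. Frequent insertion alone is not a reason to choose LinkedList<E>; the insertion position and other factors have to be considered as well.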

Removing Elements

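ArrayList<E> removes an element by scanning for the first element equal to the argument, removing it by index, and then shifting the elements after it forward by one position.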

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            // Find the index of the first element that equals o, then remove it
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Shift all elements after the removed one forward
        System.arraycopy(elementData, index+1, elementData, index,
                            numMoved);
    elementData[--size] = null; // clear to let GC do its work
}

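LinkedList<E> walks the list forward from first until it finds an element that equals the argument, then unlinks that node directly; no other elements need to be moved.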

public boolean remove(Object o) {
    if (o == null) {
        for (Node<E> x = first; x != null; x = x.next) {
            if (x.item == null) {
                unlink(x);
                return true;
            }
        }
    } else {
        for (Node<E> x = first; x != null; x = x.next) {
            if (o.equals(x.item)) {
                unlink(x);
                return true;
            }
        }
    }
    return false;
}
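
Both remove(Object) implementations return as soon as the first match is unlinked, and both handle null through a separate branch. A short usage sketch (RemoveSketch and removeFirstMatch are hypothetical names) shows the visible effect on each list type:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Illustrative only: RemoveSketch is a hypothetical name.
final class RemoveSketch {
    // Removes the first element equal to `target` (which may be null) and returns the list.
    static List<String> removeFirstMatch(List<String> list, String target) {
        list.remove(target); // List.remove(Object): only the first occurrence goes
        return list;
    }

    public static void main(String[] args) {
        List<String> array = new ArrayList<>(Arrays.asList("a", "b", "a"));
        List<String> linked = new LinkedList<>(Arrays.asList("x", null, "y", null));
        System.out.println(removeFirstMatch(array, "a"));   // prints [b, a]
        System.out.println(removeFirstMatch(linked, null)); // prints [x, y, null]
    }
}
```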

Traversing Elements

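ArrayList<E> has a more efficient way to traverse its elements: it implements the RandomAccess interface, which indicates that it supports fast random access. RandomAccess is an empty marker interface; interfaces like this are used to declare a trait, in this case that the implementing class supports fast access. ArrayList<E> achieves this through array indexing. As a result, traversing an ArrayList<E> with an indexed for loop is generally faster than using an Iterator or ListIterator. LinkedList<E> does not implement this interface, because get(int) has to walk the list, so it is usually traversed with an Iterator.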
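
Code that accepts any List can use the marker to choose a traversal strategy. The sketch below sums a list of integers, using indexed access only when the list advertises RandomAccess; TraversalSketch and sum are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Illustrative only: TraversalSketch is a hypothetical name.
final class TraversalSketch {
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed loop: cheap when get(i) is a direct array read.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // Iterator-based loop: avoids a walk from the head on every get(i).
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new ArrayList<>(Arrays.asList(1, 2, 3))));
        System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3))));
    }
}
```

Calling get(i) in a loop over a LinkedList would repeat the node(index) walk on every iteration, which is why the iterator branch matters there.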