Analyzing the Differences Between List and Set from the Source Code

Date: 2019-12-04


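When we discuss the similarities and differences between List and Set, we usually say:

  1. List and Set both extend the Collection interface.

  2. List is ordered, can store duplicate elements, and allows null.

  3. Set is unordered, does not allow duplicate elements, and can hold only one null.

  4. List is fast to query but slow to insert into and delete from.

  5. Set is slow to retrieve elements from but fast to insert into and delete from.

  6. List can operate on elements by index; Set cannot retrieve an element by index.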

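The following examines these points from the perspective of ArrayList and HashSet (JDK 1.8):

1. From the parent classes they extend and the interfaces they implement, the two types share the same origin: ArrayList extends AbstractList and implements List, HashSet extends AbstractSet and implements Set, and both chains lead back to the Collection interface.

2. Creating an ArrayList instance

When no initial capacity is specified, the constructor assigns a shared empty object array. This differs from the constructor's Javadoc, which says it constructs a list with an initial capacity of ten; the array of length 10 is actually allocated when the first element is added. You can also pass a capacity to create an array of that length, or pass a collection to build the list from; those constructors are not listed here.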

private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

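As the constructor shows, ArrayList is backed by an array. An array stores its elements by position, so the list keeps elements in insertion order, and null can be stored in it as well.

3. Creating a HashSet instance

When no capacity is specified, the HashSet constructor simply creates a HashMap instance and uses it to implement the set's behavior. A capacity and load factor can also be specified; those constructors just call the corresponding HashMap constructors.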
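To make the ordering and null behavior concrete, here is a minimal sketch. The class name ArrayListOrderDemo and the method build are made up for illustration; only the java.util calls are real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ArrayListOrderDemo {
    // Builds a small list containing a duplicate and two nulls.
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add("a");
        list.add("b");  // the duplicate is kept
        list.add(null); // null is allowed
        list.add(null); // and may appear more than once
        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list);        // [b, a, b, null, null]
        System.out.println(list.get(1)); // a
    }
}
```

The elements come back in exactly the order they were added, which is what the array-backed storage suggests.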

 /**
  * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
  * default initial capacity (16) and load factor (0.75).
  */
 public HashSet() {
     map = new HashMap<>();
 }

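As the constructor shows, the underlying implementation is a HashMap, so HashSet follows HashMap's rules. It is unordered because HashMap hashes the key to find a bucket and then stores the key and value there, so iteration order does not follow insertion order. HashSet does not allow duplicates because add stores the element as the key, with a shared dummy object as the value, and a map holds each key only once. For the same reason it can hold only one null.

The relevant source: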

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
/**
 * Adds the specified element to this set if it is not already present.
 * More formally, adds the specified element <tt>e</tt> to this set if
 * this set contains no element <tt>e2</tt> such that
 * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
 * If this set already contains the element, the call leaves the set
 * unchanged and returns <tt>false</tt>.
 *
 * @param e element to be added to this set
 * @return <tt>true</tt> if this set did not already contain the specified
 * element
 */
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
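The return value of add makes the deduplication visible. The sketch below is only an illustration; the class HashSetAddDemo and the helper addTwice are hypothetical names, not part of the JDK.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class HashSetAddDemo {
    // Adds the same element twice and reports what each add() call returned.
    static boolean[] addTwice(Set<String> set, String element) {
        boolean first = set.add(element);  // true when the element was not yet present
        boolean second = set.add(element); // false: the key already exists in the backing map
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(Arrays.toString(addTwice(set, "a")));  // [true, false]
        System.out.println(Arrays.toString(addTwice(set, null))); // [true, false]
        System.out.println(set.size());                           // 2
    }
}
```

The second add of null returns false just like the second add of "a", which matches the claim that the set can hold only one null.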

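4. List is fast to query but slow to insert into and delete from

a. ArrayList is implemented with an array, which is stored in memory as one contiguous block. A query reads the value directly by index, so it is efficient.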

/**
 * Returns the element at the specified position in this list.
 *
 * @param  index index of the element to return
 * @return the element at the specified position in this list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    rangeCheck(index);

    return elementData(index);
}

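b. For insertion and deletion: adding an element without an index appends it to the end of the array, which is also fast. Adding at a given index requires copying all subsequent elements one position back before assigning the new value at that index. Deletion is similar: the elements after the index are copied one position forward, and the last slot is then set to null so the GC can reclaim it. Insertion and deletion by index are therefore relatively slow.

// Appends the element at the end of the array. Before adding, the capacity is checked: on the first add,
// an array of length 10 is created; on later adds, if the array is too short, a new array of length
// (oldCapacity + (oldCapacity >> 1)) is created and the old array is copied into it.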
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
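The growth expression oldCapacity + (oldCapacity >> 1) amounts to growing by roughly half each time. The sketch below only reproduces that arithmetic; the class GrowthDemo and the method nextCapacity are hypothetical, and the real ArrayList code also handles a requested minimum capacity and overflow, which are left out here.

```java
// Hypothetical demo class: reproduces only the capacity arithmetic.
class GrowthDemo {
    // new capacity = old capacity + old capacity / 2 (integer shift)
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int capacity = 10; // length allocated on the first add
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity);
            System.out.println(capacity); // 15, 22, 33, 49, 73
        }
    }
}
```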

// Checks that the index is valid, then that the array is long enough (growing it as described above if not), then shifts the elements by copying
public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

// As explained above: copy the following elements one position forward, then null out the last slot
public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}
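The shifting can be tried on a plain array. This is a simplified sketch of the same System.arraycopy pattern, not the ArrayList code itself: the class ShiftDemo and its methods are hypothetical, and it assumes the array always has a free slot, so there is no range check and no growth.

```java
import java.util.Arrays;

// Hypothetical demo class: insert and remove by index on a partially filled array.
class ShiftDemo {
    // Shifts data[index..size-1] one slot back, writes value at index, returns the new size.
    // Assumes data.length > size.
    static int insertAt(Object[] data, int size, int index, Object value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        return size + 1;
    }

    // Shifts data[index+1..size-1] one slot forward, clears the last used slot, returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = null; // drop the stale reference so it can be collected
        return size - 1;
    }

    public static void main(String[] args) {
        Object[] data = new Object[5];
        data[0] = "a";
        data[1] = "b";
        data[2] = "c";
        int size = 3;
        size = insertAt(data, size, 1, "x");
        System.out.println(Arrays.toString(data)); // [a, x, b, c, null]
        size = removeAt(data, size, 0);
        System.out.println(Arrays.toString(data)); // [x, b, c, null, null]
    }
}
```

The number of elements copied is size - index for an insert and size - index - 1 for a remove, so operations near the front of a long list move the most data.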

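5. Set is slow to retrieve elements from but fast to insert into and delete from

HashSet has no method for getting a single element. To retrieve a particular element you have to obtain the map's data and iterate over it, comparing as you go, so this is slower than ArrayList's indexed access. (A membership test with contains is a different case: it is a hash lookup and does not iterate.) For insertion and deletion, by the nature of the map, the element is used as the key, hashed to find its position, and then stored there or removed from there. Compared with ArrayList, which has to copy and shift array elements, this is relatively efficient.

// As explained above, HashSet's remove is simply the map's remove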
public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}
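Because remove compares the map's return value with PRESENT, it reports whether the element was actually in the set. A minimal sketch, with the hypothetical class HashSetRemoveDemo and helper removeTwice:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class HashSetRemoveDemo {
    // Removes the same element twice and reports what each remove() call returned.
    static boolean[] removeTwice(Set<Integer> set, Integer element) {
        boolean first = set.remove(element);  // true only if the element was present
        boolean second = set.remove(element); // false: nothing left to remove
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        set.add(1);
        set.add(2);
        boolean[] results = removeTwice(set, 2);
        System.out.println(results[0]);      // true
        System.out.println(results[1]);      // false
        System.out.println(set.contains(1)); // true
    }
}
```

No elements are shifted in either call; the work is a hash lookup on the backing map.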

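6. List can operate on elements by index; Set cannot retrieve an element by index

As the analysis above shows, ArrayList is backed by an array, so elements can be addressed by index. HashSet is backed by a HashMap, which hashes each element to decide where to store it, so there is no way to fetch an element directly by index.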
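If a positional lookup on a set is really needed, the only option is to walk the iterator. The helper elementAt below is a hypothetical illustration of that, not a JDK method, and the position it uses is just the iteration order, which HashSet does not guarantee to match insertion order.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class SetPositionDemo {
    // Returns the n-th element in iteration order by stepping through the iterator.
    static <T> T elementAt(Set<T> set, int n) {
        if (n < 0 || n >= set.size()) {
            throw new IndexOutOfBoundsException("n=" + n);
        }
        Iterator<T> it = set.iterator();
        for (int i = 0; i < n; i++) {
            it.next(); // skip the first n elements
        }
        return it.next();
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        String second = elementAt(set, 1);
        System.out.println(set.contains(second)); // true
    }
}
```

Each call costs up to n iterator steps, in contrast to ArrayList.get(index), which reads the array slot directly.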