继续研读JDK的源码,在比较HashMap
和ConcurrentHashMap
的不一样之处发现了一个细节——关于Iterator
的实现的不一样,其实HashMap
和ConcurrentHashMap
还有更多不一样的地方,这也是面试常常问到的问题,有一篇文章我以为讲的很好了,Java进阶(六)从ConcurrentHashMap的演进看Java多线程核心技术。
Iterator是一种设计模式,在Java Collection Framework
中常常做为容器的视图(view),大多数时候只支持删除、不支持增长,提供统一的接口方法等特色。在Java Collection Framework
的Iterator
实现中大多数是fast-fail
方式的,而支持并发的容器数据结构则没有这个限制。javascript
1)使用Iterator遍历字符串列表java
List<String> lists = Arrays.asList("a","b","c"); Iterator<String> iterator = lists.iterator(); while (iterator.hasNext()) { String val = iterator.next(); System.out.println(val); }
这种作法是for..each的语法的展开形式node
for(String val: lists){ //sout }
2)使用Iterator遍历LinkedList面试
LinkedList<String> linkedList = new LinkedList<>(lists); iterator = linkedList.iterator(); while (iterator.hasNext()) { String val = iterator.next(); System.out.println(val); }
3) 使用Iterator遍历HashMap设计模式
Map<String,Integer> hmap = new HashMap<>(3); hmap.put("a",1); hmap.put("b",2); hmap.put("c",3); Iterator<Map.Entry<String,Integer>> mapIterator = hmap.entrySet().iterator(); while (mapIterator.hasNext()) { Map.Entry<String,Integer> entry = mapIterator.next(); System.out.println(entry.getKey() + ":" + entry.getValue()); }
1)ArrayList中的Iterator数组
list中的结构是顺序的,Iterator既然是List的视图,那它也表现了相同的顺序。
ArrayList得到Iterator,数据结构
/** * Returns an iterator over the elements in this list in proper sequence. * * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>. * * @return an iterator over the elements in this list in proper sequence */ public Iterator<E> iterator() { return new Itr(); }
源码,多线程
/** * An optimized version of AbstractList.Itr */ private class Itr implements Iterator<E> { int cursor; // index of next element to return int lastRet = -1; // index of last element returned; -1 if no such int expectedModCount = modCount; public boolean hasNext() { return cursor != size; } @SuppressWarnings("unchecked") public E next() { checkForComodification(); int i = cursor; if (i >= size) throw new NoSuchElementException(); Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) throw new ConcurrentModificationException(); cursor = i + 1; return (E) elementData[lastRet = i]; } public void remove() { if (lastRet < 0) throw new IllegalStateException(); checkForComodification(); try { ArrayList.this.remove(lastRet); cursor = lastRet; lastRet = -1; expectedModCount = modCount; } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); } } final void checkForComodification() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); } }
Itr
是ArrayList
的一个内部类,它能使用宿主类的成员变量,事实上Itr
反映了ArrayList的内部状况,使用了size
、expectedModCount
和elementData
等属性。经过游标cursor的方式不断往前递进,只要游标小于size就说明依然还有元素能够访问。
应该看到的是,在调用了new Iterator()
以后,能够看作Itr
对ArrayList
作了快照,这里的快照并非很严格,是基于modCount
比较来实现的。它在初始化时备份了modCount
的值,保存为私有的变量expectedModCount
。闭包
首先Iterator
接口并无诸如add的方法,即不能经过Iterator来为容器增长元素;
其次,若是有其余线程变化了容器的结构(structural modification),那么ArrayList.this.modCount
的值会发生改变,那么在Itr
执行next或者remove方法时会判断出来modCount != expectedModCount
的状况,从而抛出异常fast-fail
。
再次,若是执行了Itr
的remove方法,它可以调用ArrayList.this.remove
的方法,而后修正游标和expectedModCount
等。并发
ArrayList.this.remove(lastRet); cursor = lastRet; lastRet = -1; expectedModCount = modCount;
2)LinkedList中的Iterator
LinkedList
的Iterator
和ArrayList
中的有一些相似的地方。
首先,LinkedList
的iterator入口方法实际上是AbstractSequentialList
抽象类中,
/** * Returns an iterator over the elements in this list (in proper * sequence).<p> * * This implementation merely returns a list iterator over the list. * * @return an iterator over the elements in this list (in proper sequence) */ public Iterator<E> iterator() { return listIterator(); } /** * Returns a list iterator over the elements in this list (in proper * sequence). * * @param index index of first element to be returned from the list * iterator (by a call to the <code>next</code> method) * @return a list iterator over the elements in this list (in proper * sequence) * @throws IndexOutOfBoundsException {@inheritDoc} */ public abstract ListIterator<E> listIterator(int index);
而这个ListIterator
是一个接口,它被LinkedList$ListItr
实现,
private class ListItr implements ListIterator<E> { private Node<E> lastReturned = null; private Node<E> next; private int nextIndex; private int expectedModCount = modCount; ListItr(int index) { // assert isPositionIndex(index); next = (index == size) ? null : node(index); nextIndex = index; } public boolean hasNext() { return nextIndex < size; } public E next() { checkForComodification(); if (!hasNext()) throw new NoSuchElementException(); lastReturned = next; next = next.next; nextIndex++; return lastReturned.item; } public boolean hasPrevious() { return nextIndex > 0; } public E previous() { checkForComodification(); if (!hasPrevious()) throw new NoSuchElementException(); lastReturned = next = (next == null) ? last : next.prev; nextIndex--; return lastReturned.item; } public int nextIndex() { return nextIndex; } public int previousIndex() { return nextIndex - 1; } public void remove() { checkForComodification(); if (lastReturned == null) throw new IllegalStateException(); Node<E> lastNext = lastReturned.next; unlink(lastReturned); if (next == lastReturned) next = lastNext; else nextIndex--; lastReturned = null; expectedModCount++; } public void set(E e) { if (lastReturned == null) throw new IllegalStateException(); checkForComodification(); lastReturned.item = e; } public void add(E e) { checkForComodification(); lastReturned = null; if (next == null) linkLast(e); else linkBefore(e, next); nextIndex++; expectedModCount++; } final void checkForComodification() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); } }
LinkedList
的Iterator
要比ArrayList
中的复杂一些,它更支持了add等方法;
相似原来游标的遍历方式,基于size
、expectedModCount
等比较逻辑依然存在,只不过遍历的方式不是原来的下标增进,而是节点之间的next指针来实现。
3)HashMap中的Iterator
HashMap
有多个view视图,keySet
, values
, entrySet
,这里分析下entrySet
这个视图,另外两个原理和entrySet
视图的差很少。
private final class EntrySet extends AbstractSet<Map.Entry<K,V>> { public Iterator<Map.Entry<K,V>> iterator() { return newEntryIterator(); } public boolean contains(Object o) { if (!(o instanceof Map.Entry)) return false; Map.Entry<K,V> e = (Map.Entry<K,V>) o; Entry<K,V> candidate = getEntry(e.getKey()); return candidate != null && candidate.equals(e); } public boolean remove(Object o) { return removeMapping(o) != null; } public int size() { return size; } public void clear() { HashMap.this.clear(); } }
EntrySet的iterator方法中调用了newEntryIterator
,将构造EntryIterator
实例,EntryIterator
源码
private final class EntryIterator extends HashIterator<Map.Entry<K,V>> { public Map.Entry<K,V> next() { return nextEntry(); } }
EntryIterator
继承了HashIterator
类,复用了父类的大部分方法,只是覆盖了next方法。HashIterator
源码,
private abstract class HashIterator<E> implements Iterator<E> { Entry<K,V> next; // next entry to return int expectedModCount; // For fast-fail int index; // current slot Entry<K,V> current; // current entry HashIterator() { expectedModCount = modCount; if (size > 0) { // advance to first entry Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; } } public final boolean hasNext() { return next != null; } final Entry<K,V> nextEntry() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); Entry<K,V> e = next; if (e == null) throw new NoSuchElementException(); if ((next = e.next) == null) { Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; } current = e; return e; } public void remove() { if (current == null) throw new IllegalStateException(); if (modCount != expectedModCount) throw new ConcurrentModificationException(); Object k = current.key; current = null; HashMap.this.removeEntryForKey(k); expectedModCount = modCount; } }
因为HashMap的结构并非顺序的,在执行Iterator.next方法时不能经过next指针或下标的方式直接找到下一个元素,HashIterator
为了能达到这个目的,在构造函数和nextEntry
方法中预先作了advance
处理。
//构造函数中 if (size > 0) { // advance to first entry Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; } //nextEntry中 if ((next = e.next) == null) { Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; }
构造函数中预先在HashMap的table数组找到第一个头结点不为null的元素;(next = t[index++]) == null
的写法有点迷惑性,不考虑HashMap为空的状况,index自增停在next != null
的状况,即 next = t[index-1], index已经往前一步了;
在nextEntry中若是发现e.next是null,此时表示table这个数组元素的链表遍历结束了,须要跳到下一个头节点不为空的元素继续遍历,而index恰好往前一步了,此时继续执行
next = t[index++]
假设next[index]不为空,那么下一个遍历的数组元素头节点找到,而且index已经自增了。
以ConcurrentHashMap
为例,看ConcurrentHashMap$HashInteraotr
的实现
abstract class HashIterator { int nextSegmentIndex; int nextTableIndex; HashEntry<K,V>[] currentTable; HashEntry<K, V> nextEntry; HashEntry<K, V> lastReturned; HashIterator() { nextSegmentIndex = segments.length - 1; nextTableIndex = -1; advance(); } /** * Set nextEntry to first node of next non-empty table * (in backwards order, to simplify checks). */ final void advance() { for (;;) { if (nextTableIndex >= 0) { if ((nextEntry = entryAt(currentTable, nextTableIndex--)) != null) break; } else if (nextSegmentIndex >= 0) { Segment<K,V> seg = segmentAt(segments, nextSegmentIndex--); if (seg != null && (currentTable = seg.table) != null) nextTableIndex = currentTable.length - 1; } else break; } } final HashEntry<K,V> nextEntry() { HashEntry<K,V> e = nextEntry; if (e == null) throw new NoSuchElementException(); lastReturned = e; // cannot assign until after null check if ((nextEntry = e.next) == null) advance(); return e; } public final boolean hasNext() { return nextEntry != null; } public final boolean hasMoreElements() { return nextEntry != null; } public final void remove() { if (lastReturned == null) throw new IllegalStateException(); ConcurrentHashMap.this.remove(lastReturned.key); lastReturned = null; } }
这里能看到ConcurrentHashMap的segment分段因素所在,在构造函数中指定了最后一个segment数组元素,而后作advance处理,也是从后往前处理的。首先找到不为null的分段segment,而后才是在segment的table数组中找到不为null的元素,这都是从后往前“前进”的。
而与HashMap不一样的地方,ConcurrentHashMap的Iterator并非fast-fail
的,它并无判断modCount;除此以外还应该看到它对nextEntry
的处理,在advance的方法调用如下两个方法,
/** * Gets the jth element of given segment array (if nonnull) with * volatile element access semantics via Unsafe. (The null check * can trigger harmlessly only during deserialization.) Note: * because each element of segments array is set only once (using * fully ordered writes), some performance-sensitive methods rely * on this method only as a recheck upon null reads. */ @SuppressWarnings("unchecked") static final <K,V> Segment<K,V> segmentAt(Segment<K,V>[] ss, int j) { long u = (j << SSHIFT) + SBASE; return ss == null ? null : (Segment<K,V>) UNSAFE.getObjectVolatile(ss, u); } /** * Gets the ith element of given table (if nonnull) with volatile * read semantics. Note: This is manually integrated into a few * performance-sensitive methods to reduce call overhead. */ @SuppressWarnings("unchecked") static final <K,V> HashEntry<K,V> entryAt(HashEntry<K,V>[] tab, int i) { return (tab == null) ? null : (HashEntry<K,V>) UNSAFE.getObjectVolatile (tab, ((long)i << TSHIFT) + TBASE); }
它们都是调用了UNSAFE.getObjectVolatile
方法,利用了volatile access的方式,相较于上锁的方式性能更好。
这个例子来自MDN的文档,作法比较简洁,迭代器
function makeIterator(array){ var nextIndex = 0; return { next: function(){ return nextIndex < array.length ? {value: array[nextIndex++], done: false} : {done: true}; } }; } var it = makeIterator(['yo', 'ya']); console.log(it.next().value); // 'yo' console.log(it.next().value); // 'ya' console.log(it.next().done); // true
能够考虑给这个makeIterator
的返回值加上hasNext
属性,
return { next: ..., hasNext: function() { return nextIndex < array.length; } }
JavaScript利用了闭包实现了Iterator和Java利用内部类实现有类似的地方。
Iterator的主要目的仍是为了表现底层数据结构的全部元素,提供一种统一的遍历方式。在不一样的数据结构须要针对不一样语义作出改动,像LinkedList
的支持add方法,像ConcurrentHashMap
和HashMap
的advance
处理,像ConcurrentHashMap
那样不判断modeCount
而使用volatile access
等。