A Detailed Look at the "Lazy" Loading of the Stream Type in Java 8

Date: 2019-11-09


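Before getting to the main topic, we first need to introduce two important kinds of operations on the Java 8 Stream type:

Intermediate and Terminal Operations

The Stream type has two kinds of methods:

Intermediate operations
Terminal operations

The official documentation describes them as follows (skip the quotation if you only want the summary):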

Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce.

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

Terminal operations, such as Stream.forEach or IntStream.sum, may traverse the stream to produce a result or a side-effect. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used; if you need to traverse the same data source again, you must return to the data source to get a new stream. In almost all cases, terminal operations are eager, completing their traversal of the data source and processing of the pipeline before returning. Only the terminal operations iterator() and spliterator() are not; these are provided as an "escape hatch" to enable arbitrary client-controlled pipeline traversals in the event that the existing operations are not sufficient to the task.

Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state. Laziness also allows avoiding examining all the data when it is not necessary; for operations such as "find the first string longer than 1000 characters", it is only necessary to examine just enough strings to find one that has the desired characteristics without examining all of the strings available from the source. (This behavior becomes even more important when the input stream is infinite and not merely large.)

To be honest, after reading this documentation I was thoroughly confused, so let me walk through what these paragraphs actually say:

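Paragraph 1: Stream operations are divided into intermediate operations and terminal operations (that is how I will translate the terms). These two kinds of operations, together with a data source, make up what is called a pipeline.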

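Paragraph 2: Intermediate operations return a new stream, and they are lazy (exactly how lazy is covered later). Using filter as the example, the documentation says that executing the intermediate operation filter does not actually filter anything. Instead it creates a new stream which, when traversed, contains the elements of the initial stream that match the predicate. (This sounds odd at first: isn't that obviously a filtering operation? The point is that the call only describes the filtering; the filtering itself happens later, during traversal.) Note that traversal of the source, and with it the intermediate operations, does not begin until the terminal operation of the pipeline is executed (more on this below).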
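
To make the "filter does not filter yet" point concrete, here is a minimal sketch. The class name DeferredFilterDemo and the log messages are made up for illustration; only the standard java.util.stream calls are real API. It records when the predicate actually runs relative to when the pipeline is built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DeferredFilterDemo {
    // Returns a log of events in the order they happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Joe");

        // filter() only records what should happen; the predicate is not called here.
        Stream<String> shortNames = names.stream().filter(name -> {
            log.add("test " + name);
            return name.length() == 3;
        });
        log.add("pipeline built");

        // The terminal operation collect() starts the traversal.
        List<String> result = shortNames.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The entry "pipeline built" comes first in the log, ahead of every "test ..." entry, which is exactly what the documentation means by an intermediate operation being lazy.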

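Paragraph 3: Terminal operations are eager. Once execution reaches the terminal operation, the pipeline starts traversing the data source and running the intermediate operations, without waiting for anything else. Once the terminal operation of a pipeline has completed, the pipeline has done its job. If you have another terminal operation to run, the old pipeline can no longer be used: you have to go back to the data source, create a new stream, and build the pipeline again. There is one sentence here that I really could not make sense of:

Only the terminal operations iterator() and spliterator() are not; these are provided as an "escape hatch" to enable arbitrary client-controlled pipeline traversals in the event that the existing operations are not sufficient to the task.

If any reader can explain it, I would be very grateful!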

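One possible reading of that sentence, offered as a sketch and not as an authoritative explanation: iterator() counts as a terminal operation, but it returns without traversing the source, and the caller then decides when and how far the pipeline is pulled. The class name IteratorEscapeHatchDemo and the log messages below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorEscapeHatchDemo {
    // Returns a log of events in the order they happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe");

        // iterator() is a terminal operation, yet nothing is traversed at this point.
        Iterator<String> it = names.stream()
            .filter(name -> { log.add("length " + name); return name.length() == 3; })
            .map(name -> { log.add("upper " + name); return name.toUpperCase(); })
            .iterator();
        log.add("iterator obtained");

        // The caller controls the traversal: pulling one element drives the
        // pipeline only as far as needed to produce that element.
        String first = it.next();
        log.add("first = " + first);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch the log begins with "iterator obtained", and "Jack" is never examined, because the caller stopped after one element. An eager terminal operation such as forEach would not offer that kind of control.
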
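Paragraph 4: This one praises the benefit of executing streams lazily: efficiency. The intermediate operations are fused into a single pass over the data, which keeps intermediate state to a minimum, and unnecessary work can be avoided. The example given is finding the first string longer than 1000 characters in a set of strings: thanks to the laziness of stream operations there is no need to examine every element, only enough of them to find one that satisfies the condition (admittedly, this can also be done without a stream pipeline, can't it?). Laziness matters most when the data source is infinite. That is where it becomes irreplaceable, because the collections of classic Java are finite. That is not today's goal, though, so I will not expand on it here.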
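
The infinite-source case can be sketched as follows. The class and method names (InfiniteSourceDemo, firstMultipleOfSevenAbove100) and the chosen condition are made up for illustration; Stream.iterate, peek, filter, and findFirst are standard API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class InfiniteSourceDemo {
    // Finds the first integer above 100 that is divisible by 7 and counts
    // how many source elements were examined along the way.
    static int firstMultipleOfSevenAbove100(AtomicInteger examined) {
        return Stream.iterate(1, n -> n + 1)          // infinite source: 1, 2, 3, ...
            .peek(n -> examined.incrementAndGet())    // count every element pulled
            .filter(n -> n > 100 && n % 7 == 0)
            .findFirst()                              // short-circuits on the first match
            .get();
    }

    public static void main(String[] args) {
        AtomicInteger examined = new AtomicInteger();
        int result = firstMultipleOfSevenAbove100(examined);
        System.out.println(result + " found after examining " + examined.get() + " elements");
    }
}
```

An eager "filter everything first" strategy could never finish on this source. The lazy pipeline stops as soon as findFirst has its answer.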

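The original documentation goes on to explain that some intermediate operations are stateful and others are stateless, and that the two kinds differ in how they traverse the source data. Interested readers can look into that on their own. Today we mainly want to see in what way intermediate operations are "lazy" and what that lazy process looks like.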
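
For readers who want a quick taste of that difference, here is a small sketch. The class name StatefulBarrierDemo and the method elementsPulled are made up for illustration. It compares how many elements are pulled from the source when a stateful operation, sorted(), sits in the pipeline and when it does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulBarrierDemo {
    // Returns the elements that passed through peek() before findFirst() had its answer.
    static List<String> elementsPulled(boolean sortFirst) {
        List<String> pulled = new ArrayList<>();
        Stream<String> stream = Stream.of("c", "a", "b").peek(pulled::add);
        if (sortFirst) {
            // sorted() is stateful: it must see every element before it can emit the first one.
            stream = stream.sorted();
        }
        stream.findFirst();
        return pulled;
    }

    public static void main(String[] args) {
        System.out.println(elementsPulled(false)); // [c]
        System.out.println(elementsPulled(true));  // [c, a, b]
    }
}
```

With only stateless operations, findFirst can stop after one element. With sorted() in between, the whole source has to be read first, even though the pipeline as a whole is still triggered only by the terminal operation.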

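The secret behind the laziness of Stream is that every use of a Stream chains several intermediate operations together and ends with one terminal operation. Methods such as map() and filter() are intermediate operations: calling them immediately returns another Stream object. Methods such as reduce() and findFirst() are terminal operations: only when they are called is the real work performed to obtain the required value.

Let's start from an example:

Say we need to print the first name of length 3, converted to upper case: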

import java.util.Arrays;
import java.util.List;

public class LazyStreams {
    private static int length(final String name) {
        System.out.println("getting length for " + name);
        return name.length();
    }
    private static String toUpper(final String name ) {
        System.out.println("converting to uppercase: " + name);
        return name.toUpperCase();
    }
    public static void main(final String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike", "Susan", "George", "Robert", "Julia", "Parker", "Benson");

        final String firstNameWith3Letters = names.stream()
            .filter(name -> length(name) == 3)
            .map(name -> toUpper(name))
            .findFirst()
            .get();

        System.out.println(firstNameWith3Letters);
    }
}

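You might think that the code above performs many operations on the names collection: first traverse the collection once to obtain all the names of length 3, then traverse the filtered result to convert those names to upper case, and finally take the first element of the upper-cased collection and return it. That is how classic, eager Java code would handle it.

For Stream operations, a better way to read the code is from right to left, or from bottom to top: each operation does only as much work as is actually needed. Read from the eager point of view, the code above might execute 15 steps: 12 length checks (one per name), 2 upper-case conversions (for Kim and Joe), and 1 findFirst.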
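
To see where a count of 15 could come from, here is a hand-written eager version of the same task. It is only a sketch of the eager mental model, not what the Stream API does; the class name EagerVersionDemo and the log messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EagerVersionDemo {
    // Returns one log entry per step performed.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike",
            "Susan", "George", "Robert", "Julia", "Parker", "Benson");

        // Pass 1: filter the whole list into an intermediate list.
        List<String> threeLetterNames = new ArrayList<>();
        for (String name : names) {
            log.add("getting length for " + name);
            if (name.length() == 3) {
                threeLetterNames.add(name);
            }
        }

        // Pass 2: convert every filtered name into a second intermediate list.
        List<String> upperCased = new ArrayList<>();
        for (String name : threeLetterNames) {
            log.add("converting to uppercase: " + name);
            upperCased.add(name.toUpperCase());
        }

        // Final step: pick the first element.
        log.add("first: " + upperCased.get(0));
        return log;
    }

    public static void main(String[] args) {
        List<String> log = run();
        log.forEach(System.out::println);
        System.out.println(log.size() + " steps");
    }
}
```

The eager version checks all twelve names and converts both Kim and Joe, although only Kim is ever needed.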

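But that is not what actually happens. Remember that Stream is very "lazy": it does not perform any unnecessary work. In fact, filter and map are only triggered when findFirst is called. And filter does not filter the whole collection in one go. It tests the elements one by one, and as soon as it finds an element that matches, it hands that element to the next intermediate operation, here map.

The console output shows what actually happens: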

getting length for Brad
getting length for Kate
getting length for Kim
converting to uppercase: Kim
KIM
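
For comparison, this output is what a single hand-written loop with an early exit would produce. The sketch below is only an illustration of the observable behaviour, not the JDK's internal implementation; the class name FusedLoopDemo is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FusedLoopDemo {
    // Returns the lines that would be printed to the console.
    static List<String> run() {
        List<String> lines = new ArrayList<>();
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike",
            "Susan", "George", "Robert", "Julia", "Parker", "Benson");

        String result = null;
        for (String name : names) {
            lines.add("getting length for " + name);        // the filter step
            if (name.length() != 3) {
                continue;                                   // rejected: move to the next element
            }
            lines.add("converting to uppercase: " + name);  // the map step
            result = name.toUpperCase();
            break;                                          // findFirst has its answer
        }
        lines.add(result);
        return lines;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Each element flows through the filter and map steps before the next element is touched, and the loop ends at the first match.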

To make the process easier to follow, we replace the lambda expressions with the classic Java form, that is, anonymous inner classes:

// Requires: import java.util.function.Function; import java.util.function.Predicate;
final String firstNameWith3Letters = names.stream()
            .filter(new Predicate<String>() {
                @Override
                public boolean test(String name) {
                    return length(name) == 3;
                }
            })
            .map(new Function<String, String>() {
                @Override
                public String apply(String name) {
                    return toUpper(name);
                }
            })
            .findFirst()
            .get();

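The execution flow is shown in the figure below:

It is easy to arrive at the earlier conclusion again: filter and map are only triggered when findFirst is called. And filter does not filter the whole collection in one go. It tests the elements one by one, and as soon as it finds an element that matches, it hands that element to the next intermediate operation, here map.

Once the terminal operation has the answer it needs, the whole computation ends. If it does not have an answer yet, it asks the intermediate operations to process more elements of the collection, until an answer is found or the whole collection has been processed.

The JDK fuses the intermediate operations into a single pass over the data; this is known as fusing (Fusing Operation). As a result, even in the worst case (no element in the collection matches), the collection is traversed only once, not several times as we might imagine. Perhaps this explains why the official documentation says "Processing streams lazily allows for significant efficiencies".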
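
The worst case can be checked with a small experiment. In the sketch below the class name NoMatchDemo and the condition "length 2" are made up for illustration: no name has two letters, so the pipeline has to look at every element.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class NoMatchDemo {
    // Returns a summary of how often each step ran and whether a result was found.
    static String run() {
        AtomicInteger lengthCalls = new AtomicInteger();
        AtomicInteger upperCalls = new AtomicInteger();
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike",
            "Susan", "George", "Robert", "Julia", "Parker", "Benson");

        Optional<String> first = names.stream()
            .filter(name -> { lengthCalls.incrementAndGet(); return name.length() == 2; })
            .map(name -> { upperCalls.incrementAndGet(); return name.toUpperCase(); })
            .findFirst();   // no get() here: the Optional is empty when nothing matches

        return "length=" + lengthCalls.get() + " upper=" + upperCalls.get()
            + " found=" + first.isPresent();
    }

    public static void main(String[] args) {
        System.out.println(run()); // length=12 upper=0 found=false
    }
}
```

Each of the twelve names is tested exactly once and map never runs, so the collection is walked a single time. Note that calling get() on the empty Optional, as the original example does, would throw NoSuchElementException in this situation.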

To see what happens under the hood, we can split the Stream operations above by type:

Stream<String> namesWith3Letters = names.stream()
    .filter(name -> length(name) == 3)
    .map(name -> toUpper(name));

System.out.println("Stream created, filtered, mapped...");
System.out.println("ready to call findFirst...");

final String firstNameWith3Letters = namesWith3Letters.findFirst().get();

System.out.println(firstNameWith3Letters);

// Output:
// Stream created, filtered, mapped...
// ready to call findFirst...
// getting length for Brad
// getting length for Kate
// getting length for Kim
// converting to uppercase: Kim
// KIM

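From the output we can see that after the intermediate operations have been declared on the Stream object, they have not been executed. Only when findFirst() is actually called do the intermediate operations run.

References:

The author is a bit lazy: the example above and the first two figures come from the CSDN blogger dm_vincent's post "[Java 8] (7) Making Use of the "Lazy" Operations of the Stream Type".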