java8-02-Stream-API

Date: 2019-12-08

Tags: java8, java, stream, api; Category: Java


0 Introduction to Stream

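Home address: java.util.stream.Stream<T>
Date of birth: arrived in the world together with Java 8
Main skills: enough to brag about for three days and three nights...
Main traits:
- Does not modify the input source
- Intermediate operations are lazy (lazy evaluation, deferred execution)
- A stream only does any work once it starts being consumed
- Implicit iteration
...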

Overall, a Stream feels like an evolved Iterator. The Java 8 source describes it like this:

A sequence of elements supporting sequential and parallel aggregate operations

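It makes it easy to traverse, filter, map, reduce, and slice collections. Each intermediate operation yields a new Stream and leaves the original data unchanged. These operations are also lazy: as far as possible, all intermediate operations are carried out in a single pass when the terminal operation runs.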
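The laziness can be observed directly. The following is a minimal sketch rather than anything from the original post (the class name LazyStreamDemo and the log list are made up for illustration): the lambdas record every element they see, and nothing is recorded until the terminal operation runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; the name is not from the article.
class LazyStreamDemo {

    // Returns a log of events so the evaluation order can be inspected.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Integer> source = Arrays.asList(1, 2, 3, 4);

        // Building the pipeline does not touch any element yet.
        Stream<Integer> pipeline = source.stream()
                .filter(i -> {
                    log.add("filter " + i);
                    return i % 2 == 0;
                })
                .map(i -> {
                    log.add("map " + i);
                    return i * 10;
                });
        log.add("pipeline built");

        // The terminal operation pulls elements through filter and map one at a time.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the log, "pipeline built" comes first, and the filter and map entries are interleaved element by element instead of running as two separate passes.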

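Compared with traditional manipulation of objects and data, Stream concentrates on computing over a flow of elements, which is somewhat similar to functional programming.

Just how impressive that evolution is, see for yourself.

Given this input data: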

List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);

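Compute the sum of the non-null odd numbers in the input, counting equal odd numbers only once.

Back when lambdas were still unborn, you might have implemented it like this:

int s = 0;
// put the values into a Set first to remove duplicates
Set<Integer> set = new HashSet<>(list);
for (Integer i : set) {
  // keep non-null odd values; (i & 1) == 1 tests for odd
  if (i != null && (i & 1) == 1) {
    s += i;
  }
}
System.out.println(s);

Once lambda and Stream join forces: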

int sum = list.stream().filter(e -> e != null && (e & 1) == 1).distinct().mapToInt(i -> i).sum();

1 Obtaining a Stream

Obtaining a Stream from lambda's other good friends

Since 1.8, interfaces may also contain methods marked default.

java.util.Collection<E> declares the following:

public interface Collection<E> extends Iterable<E> {
    // get a sequential stream
    default Stream<E> stream() {
        return StreamSupport.stream(spliterator(), false);
    }
    // get a parallel stream
    default Stream<E> parallelStream() {
        return StreamSupport.stream(spliterator(), true);
    }
}

java.util.Arrays declares the following:

public static <T> Stream<T> stream(T[] array) {
    return stream(array, 0, array.length);
}

public static IntStream stream(int[] array) {
    return stream(array, 0, array.length);
}

// other similar methods are not listed one by one

Example

List<String> strs = Arrays.asList("apache", "spark");
Stream<String> stringStream = strs.stream();

IntStream intStream = Arrays.stream(new int[] { 1, 25, 4, 2 });

Obtaining a Stream through the Stream interface

Stream<String> stream = Stream.of("hello", "world");
Stream<String> stream2 = Stream.of("haha");
Stream<HouseInfo> stream3 = Stream.of(new HouseInfo[] { new HouseInfo(), new HouseInfo() });

Stream<Integer> stream4 = Stream.iterate(1, i -> 2 * i + 1);

Stream<Double> stream5 = Stream.generate(() -> Math.random());

Note: Stream.iterate() and Stream.generate() produce infinite streams, so you usually need to apply limit() yourself.
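As a minimal sketch of bounding these two infinite sources (the class and method names are made up for illustration), limit() is applied before the terminal operation so that the pipeline can finish:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; the name is not from the article.
class InfiniteStreamDemo {

    // First n terms of the sequence produced by the seed 1 and i -> 2 * i + 1.
    static List<Integer> firstTerms(int n) {
        return Stream.iterate(1, i -> 2 * i + 1)
                .limit(n) // without limit, collect would never finish
                .collect(Collectors.toList());
    }

    // n random doubles taken from an otherwise endless generator.
    static List<Double> randoms(int n) {
        Stream<Double> endless = Stream.generate(Math::random);
        return endless.limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstTerms(5)); // [1, 3, 7, 15, 31]
        System.out.println(randoms(3).size()); // 3
    }
}
```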

2 Transforming a Stream

Filtering and slicing

This part is fairly straightforward; one example is enough.

// obtain the stream
Stream<String> stream = Stream.of(//
    null, "apache", null, "apache", "apache", //
    "github", "docker", "java", //
    "hadoop", "linux", "spark", "alifafa");

stream// drop nulls, keep strings that contain "a"
    .filter(e -> e != null && e.contains("a"))//
    .distinct()// deduplicate; relies on equals() and hashCode()
    .limit(3)// keep only the first three matches
    .forEach(System.out::println);// consume the stream

map/flatMap

Stream's map is declared as follows:

<R> Stream<R> map(Function<? super T, ? extends R> mapper);

That is, it takes an input (T: the element currently being iterated) and produces an output of another type (R).

Stream.of(null, "apache", null, "apache", "apache", //
          "hadoop", "linux", "spark", "alifafa")//
  .filter(e -> e != null && e.length() > 0)//
  .map(str -> str.charAt(0))// take the first character
  .forEach(System.out::println);
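The heading above also names flatMap. Where map produces exactly one output per input, flatMap maps each element to a stream and flattens the results into one stream. A minimal sketch, with a class name and sample data made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the article.
class FlatMapDemo {

    // Splits every line on spaces and flattens all words into one list.
    static List<String> words(List<String> lines) {
        return lines.stream()
                // map(line -> line.split(" ")) would yield a Stream<String[]>;
                // flatMap merges the per-line streams into a single Stream<String>
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [apache, spark, hadoop, linux, docker]
        System.out.println(words(Arrays.asList("apache spark", "hadoop linux docker")));
    }
}
```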

sorted

Sorting is also fairly intuitive. There are two forms:

// sort according to the elements' Comparable implementation
Stream<T> sorted();

// sort with a custom Comparator
Stream<T> sorted(Comparator<? super T> comparator);

Example:

List<HouseInfo> houseInfos = Lists.newArrayList(//
    new HouseInfo(1, "恒大星级公寓", 100, 1), //
    new HouseInfo(2, "汇智湖畔", 999, 2), //
    new HouseInfo(3, "张江汤臣豪园", 100, 1), //
    new HouseInfo(4, "保利星苑", 23, 10), //
    new HouseInfo(5, "北顾小区", 66, 23), //
    new HouseInfo(6, "北杰公寓", null, 55), //
    new HouseInfo(7, "保利星苑", 77, 66), //
    new HouseInfo(8, "保利星苑", 111, 12)//
);

// Sort by distance, then by browse count; null values sort last.
// sorted() is lazy, so a terminal operation (forEach here) is needed to run it.
houseInfos.stream()
    .sorted(Comparator
        .comparing(HouseInfo::getDistance, Comparator.nullsLast(Comparator.<Integer>naturalOrder()))
        .thenComparing(HouseInfo::getBrowseCount, Comparator.nullsLast(Comparator.<Integer>naturalOrder())))
    .forEach(System.out::println);

3 Terminating/Consuming a Stream

Predicate tests and basic statistics

List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);

// are all elements greater than zero?
System.out.println(list.stream().allMatch(e -> e > 0));
// is there any even number?
System.out.println(list.stream().anyMatch(e -> (e & 1) == 0));
// is no element less than zero?
System.out.println(list.stream().noneMatch(e -> e < 0));

// find the first element greater than or equal to 4
Optional<Integer> optional = list.stream().filter(e -> e >= 4).findFirst();
// if a value is present, run the action passed to ifPresent
optional.ifPresent(System.out::println);

// number of elements greater than or equal to 4
System.out.println(list.stream().filter(e -> e >= 4).count());
// get the minimum (the result is an Optional)
System.out.println(list.stream().min(Integer::compareTo));
// get the maximum (the result is an Optional)
System.out.println(list.stream().max(Integer::compareTo));
// convert to an IntStream first, and max no longer needs a comparator
System.out.println(list.stream().mapToInt(i -> i).max());

reduce

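The term reduce is hard to translate; in Chinese it is sometimes rendered as 规约 (reduction) or 汇聚 (aggregation).

In any case, it gathers the data left in the stream after a series of transformations into a single result, repeatedly applying a reduce function along the way.

reduce() has three overloads. Two of them take an identity value:

// does not return an Optional, because under normal circumstances the identity argument guarantees a non-null result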
T reduce(T identity, BinaryOperator<T> accumulator);

<U> U reduce(U identity,
             BiFunction<U, ? super T, U> accumulator,
             BinaryOperator<U> combiner);

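Example:

// walk the elements, repeatedly applying (i, j) -> i + j
Integer reduce = Stream.iterate(1, i -> i + 1)//1,2,3,...,10,...
    .limit(10)//
    .reduce(0, (i, j) -> i + j);//55

// the third overload takes no identity, so it returns an Optional
// (empty when the stream is empty)
Optional<Integer> reduce2 = Stream.iterate(1, i -> i + 1)//
    .limit(10)//
    .reduce((i, j) -> i + j);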
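The three-argument overload declared above has no example in the text. It is useful when the result type U differs from the element type T. A minimal sketch, assuming a made-up task of summing string lengths (the class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the name is not from the article.
class ReduceDemo {

    // Total number of characters across all words (T = String, U = Integer).
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .reduce(0,
                        (len, word) -> len + word.length(), // accumulator: fold one String into the running total
                        Integer::sum);                      // combiner: merge partial totals from parallel chunks
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("apache", "spark", "hadoop"))); // 17
    }
}
```

The combiner only matters when partial results have to be merged, as in a parallel stream, which is why the sketch uses parallelStream().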

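collect

This operation is easy to understand: as the name suggests, it collects the elements of a Stream into one place.

The most general (and least used) collect method

// the most powerful tool is often the least used; after all, this method takes some effort to understand
<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner);
// for the meaning of the parameters, see the example below

The one-argument version

<R, A> R collect(Collector<? super T, A, R> collector);

The key parts of the Collector interface (it is not a functional interface, so it cannot be written as a lambda) are as follows: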

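public interface Collector<T, A, R> {
    /**
     * Creates a new mutable result container.
     */
    Supplier<A> supplier();

    /**
     * Folds one element into the result container.
     */
    BiConsumer<A, T> accumulator();

    /**
     * Merges two partial containers (needed for parallel streams).
     */
    BinaryOperator<A> combiner();

    /**
     * Converts the container into the final result.
     */
    Function<A, R> finisher();

    /**
     * Hints such as CONCURRENT, UNORDERED, IDENTITY_FINISH.
     */
    Set<Characteristics> characteristics();

}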
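Although Collector itself cannot be written as a single lambda, its parts can: the static factory Collector.of assembles a Collector from a supplier, accumulator, combiner, and finisher. A minimal sketch, assuming a made-up string-joining collector (the names CollectorOfDemo and joinWith are illustrative; in practice Collectors.joining already covers this case):

```java
import java.util.StringJoiner;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical demo class; the name is not from the article.
class CollectorOfDemo {

    // A hand-built collector that joins strings with the given separator.
    static Collector<String, StringJoiner, String> joinWith(String sep) {
        return Collector.of(
                () -> new StringJoiner(sep), // supplier: new mutable container
                StringJoiner::add,           // accumulator: fold one element in
                StringJoiner::merge,         // combiner: merge two containers
                StringJoiner::toString);     // finisher: container to final result
    }

    public static void main(String[] args) {
        System.out.println(Stream.of("a", "b", "c").collect(joinWith("-"))); // a-b-c
    }
}
```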

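First, an example of the three-argument collect() method. Barring special circumstances, I bet that after seeing it you will never want to use it again...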

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
ArrayList<Integer> ret1 = numbers.stream()//
    .map(i -> i * 2)// double each value
    .collect(//
    () -> new ArrayList<Integer>(), // argument 1
    (list, e) -> list.add(e), // argument 2
    (list1, list2) -> list1.addAll(list2)// argument 3
);

/***
 * <pre>
 * The three arguments of collect() are:
 * 1. () -> new ArrayList<Integer>()
 *         creates a new collection to hold the result
 * 2. (list, e) -> list.add(e)
 *         list: the new collection created by argument 1
 *         e: the Stream element currently being iterated
 *         adds the element to the new collection
 * 3. (list1, list2) -> list1.addAll(list2)
 *         merges two collections
 * </pre>
 ***/

ret1.forEach(System.out::println);

Without lambdas, the equivalent code would look like this...

List<Integer> ret3 = numbers.stream()//
    .map(i -> i * 2)// double each value
    .collect(new Supplier<List<Integer>>() {
      @Override
      public List<Integer> get() {
        // only provides a collection to hold the elements
        return new ArrayList<>();
      }
    }, new BiConsumer<List<Integer>, Integer>() {
      @Override
      public void accept(List<Integer> list, Integer e) {
        // add the current element to the container returned by the first argument
        list.add(e);
      }
    }, new BiConsumer<List<Integer>, List<Integer>>() {
      @Override
      public void accept(List<Integer> list1, List<Integer> list2) {
        // merge the containers
        list1.addAll(list2);
      }
    });

ret3.forEach(System.out::println);

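Feeling a little queasy yet...

Likewise, when you call the Spark API from Java without lambdas, the code is even uglier than the above...

A free plug while we are here: see my post that implements the Spark HelloWorld in several versions, http://blog.csdn.net/hylexus/..., which shows how much happier the world is with lambdas...

However, once you understand the three-argument collect method, you can use constructor references and method references to make the code more concise: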

ArrayList<Integer> ret2 = numbers.stream()//
    .map(i -> i * 2)// double each value
    .collect(//
    ArrayList::new, //
    List::add, //
    List::addAll//
);

ret2.forEach(System.out::println);

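Using the Collectors utility (advanced statistics)

Both the three-argument and the one-argument collect() methods are fairly complex, and the one-argument version is the one used most often. Implementing that Collector yourself, though, is still unpleasant.

Fortunately, the Collectors for common collect operations are all provided in java.util.stream.Collectors. A very powerful tool...

The following examples all operate on this list: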

List<HouseInfo> houseInfos = Lists.newArrayList(//
    new HouseInfo(1, "恒大星级公寓", 100, 1), // compound ID, compound name, browse count, distance
    new HouseInfo(2, "汇智湖畔", 999, 2), //
    new HouseInfo(3, "张江汤臣豪园", 100, 1), //
    new HouseInfo(4, "保利星苑", 111, 10), //
    new HouseInfo(5, "北顾小区", 66, 23), //
    new HouseInfo(6, "北杰公寓", 77, 55), //
    new HouseInfo(7, "保利星苑", 77, 66), //
    new HouseInfo(8, "保利星苑", 111, 12)//
);

Right, let the showing off begin ^_^ ...

Extracting compound names

// get all compound names into a list
List<String> ret1 = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.toList());
ret1.forEach(System.out::println);

// get all compound names into a set to remove duplicates
// alternatively, call distinct() first and then collect into a List
Set<String> ret2 = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.toSet());
ret2.forEach(System.out::println);

// join all compound names with _^_
// 恒大星级公寓_^_汇智湖畔_^_张江汤臣豪园_^_保利星苑_^_北顾小区_^_北杰公寓_^_保利星苑_^_保利星苑
String names = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.joining("_^_"));
System.out.println(names);

// specify ArrayList as the collection type
ArrayList<String> collect = houseInfos.stream()
      .map(HouseInfo::getHouseName)
      .collect(Collectors.toCollection(ArrayList::new));

Minimum and maximum

// get the compound with the highest browse count
Optional<HouseInfo> ret3 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .collect(Collectors.maxBy((h1, h2) -> Integer.compare(h1.getBrowseCount(), h2.getBrowseCount())));
System.out.println(ret3.get());

// get the highest browse count
Optional<Integer> ret4 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .map(HouseInfo::getBrowseCount)// extract the browse count
  .collect(Collectors.maxBy(Integer::compare));// method reference that compares browse counts
System.out.println(ret4.get());

Count and sum

// get the total number of elements
// houseInfos.size() would do the job here; this only demonstrates the syntax
Long total = houseInfos.stream().collect(Collectors.counting());
System.out.println(total);

// sum of browse counts
Integer ret5 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .collect(Collectors.summingInt(HouseInfo::getBrowseCount));
System.out.println(ret5);

// sum of browse counts
Integer ret6 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .map(HouseInfo::getBrowseCount).collect(Collectors.summingInt(i -> i));
System.out.println(ret6);

// sum of browse counts
int ret7 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .mapToInt(HouseInfo::getBrowseCount)// convert to an IntStream, then call its sum() directly
  .sum();
System.out.println(ret7);

Average

// average browse count
Double ret8 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .collect(Collectors.averagingDouble(HouseInfo::getBrowseCount));
System.out.println(ret8);

// average browse count
OptionalDouble ret9 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .mapToDouble(HouseInfo::getBrowseCount)// convert to a DoubleStream, then call its average() directly
  .average();
System.out.println(ret9.getAsDouble());

Summary statistics

// get summary statistics
DoubleSummaryStatistics statistics = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)
  .collect(Collectors.summarizingDouble(HouseInfo::getBrowseCount));
System.out.println("avg:" + statistics.getAverage());
System.out.println("max:" + statistics.getMax());
System.out.println("sum:" + statistics.getSum());
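For integer data such as browse counts, the same information is also available without a Collector: IntStream.summaryStatistics() returns an IntSummaryStatistics. A minimal sketch on a plain list of counts (the class name and the sample values are assumptions for illustration, with the values copied from the browse counts in the list above):

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

// Hypothetical demo class; the name is not from the article.
class SummaryStatsDemo {

    // Summarizes the non-null counts in one pass.
    static IntSummaryStatistics summarize(List<Integer> browseCounts) {
        return browseCounts.stream()
                .filter(c -> c != null)      // skip missing values
                .mapToInt(Integer::intValue) // unbox into an IntStream
                .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = summarize(Arrays.asList(100, 999, 100, 111, 66, 77, 77, 111));
        System.out.println("count:" + s.getCount()); // 8
        System.out.println("min:" + s.getMin());     // 66
        System.out.println("max:" + s.getMax());     // 999
        System.out.println("sum:" + s.getSum());     // 1641
        System.out.println("avg:" + s.getAverage()); // 205.125
    }
}
```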

Grouping

// group by browse count
Map<Integer, List<HouseInfo>> ret10 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// filter out null browse counts
  .collect(Collectors.groupingBy(HouseInfo::getBrowseCount));
ret10.forEach((count, house) -> {
  System.out.println("BrowseCount:" + count + " " + house);
});

// multi-level grouping
// group by browse count first, then by distance band
Map<Integer, Map<String, List<HouseInfo>>> ret11 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null && h.getDistance() != null)//
  .collect(Collectors.groupingBy(
      HouseInfo::getBrowseCount,
      Collectors.groupingBy((HouseInfo h) -> {
          if (h.getDistance() <= 10)
            return "fairly near";
          else if (h.getDistance() <= 20)
            return "near";
          return "fairly far";
    })));

// the result looks roughly like this (map iteration order may differ)
ret11.forEach((count, v) -> {
  System.out.println("Browse count:" + count);
  v.forEach((desc, houses) -> {
    System.out.println("\t" + desc);
    houses.forEach(h -> System.out.println("\t\t" + h));
  });
});
/****
 * <pre>
    Browse count:66
        fairly far
            HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
    Browse count:100
        fairly near
            HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
            HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
    Browse count:999
        fairly near
            HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
    Browse count:77
        fairly far
            HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    Browse count:111
        near
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
        fairly near
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
 * </pre>
 ****/
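groupingBy also accepts downstream collectors other than another groupingBy. As a minimal sketch, the following counts how many compounds fall into each distance band with Collectors.counting(). The House class is a made-up stand-in for HouseInfo (which the post never defines), and the band labels and sample distances are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; none of these names come from the article.
class GroupingDemo {

    // Minimal stand-in for HouseInfo: only a name and a distance.
    static final class House {
        final String name;
        final int distance;

        House(String name, int distance) {
            this.name = name;
            this.distance = distance;
        }
    }

    // Classifier: maps a house to a distance band.
    static String band(House h) {
        if (h.distance <= 10) return "fairly near";
        if (h.distance <= 20) return "near";
        return "fairly far";
    }

    // Sample data with the same distances as the list used in this section.
    static List<House> sample() {
        return Arrays.asList(
                new House("A", 1), new House("B", 2), new House("C", 1), new House("D", 10),
                new House("E", 23), new House("F", 55), new House("G", 66), new House("H", 12));
    }

    // Groups by band and counts the members of each group.
    static Map<String, Long> countPerBand(List<House> houses) {
        return houses.stream()
                .collect(Collectors.groupingBy(GroupingDemo::band, Collectors.counting()));
    }

    public static void main(String[] args) {
        countPerBand(sample()).forEach((band, n) -> System.out.println(band + ": " + n));
    }
}
```

With the sample data this yields 4 for "fairly near", 1 for "near", and 3 for "fairly far".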

Partitioning

// partition by distance (two parts)
Map<Boolean, List<HouseInfo>> ret12 = houseInfos.stream()//
  .filter(h -> h.getDistance() != null)//
  .collect(Collectors.partitioningBy(h -> h.getDistance() <= 20));
/****
 * <pre>
    fairly far
            HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
            HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    fairly near
            HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
            HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
            HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 * </pre>
 ****/
ret12.forEach((t, houses) -> {
  System.out.println(t ? "fairly near" : "fairly far");
  houses.forEach(h -> System.out.println("\t\t" + h));
});

// two-level partition: first by distance, then by browse count
Map<Boolean, Map<Boolean, List<HouseInfo>>> ret13 = houseInfos.stream()//
  .filter(h -> h.getDistance() != null && h.getBrowseCount() != null)//
  .collect(
      Collectors.partitioningBy(h -> h.getDistance() <= 20,
          Collectors.partitioningBy(h -> h.getBrowseCount() >= 70))
);

/*****
 * <pre>
    fairly far
        fewer views
            HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
        more views
            HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    fairly near
        fewer views
        more views
            HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
            HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
            HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 * </pre>
 ****/

ret13.forEach((less, value) -> {
  System.out.println(less ? "fairly near" : "fairly far");
  value.forEach((moreCount, houses) -> {
    System.out.println(moreCount ? "\tmore views" : "\tfewer views");
    houses.forEach(h -> System.out.println("\t\t" + h));
  });
});
