Chapter 10, Scala Collections Basics (24): Creating a Lazy View on a Collection

Date: 2019-11-12



Problem

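You are working with a huge collection and want to create a lazy version of it, one whose elements are computed only when a result is actually calculated or returned.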

Solution

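Except for the Stream class, whenever you create an instance of a Scala collection class you are creating a strict version of the collection (every operation is executed immediately). This means that if you create a collection with a million elements, those elements are loaded into memory right away. That is normal in Java, but in Scala you can optionally create a view on the collection. A view makes the result nonstrict, or lazy. This changes the resulting collection so that when you call transformer methods on it, the computation runs only when the elements are actually accessed, rather than "immediately" as usual. (A transformer method is one that transforms an input collection into an output collection.)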

You can see the difference between creating a collection with and without view:

scala> val nums = 1 to 100
nums: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)

scala> val numsView = (1 to 100).view
numsView: scala.collection.SeqView[Int,scala.collection.immutable.IndexedSeq[Int]] = SeqView(...)

Creating a Range without view gives the result you would expect: a Range of 100 elements. The Range created with view, however, shows different output in the REPL: something called a SeqView.

The SeqView signature carries the following information:

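The element type of the collection is Int.
The second type parameter, scala.collection.immutable.IndexedSeq[Int], indicates the type of collection you get back when you use the force method to convert the view into a normal collection.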

You can see this when you force a view back into a normal collection:

scala> val numsView = (1 to 100).view
numsView: scala.collection.SeqView[Int,scala.collection.immutable.IndexedSeq[Int]] = SeqView(...)

scala> val x = numsView.force
x: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)

There are several ways to see the effect of using a view on a collection. First, look at the foreach method, which appears to behave no differently:

scala> (1 to 100).foreach(x => print(x + " "))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
scala> (1 to 100).view.foreach(x => print(x + " "))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

Both examples print the 100 elements of the collection right away. Because foreach is not a transformer method, the view has no effect on the result.

But when you call a transformer method, the results are dramatically different:

scala> (1 to 10).map(_ * 2)
res61: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

scala> (1 to 10).view.map(_ * 2)
res62: scala.collection.SeqView[Int,Seq[_]] = SeqViewM(...)

The results differ because map is a transformer method. The following code shows the difference more clearly:

scala> (1 to 10).map{x => {
     |   Thread.sleep(1000)
     |   x * 2
     | }}
res68: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

scala> (1 to 10).view.map{x => {
     |   Thread.sleep(1000)
     |   x * 2
     | }}
res69: scala.collection.SeqView[Int,Seq[_]] = SeqViewM(...)

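Without the view, the program waits about 10 seconds and then returns the result. With the view, it returns a scala.collection.SeqView immediately, because the mapping function has not been run yet.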
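The same contrast can be observed without timing anything, by counting how often the mapping function runs. The following is only a minimal sketch: the object name LazyMapSketch, the method callCounts, and the counter are made up for illustration, and toList is used instead of force on the assumption that it behaves the same way across Scala 2.12 and 2.13.

```scala
object LazyMapSketch {
  /** Returns how many times the mapping function had run
    * (after a strict map, after view.map, after forcing the view). */
  def callCounts(): (Int, Int, Int) = {
    var calls = 0 // hypothetical counter, incremented inside the mapping function

    // Strict: map applies the function to all 10 elements immediately.
    val strict = (1 to 10).map { x => calls += 1; x * 2 }
    val afterStrict = calls

    calls = 0
    // Nonstrict: building the mapped view does not run the function.
    val lazyView = (1 to 10).view.map { x => calls += 1; x * 2 }
    val afterView = calls

    // Converting the view to a strict collection triggers the computation.
    val forced = lazyView.toList
    val afterForce = calls

    (afterStrict, afterView, afterForce)
  }

  def main(args: Array[String]): Unit =
    println(callCounts()) // prints (10,0,10)
}
```

The middle value is the interesting one: after view.map the function has not been called at all.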

Discussion

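The Scala documentation says that a view "constructs only a proxy for the result collection, and its elements get constructed only as one demands them ... A view is a special kind of collection that represents some base collection, but implements all transformers lazily."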

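A transformer method is one that constructs a new collection from an existing one. Such methods include map, filter, reverse, and so on. When you use them, you are transforming an input collection into an output collection.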

This explains why foreach behaves the same with or without a view: it is not a transformer method. The map method and other transformer methods such as reverse, on the other hand, are evaluated lazily:

scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)

scala> l.view.reverse
res70: scala.collection.SeqView[Int,List[Int]] = SeqViewR(...)

Use cases

There are two main use cases for a view:

Performance
Treating a collection like a database view

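As for performance, suppose you find yourself having to work with a collection of a billion elements. You certainly don't want to run an algorithm over all one billion elements if you don't have to, so using a view makes sense here.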
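As a rough illustration of the performance case, the sketch below chains map, filter, and take on a view over a million-element Range and then materializes only the first few results. The names ViewTakeSketch, firstEvenSquares, and the evaluated counter are hypothetical. The exact number of elements evaluated may vary between Scala versions, so the sketch only relies on it being far smaller than the size of the Range.

```scala
object ViewTakeSketch {
  /** Returns the first n even squares and how many elements were squared. */
  def firstEvenSquares(n: Int): (List[Int], Int) = {
    var evaluated = 0 // hypothetical counter for the mapping function
    val result = (1 to 1000000).view
      .map { x => evaluated += 1; x * x } // not run until elements are demanded
      .filter(_ % 2 == 0)
      .take(n)
      .toList // forces only as much of the pipeline as take(n) needs
    (result, evaluated)
  }

  def main(args: Array[String]): Unit = {
    val (squares, evaluated) = firstEvenSquares(3)
    println(squares) // List(4, 16, 36)
    println(s"elements evaluated: $evaluated") // a handful, not one million
  }
}
```

Without the view, map would square all one million elements and filter would build a second large collection before take discarded almost everything.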

The second use case lets you treat a Scala view much like a database view. The following code shows how a Scala collection view can be used that way:

scala> val arr = (1 to 10).toArray
arr: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> val view = arr.view.slice(2, 5)
view: scala.collection.mutable.IndexedSeqView[Int,Array[Int]] = SeqViewS(...)

scala> arr(2) = 42

scala> view.foreach(println)
42
4
5

scala> view(0) = 10

scala> view(1) = 20

scala> view(2) = 30

scala> arr
res76: Array[Int] = Array(1, 2, 10, 20, 30, 6, 7, 8, 9, 10)

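Changing an element of the array changes the view, and changing the corresponding element in the view changes the array. When you want to modify a subset of a collection's elements, creating a view on the collection and modifying the elements through it is a very good way to do so.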
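The read-through half of this behavior can be sketched in a small program. The names ArrayWindowSketch, readThrough, and window are made up for illustration. The sketch deliberately only reads through the view: writing through a view, as in the transcript above, depends on the mutable views of older Scala versions, and to my understanding views in Scala 2.13 and later are read-only.

```scala
object ArrayWindowSketch {
  /** Returns the window's contents before and after the array is updated. */
  def readThrough(): (List[Int], List[Int]) = {
    val arr = Array(1, 2, 3, 4, 5)
    val window = arr.view.slice(1, 4) // a window onto arr(1) to arr(3), no copy

    val before = window.toList
    arr(1) = 42 // update the underlying array, not the view
    val after = window.toList // the view reads the array again

    (before, after)
  }

  def main(args: Array[String]): Unit =
    println(readThrough()) // prints (List(2, 3, 4),List(42, 3, 4))
}
```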

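Finally, don't make the mistake of assuming that a view saves memory. In both of the following cases, Array.range builds the entire array before view is ever called, so both fail with a "java.lang.OutOfMemoryError: Java heap space" error: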

scala> val a = Array.range(0,123456789)
java.lang.OutOfMemoryError: Java heap space

scala> val a = Array.range(0,123456789).view
java.lang.OutOfMemoryError: Java heap space

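One last remark: a view only defers execution. It uses as much memory as the underlying collection needs, and it traverses as many elements as the traversal requires. Put simply, a Scala view is like a database view, while not using a view is like creating a temporary table in a database. With a view, you don't have to re-apply the transformer methods when the underlying collection changes; the cost is that the transformers run again every time the view is used.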

scala> def compare(x: Int): Boolean = {
     |   println(s"compare $x and 5")
     |   return x < 5
     | }
compare: (x: Int)Boolean

scala> val l = List(1,2,3,4,5,6,7,8,9).view.filter(x => compare(x))
l: scala.collection.SeqView[Int,List[Int]] = SeqViewF(...)

scala> l.map(_ * 2)
res80: scala.collection.SeqView[Int,Seq[_]] = SeqViewFM(...)

scala> l.map(_ * 2).force
compare 1 and 5
compare 2 and 5
compare 3 and 5
compare 4 and 5
compare 5 and 5
compare 6 and 5
compare 7 and 5
compare 8 and 5
compare 9 and 5
res82: Seq[Int] = List(2, 4, 6, 8)
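The claim that the transformers run again on every use of a view can be checked by counting predicate calls across two traversals. This is only a sketch under assumed names (ViewRerunSketch, traversalCounts, checks), and it uses toList rather than force so that it does not depend on a particular Scala version.

```scala
object ViewRerunSketch {
  /** Returns the number of predicate calls after one and after two traversals. */
  def traversalCounts(): (Int, Int) = {
    var checks = 0 // hypothetical counter for the filter predicate
    val small = List(1, 2, 3, 4, 5, 6, 7, 8, 9).view.filter { x =>
      checks += 1
      x < 5
    }

    small.toList // first traversal runs the predicate on all 9 elements
    val afterFirst = checks

    small.toList // second traversal runs it on all 9 elements again
    val afterSecond = checks

    (afterFirst, afterSecond)
  }

  def main(args: Array[String]): Unit =
    println(traversalCounts()) // prints (9,18)
}
```

If the filtered result is needed more than once and the underlying collection does not change, materializing it once with toList and reusing that list avoids the repeated work.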