Scala - Iterator vs Stream vs View

Source of the question

https://stackoverflow.com/questions/5159000/stream-vs-views-vs-iterators

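Top answer

This answer was included in the FAQ section of the Scala documentation. I have tried to translate it while checking it against the source code; please forgive any shortcomings in the translation.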

First, they are all non-strict. That has a particular mathematical meaning related to functions, but, basically, means they are computed on-demand instead of in advance.

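In short: all three are non-strict (that is, lazy). The term has a precise mathematical meaning for functions, but in practice it means that elements are computed on demand rather than ahead of time.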

Stream is a lazy list indeed. In fact, in Scala, a Stream is a List whose tail is a lazy val. Once computed, a value stays computed and is reused. Or, as you say, the values are cached.

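Put differently, a Stream really is a lazy list: in Scala it is a List whose tail is a lazy val. Once an element has been computed it stays computed and is reused on later accesses, which is what the question calls caching.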

A good introductory article on Stream: http://cuipengfei.me/blog/2014/10/23/scala-stream-application-scenario-and-how-its-implemented/
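The caching behaviour described above can be observed with a counter. This is a minimal sketch, assuming Scala 2.12 or 2.13 (in 2.13 Stream still works but is deprecated in favour of LazyList); the object and value names are hypothetical and chosen only for illustration.

```scala
object StreamCachingDemo {
  var evaluations = 0 // counts how many times the mapping function runs

  // Stream.from(1) is infinite. Mapping it is safe because only the
  // elements that are actually requested get computed.
  val squares: Stream[Int] = Stream.from(1).map { n =>
    evaluations += 1
    n * n
  }

  // Forcing the first three elements runs the function for those elements only.
  val firstThree: List[Int] = squares.take(3).toList
  val afterFirstPass: Int = evaluations

  // Reading the same elements again reuses the cached cells,
  // so the counter does not change.
  val again: List[Int] = squares.take(3).toList
  val afterSecondPass: Int = evaluations
}
```

After both passes the counter has not moved between afterFirstPass and afterSecondPass, which is the "once computed, stays computed" property.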

An Iterator can only be used once because it is a traversal pointer into a collection, and not a collection in itself. What makes it special in Scala is the fact that you can apply transformation such as map and filter and simply get a new Iterator which will only apply these transformations when you ask for the next element.

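An Iterator can be used only once because it is a cursor that walks through a collection, not a collection itself. What sets Scala's Iterator apart is that transformations such as map and filter simply return a new Iterator. Note that the new Iterator applies those transformations only when the next element is requested.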
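A small sketch of both properties (lazy transformation and single use), assuming Scala 2.12 or 2.13. The names IteratorDemo, applied and so on are hypothetical.

```scala
object IteratorDemo {
  var applied = 0 // counts how many times the mapping function runs

  // map returns a new Iterator; the function has not run yet.
  val it: Iterator[Int] = List(1, 2, 3).iterator.map { n =>
    applied += 1
    n * 2
  }
  val appliedBeforeUse: Int = applied

  // Asking for one element runs the function exactly once.
  val first: Int = it.next()
  val appliedAfterNext: Int = applied

  // Draining the iterator consumes it; afterwards there is nothing left to read.
  val rest: List[Int] = it.toList
  val exhausted: Boolean = !it.hasNext
}
```

To traverse the data a second time you would have to ask the underlying collection for a fresh iterator.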

Scala used to provide iterators which could be reset, but that is very hard to support in a general manner, and they didn’t make version 2.8.0.

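Scala used to offer iterators that could be reset, but resetting is very hard to support in a general way, and they did not make it into version 2.8.0.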

Views are meant to be viewed much like a database view. It is a series of transformation which one applies to a collection to produce a “virtual” collection. As you said, all transformations are re-applied each time you need to fetch elements from it.

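Views are best thought of like a database view: a series of transformations applied to a collection to produce a "virtual" collection. As the question says, every transformation is re-applied each time elements are fetched from the view.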
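The re-application can be made visible with a counter. This is a minimal sketch, assuming Scala 2.12 or 2.13; ViewDemo and its members are hypothetical names. The type of the view is left to inference because it differs between those versions.

```scala
object ViewDemo {
  var applied = 0 // counts how many times the mapping function runs

  // Building the view only records the transformation; nothing runs yet.
  val incremented = List(1, 2, 3).view.map { n =>
    applied += 1
    n + 1
  }
  val appliedBeforeUse: Int = applied

  // Each traversal applies the function to the elements again,
  // because a view does not cache its results.
  val firstTraversal: List[Int] = incremented.toList
  val afterFirst: Int = applied
  val secondTraversal: List[Int] = incremented.toList
  val afterSecond: Int = applied
}
```

Contrast this with the Stream case, where a second traversal reuses the already computed elements.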

Both Iterator and views have excellent memory characteristics. Stream is nice, but, in Scala, its main benefit is writing infinite sequences (particularly sequences recursively defined). One can avoid keeping all of the Stream in memory, though, by making sure you don’t keep a reference to its head (for example, by using def instead of val to define the Stream).

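Both Iterator and views have excellent memory characteristics (memory here means memory usage, since neither keeps the elements it produces). Stream is fine too, but in Scala its main benefit is for writing infinite sequences, especially recursively defined ones [understanding this requires knowing how Stream itself works]. You can still avoid keeping the whole Stream in memory by making sure nothing holds a reference to its head, for example by defining the Stream with def instead of val.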

For the last example mentioned above, this answer gives a good explanation: https://stackoverflow.com/questions/13217222/should-i-use-val-or-def-when-defining-a-stream
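The def versus val distinction can be sketched as follows, assuming Scala 2.12 or 2.13. The names are hypothetical, and the memory effect itself is not something this snippet measures; it only shows that def produces a fresh Stream on every call while val pins a single head.

```scala
object StreamHeadDemo {
  // val: this field references the head for as long as the object lives,
  // so every element forced through it stays reachable.
  val natsVal: Stream[Int] = Stream.from(1)

  // def: each call builds a new Stream, so no long-lived reference to the
  // head exists and cells already passed can become eligible for collection.
  def natsDef: Stream[Int] = Stream.from(1)

  // Walking far into the def-based Stream does not leave a field
  // pointing at its first cell.
  val thousandth: Int = natsDef.drop(999).head

  val valIsShared: Boolean = natsVal eq natsVal // same instance every time
  val defIsFresh: Boolean = !(natsDef eq natsDef) // a new instance per call
}
```

The trade-off is that a def-based Stream loses the caching benefit across calls, since each call starts from scratch.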

Because of the penalties incurred by views, one should usually force it after applying the transformations, or keep it as a view if only few elements are expected to ever be fetched, compared to the total size of the view.

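Because of the overhead that views incur (this is my reading of "penalties"), you should usually call force once the transformations have been applied. Keep the result as a view only if you expect to fetch few elements relative to the total size of the view.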
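Both options can be sketched like this, assuming Scala 2.12 or 2.13. ViewForceDemo and slowDouble are hypothetical names. Note that force is deprecated in Scala 2.13, where toList or toVector is the usual way to materialize a view.

```scala
object ViewForceDemo {
  var applied = 0 // counts calls to the mapping function

  def slowDouble(n: Int): Int = { applied += 1; n * 2 }

  // Option 1: most elements will be used, so build the view,
  // chain the transformations, then materialize once with force.
  val forced: Seq[Int] =
    List(1, 2, 3, 4).view.map(slowDouble).filter(_ > 2).force

  // Option 2: only a few elements of a large view are needed,
  // so stay lazy and fetch just those.
  val appliedBeforeTake: Int = applied
  val firstThree: List[Int] =
    (1 to 1000000).view.map(slowDouble).take(3).toList
  val appliedForTake: Int = applied - appliedBeforeTake
}
```

In the second case the mapping function runs for only a handful of the million elements, which is when keeping the view pays off.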