Spark (3): Spark Operators

• RDD (Resilient Distributed Dataset)
• Five main properties:
  – A list of partitions
  – A function for computing each partition
  – A list of dependencies on other RDDs
  – Optionally, a Partitioner for key-value RDDs
• shu
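
To make the listed properties concrete, here is a minimal toy model in plain Python. It is only a sketch and does not use the Spark API: `ToyRDD`, `parallelize`, `map`, `partition_by`, and `collect` are hypothetical names that mirror Spark's vocabulary, and the model covers only the four properties spelled out above.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, NOT the Spark API)."""
    partitions: List[list]                       # a list of partitions
    compute: Callable[[list], Iterator]          # a function for computing each partition
    dependencies: List["ToyRDD"] = field(default_factory=list)  # parent RDDs
    partitioner: Optional[Callable[[object], int]] = None       # only for key-value data

    def map(self, f):
        # Narrow transformation: same partitions, compute chained onto the parent.
        parent = self
        return ToyRDD(
            partitions=self.partitions,
            compute=lambda part: (f(x) for x in parent.compute(part)),
            dependencies=[parent],
        )

    def partition_by(self, num_partitions):
        # Redistribute (key, value) pairs so that equal keys share a partition.
        def partitioner(key):
            return hash(key) % num_partitions

        buckets = [[] for _ in range(num_partitions)]
        for key, value in self.collect():
            buckets[partitioner(key)].append((key, value))
        return ToyRDD(buckets, iter, [self], partitioner)

    def collect(self):
        # Evaluate every partition with the compute function and flatten.
        return [x for part in self.partitions for x in self.compute(part)]


def parallelize(data, num_partitions):
    # Split a local list into roughly equal contiguous slices.
    size = max(1, -(-len(data) // num_partitions))  # ceiling division
    parts = [data[i * size:(i + 1) * size] for i in range(num_partitions)]
    return ToyRDD(parts, iter)


rdd = parallelize([1, 2, 3, 4, 5, 6], 3)
doubled = rdd.map(lambda x: x * 2)
pairs = parallelize([(1, "a"), (2, "b"), (3, "c"), (4, "d")], 2).partition_by(2)
```

In this sketch `doubled` keeps a reference to `rdd` in `dependencies`, which is the lineage information a real RDD uses to recompute a lost partition, while `pairs.partitioner` is set only because the data is keyed.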