A First Try at Spark Streaming
yum install nc.x86_64
./bin/spark-shell --total-executor-cores 34
Enter the following program in the spark-shell:
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("hostname", 9999, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()
ssc.awaitTermination()
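On the machine named hostname, once nc is installed, run: nc -lk 9999
Keep typing text into it, for example hello world. The machine where spark-shell was started will print, for each one-second batch, the number of times each word appeared in the input received during that batch.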
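The counts printed by wordCounts.print() cover only the lines received in the current one-second batch; they are not accumulated across batches. As a rough illustration, the same split, pair, and sum steps can be written against ordinary Scala collections, with no Spark involved. The helper name countWords and the sample batches below are made up for this sketch and are not part of the Spark API.

```scala
// Plain-Scala sketch of what one batch of the streaming job computes.
// `countWords` is a hypothetical helper used only for illustration.
def countWords(batch: Seq[String]): Map[String, Int] =
  batch
    .flatMap(_.split(" "))        // split each line into words, as in lines.flatMap
    .map(word => (word, 1))       // pair each word with a count of 1
    .groupBy(_._1)                // group the pairs by word
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum counts per word

// Two made-up batches, standing in for text typed into nc in consecutive seconds.
val batch1 = Seq("hello world", "hello spark")
val batch2 = Seq("hello again")

println(countWords(batch1))
println(countWords(batch2))
```

In the first batch hello is counted twice; in the second batch it is counted once, because each batch is counted independently, just as reduceByKey on the DStream is applied per batch.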
References
http://spark.apache.org/docs/latest/streaming-programming-guide.html
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/NetworkWordCount.scala