Scala建立SparkStreaming获取Kafka数据代码过程

正文

  首先打开spark官网,找一个本身用版本我选的是1.6.3的,而后进入SparkStreaming   ,经过搜索这个位置找到Kafka,html

  

    点击过去会找到一段Scala的代码    java

     import org.apache.spark.streaming.kafka._

     val kafkaStream = KafkaUtils.createStream(streamingContext, 
       [ZK quorum], [consumer group id], [per-topic number of Kafka partitions to consume])

 

     若是想看createStream方法,能够值经过SparkStreaming中的 Where to go from here 中看到,有Java,Scala,Python的documents选择本身编码的一种点击进去。我这里用的Scala,点击KafkaUtils进去后会看到这个类中有不少的方法,其中咱们要找的是createStream方法,看看有哪些重载。咱们把这个方法的解释赋值过来。apache

    

  defcreateStream(jssc: JavaStreamingContextzkQuorum: String, groupId: String, topics: Map[String, Integer])JavaPairReceiverInputDStream[String, String]

       Create an input stream that pulls messages from Kafka Brokers. Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.api

       jssc

    JavaStreamingContext objectapp

       zkQuorum

    Zookeeper quorum (hostname:port,hostname:port,..)oop

       groupId

    The group id for this consumer学习

       topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own threadui

       returns

    DStream of (Kafka message key, Kafka message value)this

 

 

    最后咱们在IDEA中写Scala获取Kafka代码编码

    

  def main(args: Array[String]): Unit = {
     val spark = SparkSession.builder()
    .appName(Constants.SPARK_APP_NAME_PRODUCT)
    .getOrCreate()
     val map = Map("topic" -> 1)
     val ssc = new StreamingContext(spark.sparkContext, Seconds(5))
     val createStream: ReceiverInputDStream[(String, String)] = KafkaUtils.createStream(ssc, "hadoop01:9092,hadoop02:9092,hadoop03:9092", "groupId", map, StorageLevel.MEMORY_AND_DISK_SER)
     val map1: DStream[String] = createStream.map(_._2)

  }

  

    

     简答的代码过程,由于还有一些后续的工做要作,因此只是简单的写了一些从Kafa获取数据的代码从官网查找的一个过程,也是怀着学习的态度与你们一块儿交流,但愿大牛们多多指点。

 

            i want to take you to travel ,this is my current mood

相关文章
相关标签/搜索