This article explains in detail how to set up a Spark cluster, build and deploy Spark with both Maven and SBT, and finally install IDEA on CentOS for development.
CentOS: 7
JDK: 1.8
Spark: 2.0.0, 2.2.0
Scala: 2.11.8
Hadoop: 2.7
Maven: 3.3.9
All of the steps below were performed personally and documented with detailed screenshots! (This article is original; please credit this page when reposting, thank you!)
This article continues from the previous one: Spark Cluster Setup + Maven/SBT Build and Deployment + IDEA Development (Part 1)
Download: https://www.jetbrains.com/idea/download/#section=linux
[spark@master ~]$ mv /home/spark/Downloads/ideaIC-2017.3.4.tar.gz /spark/soft/
[spark@master ~]$ cd /spark/soft/
[spark@master soft]$ tar -zxf ideaIC-2017.3.4.tar.gz
Start IDEA:
[spark@master soft]$ cd idea-IC-173.4548.28/
[spark@master idea-IC-173.4548.28]$ ./bin/idea.sh
OK - accept the license agreement - OK - choose a theme and other settings - enter the IDE.
On the plugin screen, choose the second item on the left, Scala; I picked the wrong one the first time o(╥﹏╥)o
Restart when finished.
Add the Spark jars.
Add a Java Library: select $SPARK_HOME/jars
/spark/work/mvn1/spark-2.0.0-bin-dev/jars
Create the directory:
src/main/scala
Create WordCount.scala.
Enter WordCount and choose Object as the kind.
Write in the program code:
package main.scala

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)
    // Read the input file given as the first argument
    val rdd = sc.textFile(args(0))
    // Split lines into words, pair each word with 1, and sum the counts per word
    val wordCount = rdd.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _)
    // Swap to (count, word), sort descending by count, then swap back
    val wordSort = wordCount.map(x => (x._2, x._1)).sortByKey(false).map(x => (x._2, x._1))
    // Write the sorted result to the path given as the second argument
    wordSort.saveAsTextFile(args(1))
    sc.stop()
  }
}
You can see that this code counts words. Next, configure the path of the file to be counted.
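The RDD pipeline above can be traced with ordinary Scala collections, no Spark required. The sketch below uses a hypothetical two-line input (standing in for the README.md contents) and mirrors each step; `reduceByKey` is simulated here with `groupBy` plus a sum, which is not how Spark implements it but produces the same per-word counts:

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical input standing in for the file read by sc.textFile
    val lines = Seq("spark is fast", "spark is fun")

    // flatMap(_.split(" "))  -> individual words
    // map(x => (x, 1))       -> (word, 1) pairs
    // reduceByKey(_ + _)     -> summed counts (groupBy + sum here)
    val wordCount = lines
      .flatMap(_.split(" "))
      .map(x => (x, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

    // map / sortByKey(false) / map -> sort by count, descending
    val wordSort = wordCount.toSeq.sortBy(-_._2)

    // Counts: spark -> 2, is -> 2, fast -> 1, fun -> 1
    wordSort.foreach(println)
  }
}
```

Each line printed corresponds to one `(word,count)` record that the Spark job would write via saveAsTextFile.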
Run - Edit Configurations...
Fill in the input and output paths, separated by a space:
file:///spark/work/mvn1/spark-2.0.0-bin-dev/README.md file:///home/spark/IdeaProjects/firsttest/src/main/scala
There are several ways to run it.
1. Run directly
Run-Run...
Choose to run WordCount.
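Note that running directly inside IDEA, rather than through spark-submit, typically fails with "A master URL must be set in your configuration", because nothing supplies the master. A common development-only workaround (an assumption on my part, not part of the original code) is to set local mode on the SparkConf:

```scala
// Development-only tweak (assumption): run Spark in local mode inside IDEA.
// Remove setMaster before packaging for the cluster, where spark-submit
// supplies the master URL instead.
val conf = new SparkConf()
  .setAppName("WordCount")
  .setMaster("local[*]")  // use all local cores
```

This fragment replaces the `val conf = ...` line in WordCount for in-IDE runs only.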
2. Package and run
Build-Build Artifacts...
After the build finishes, move the packaged jar into the Spark directory.
To be continued...