Spark deployment and related configuration

Spark standalone, single-machine mode:
1. Spark standalone on a single machine, running one master with multiple workers. Deployment configuration:
export SCALA_HOME=/opt/scala
export JAVA_HOME=/usr/local/jdk1.8.0_231

Bindings for the local installation:

export SPARK_MASTER_HOST=master
export SPARK_LOCAL_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=64G
#export SPARK_WORKER_PORT=8090
export SPARK_WORKER_CORES=30
export SPARK_WORKER_INSTANCES=4
#export SPARK_EXECUTOR_CORES=12
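2. Worker cores correspond to the number of CPU cores, and executors correspond to partitionCount. Spark launch settings: --num-executors 7 --executor-cores 6 with --executor-memory 6G, keeping executor cores and executor memory (in GB) at a one-to-one ratio. num-executors * executor-cores must not exceed the cores configured for the workers or the logical cores of the cluster machines.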
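The core-count rule in item 2 can be checked with a little shell arithmetic before submitting a job. The sketch below is only an illustration: the function names total_worker_cores and executors_fit are made up for this example, and it assumes every worker instance advertises the same number of cores, as with the SPARK_WORKER_INSTANCES and SPARK_WORKER_CORES values above.

```shell
# Sketch: check that num-executors * executor-cores fits in the cores
# advertised by the workers. Function names are illustrative only.

# Cores advertised on one host: worker instances * cores per worker.
total_worker_cores() {
  echo $(( $1 * $2 ))
}

# $1 = num-executors, $2 = executor-cores, $3 = available cores.
# Prints "fits" or "exceeds" and returns 0 or 1 accordingly.
executors_fit() {
  requested=$(( $1 * $2 ))
  if [ "$requested" -le "$3" ]; then
    echo "fits: requested ${requested} of $3 cores"
  else
    echo "exceeds: requested ${requested} of $3 cores"
    return 1
  fi
}

# Values from the settings above: 4 worker instances * 30 cores each.
available=$(total_worker_cores 4 30)
# Standalone submit used later in these notes: 12 executors * 10 cores.
executors_fit 12 10 "$available"
```

With the values from these notes the request uses exactly the 120 advertised cores, so any additional executor would not fit.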
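3. If shuffle read/write with Kafka is slow, the following Kafka broker settings can be tuned:
num.network.threads: number of threads the broker uses to receive requests from the network and send responses
num.io.threads: number of threads the broker uses to process requests, including disk I/O
# On JDK 8 and later, -XX:PermSize and -XX:MaxPermSize are ignored (PermGen was removed); they only matter on JDK 7 and earlier.
export KAFKA_HEAP_OPTS="-Xmx12G -Xms8G -Xmn4G -XX:PermSize=128m -XX:MaxPermSize=256m -XX:SurvivorRatio=6 -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"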
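The two thread settings in item 3 are broker properties and live in the broker's server.properties file. Before changing them it helps to see what is currently set. The helper below is a minimal sketch: get_prop is a made-up name, it assumes plain key=value lines with no spaces around the equals sign, and the demo runs against a throwaway file with example values because the real path depends on the Kafka installation.

```shell
# Sketch: print the value of a key from a Java-style properties file.
# get_prop is an illustrative helper, not part of Kafka.
get_prop() {
  # $1 = properties file, $2 = key (matched exactly, last match wins)
  awk -F= -v key="$2" '
    $1 == key { sub(/^[^=]*=/, ""); value = $0; found = 1 }
    END { if (found) print value }
  ' "$1"
}

# Demo against a scratch file with example values.
demo=$(mktemp)
printf 'num.network.threads=3\nnum.io.threads=8\n' > "$demo"
get_prop "$demo" num.network.threads
get_prop "$demo" num.io.threads
rm -f "$demo"
```

The broker has to be restarted after these values are edited in server.properties.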
Spark cluster mode:
Package:
spark-2.3.1-bin-without-hadoop.tgz
Extract:
$ tar -xf spark-2.3.1-bin-without-hadoop.tgz
$ mv spark-2.3.1-bin-without-hadoop /opt/spark
Go to the configuration directory and edit the files: cd /opt/spark/conf
vi slaves
vi spark-env.sh
vi log4j.properties
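Of the three files edited above, slaves is the simplest: in Spark 2.x it lists the hosts on which start-all.sh starts a worker, one hostname per line. The following is a sketch only. write_slaves is a made-up helper, and worker1 to worker3 are placeholder hostnames, not the ones used in this deployment. The demo writes to a scratch directory rather than /opt/spark/conf.

```shell
# Sketch: write a slaves file with one worker hostname per line.
# write_slaves and the hostnames below are hypothetical.
write_slaves() {
  # $1 = Spark conf directory, remaining arguments = worker hostnames
  conf_dir=$1
  shift
  printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Demo in a scratch directory (the real target here would be /opt/spark/conf).
scratch=$(mktemp -d)
write_slaves "$scratch" worker1 worker2 worker3
cat "$scratch/slaves"
rm -rf "$scratch"
```

Each listed host must be reachable from the master over SSH, since the start scripts log in to every host in the file.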
Start:
Change directory: cd /opt/spark/sbin
Run the start script:
./start-all.sh
Stop: ./stop-all.sh
Check the running status:
curl master:8080
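Right after start-all.sh the master web UI may need a few seconds before it answers, so a single curl can fail even though the start succeeded. The sketch below polls until the page responds. wait_for_master is a made-up helper name, and the retry count and two-second pause are arbitrary choices for illustration.

```shell
# Sketch: poll the standalone master web UI until it answers.
# wait_for_master is an illustrative helper; the URL defaults to the
# host and port used in these notes (master, 8080).
wait_for_master() {
  url=${1:-http://master:8080}
  attempts=${2:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -f: fail on HTTP errors, -o /dev/null: discard the page
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "master UI is up at $url"
      return 0
    fi
    i=$(( i + 1 ))
    sleep 2
  done
  echo "master UI did not respond at $url" >&2
  return 1
}
```

Usage: wait_for_master http://master:8080 5 && echo "cluster ready"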
Kill a Spark job (for jobs submitted to YARN): yarn application -kill application_1548989889721_0034

Launch in Spark-on-YARN mode:
spark-submit \
  --class org.common.Application \
  --name PRO-NAME \
  --master yarn \
  --deploy-mode cluster \
  --driver-cores 4 \
  --driver-memory 4G \
  --num-executors 7 \
  --executor-cores 6 \
  --executor-memory 6G \
  --conf spark.executor.extraJavaOptions="-XX:MaxDirectMemorySize=6G -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:SurvivorRatio=6 -XX:MaxGCPauseMillis=80 -XX:+AlwaysPreTouch -XX:InitiatingHeapOccupancyPercent=40 -XX:ParallelGCThreads=20 -XX:-OmitStackTraceInFastThrow -XX:+UseCompressedOops" \
  /home/app/pro-core.jar
Launch in Spark standalone mode:
spark-submit \
  --class com.common.Application \
  --name PRO-CORE \
  --master spark://master:7077 \
  --deploy-mode client \
  --driver-cores 4 \
  --driver-memory 4G \
  --num-executors 12 \
  --executor-cores 10 \
  --executor-memory 10G \
  --jars /home/app/pro-core.jar \
  --conf spark.executor.extraJavaOptions="-XX:MaxDirectMemorySize=6G -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:SurvivorRatio=6 -XX:MaxGCPauseMillis=80 -XX:+AlwaysPreTouch -XX:InitiatingHeapOccupancyPercent=40 -XX:ParallelGCThreads=20 -XX:-OmitStackTraceInFastThrow -XX:+UseCompressedOops" \
  /home/app/pro-core.jar
Note: --num-executors is a YARN option and is not applied by a standalone master in Spark 2.3. There the number of executors follows from --executor-cores together with --total-executor-cores (or spark.cores.max).
Submit a Spark batch job:
spark-submit \
  --class org.common.Application \
  --name PRO-BATCH \
  --master yarn \
  --deploy-mode cluster \
  --driver-cores 1 \
  --driver-memory 4G \
  --num-executors 3 \
  --executor-cores 4 \
  --executor-memory 4G \
  --conf spark.dynamicAllocation.maxExecutors=9 \
  /home/app/pro-batch.jar kpiKeShiHuaViewStat

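Spark web UI port: 8080; spark-submit job monitoring port: 4404 (the Spark default for the application UI is 4040, controlled by spark.ui.port); Spark static files: localhost:4404/static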