Most pre-built binary packages target Spark 2 / Scala 2.11, so here we build from source against the locally compatible versions of Spark, Scala, Hadoop, Hive, and YARN. See the previous post for fetching the source from git and troubleshooting the build. Below is the installation process once a suitable build is ready:
1. Edit zeppelin081/conf/zeppelin-env.sh:

export MASTER=local[2] #yarn-client
#export SCALA_HOME=/usr/share/scala
export SCALA_HOME=/opt/soft/scala-2.10.5
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
#export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
if [ -n "$HADOOP_HOME" ]; then
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi
#export SPARK_CONF_DIR=/etc/spark2/conf
export SPARK_CONF_DIR=/etc/spark/conf
export HIVE_CONF_DIR=/etc/hive/conf
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/etc/hive/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
  HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR
export ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/conf
#export ZEPPELIN_INTP_CLASSPATH_OVERRIDES=:/etc/hive/conf:/usr/share/java/mysql-connector-java.jar:/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly-1.6.0-cdh5.15.0-hadoop2.6.0-cdh5.15.0.jar:/opt/cloudera/parcels/CDH/jars/*:/opt/cloudera/parcels/CDH/lib/hive/lib/*:/opt/soft/zeppelin081/interpreter/spark/spark-interpreter-0.8.1.jar
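Since nearly every error later in these notes traces back to a missing or mismatched path, a quick existence check before starting the daemon can save a round of log-reading. This is a minimal sketch; `check_paths` is a hypothetical helper, not part of Zeppelin:

```shell
#!/bin/sh
# check_paths: verify each path passed as an argument exists,
# printing the missing ones; returns non-zero if any is absent.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "MISSING: $p" >&2
      missing=1
    fi
  done
  return $missing
}

# Example (against the CDH paths configured above):
# check_paths "$SPARK_HOME" "$HIVE_CONF_DIR" "$HADOOP_HOME/lib/native" \
#   && bin/zeppelin-daemon.sh restart
```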
2. Link the Hive config: ln -s /etc/hive/conf/hive-site.xml conf/
3. Change the web port in conf/zeppelin-site.xml.
4. bin/zeppelin-daemon.sh restart to start; the logs, run, and webapps directories are generated automatically.
5. Check the logs for errors:

vi logs/zeppelin-root-master.log:

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/collect/Queues
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.Queues
Fix: replace the guava jar with the matching version from the CDH lib directory:

cp /opt/cloudera/parcels/CDH/lib/hive/lib/guava-14.0.1.jar lib/

It still reported an error, asking for guava-21.
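The jar swaps in these notes all follow the same pattern: move the bundled jar aside with a `.bak` suffix, then copy the CDH version in, so the change stays reversible. A hypothetical sketch of that pattern (`swap_jar` is not a real tool; the example paths mirror the guava fix above):

```shell
#!/bin/sh
# swap_jar LIBDIR NEWJAR PREFIX
# Rename every PREFIX-*.jar in LIBDIR to *.jar.bak, then copy NEWJAR in,
# so the swap can be undone by restoring the .bak files.
swap_jar() {
  libdir=$1; newjar=$2; prefix=$3
  for old in "$libdir/$prefix"-*.jar; do
    [ -e "$old" ] && mv "$old" "$old.bak"
  done
  cp "$newjar" "$libdir/"
}

# Example, mirroring the guava replacement above:
# swap_jar lib /opt/cloudera/parcels/CDH/lib/hive/lib/guava-14.0.1.jar guava
```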
vi logs/zeppelin-root-master.out:

MultiException[java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/Versioned, java.lang.NoClassDefFoundError: org/glassfish/jersey/jackson/internal/jackson/jaxrs/json/JacksonJaxbJsonProvider]

Fix: replace the jackson jars with the matching versions from the CDH lib directory:
ls lib/|grep jackson
google-http-client-jackson-1.23.0.jar
google-http-client-jackson2-1.23.0.jar
jackson-annotations-2.8.0.jar.bak
jackson-core-2.8.10.jar.bak
jackson-core-asl-1.9.13.jar
jackson-databind-2.8.11.1.jar.bak
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.9.13.jar
jackson-module-jaxb-annotations-2.8.10.jar.bak
jackson-xc-1.8.8.jar
jersey-media-json-jackson-2.27.jar
[root@master zeppelin081]# cp /opt/cloudera/parcels/CDH/jars/jackson-annotations-2.1.0.jar lib/
[root@master zeppelin081]# cp /opt/cloudera/parcels/CDH/jars/jackson-core-2.1.0.jar lib/
[root@master zeppelin081]# cp /opt/cloudera/parcels/CDH/jars/jackson-databind-2.1.0.jar lib/
[root@master zeppelin081]# cp /opt/cloudera/parcels/CDH/jars/jackson-module-jaxb-annotations-2.1.0.jar lib/
After trying several versions without success, it turned out the Scala module jar was also needed:
cp /opt/cloudera/parcels/CDH/jars/jackson*2.2.3*.jar lib/
[root@master zeppelin081]# ls lib/jackson*
jackson-annotations-2.1.0.jar.bak
jackson-annotations-2.2.2.jar.bak
jackson-annotations-2.2.3.jar
jackson-annotations-2.3.1.jar.bak
jackson-annotations-2.8.0.jar.bak
jackson-core-2.1.0.jar.bak
jackson-core-2.2.2.jar.bak
jackson-core-2.2.3.jar
jackson-core-2.8.10.jar.bak
jackson-core-asl-1.9.13.jar
jackson-databind-2.1.0.jar.bak
jackson-databind-2.2.2.jar.bak
jackson-databind-2.2.3.jar
jackson-databind-2.8.11.1.jar.bak
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.9.13.jar
jackson-module-jaxb-annotations-2.1.0.jar.bak
jackson-module-jaxb-annotations-2.8.10.jar.bak
jackson-module-scala_2.10-2.2.3.jar
jackson-xc-1.8.8.jar
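With this many versions floating around, it is easy to leave two active copies of the same artifact on the classpath. A hypothetical duplicate check: strip the version from every non-`.bak` jackson jar and look for repeated artifact names; empty output means the listing above is clean.

```shell
#!/bin/sh
# dup_jars DIR
# Print jackson artifacts that have more than one active (non-.bak) jar in DIR.
# Empty output means no duplicates.
dup_jars() {
  ls "$1" | grep '^jackson' | grep -v '\.bak$' \
    | sed 's/-[0-9][0-9a-zA-Z.]*\.jar$//' \
    | sort | uniq -d
}

# Example: dup_jars lib
```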
Finally working!!!
====== Next, wire up and test each interpreter ======
spark interpreter:
master yarn-client
Dependencies
artifact | exclude
---|---
/usr/share/java/mysql-connector-java.jar |
/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly-1.6.0-cdh5.15.0-hadoop2.6.0-cdh5.15.0.jar |

sc.getConf.toDebugString.split("\n").foreach(println)
sqlContext.sql("show tables").show
%sql select area,count(cid) from default.dimcity group by area
presto interpreter:(new jdbc)
default.driver com.facebook.presto.jdbc.PrestoDriver
default.url jdbc:presto://master:19000/hive/
default.user root
Dependencies
artifact | exclude |
---|---|
com.facebook.presto:presto-jdbc:0.100 |
%presto
-- SHOW SCHEMAS
select area,count(cid) from default.dimcity group by area
phoenix interpreter:(new jdbc)
default.driver org.apache.phoenix.jdbc.PhoenixDriver
default.url jdbc:phoenix:master:2181:/hbase
default.user hdfs
Dependencies
artifact | exclude |
---|---|
org.apache.phoenix:phoenix-core:4.7.0-HBase-1.1 | |
org.apache.phoenix:phoenix-server-client:4.7.0-HBase-1.1 |
%phoenix
-- !tables -- not supported
-- SHOW SCHEMAS -- not supported
select * from SYSTEM.CATALOG
--dim_channels
--tc_district  // HBase table names must be uppercase to work here
hbase interpreter:
hbase.home | /opt/cloudera/parcels/CDH/lib/hbase |
hbase.ruby.sources | lib/ruby |
zeppelin.hbase.test.mode | false |
Dependencies (Zeppelin builds against HBase 1.0 by default; either rebuild with the right version specified, or load the jars below to override it)
artifact | exclude |
---|---|
/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.2.0-cdh5.15.0.jar | |
/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-common-1.2.0-cdh5.15.0.jar | |
/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.2.0-cdh5.15.0.jar |
%hbase
desc 'car_brand'
list
elasticsearch interpreter: see http://cwiki.apachecn.org/pages/viewpage.action?pageId=10030782 (the default is the transport client on port 9300)
elasticsearch.client.type http
elasticsearch.cluster.name tuanchees
elasticsearch.host 172.16.60.182
elasticsearch.port 9200
%elasticsearch search /
file interpreter:
hdfs.maxlength 1000
hdfs.url http://master:50070/webhdfs/v1/
hdfs.user
Dependencies
artifact exclude
/opt/cloudera/parcels/CDH/jars/jersey-client-1.9.jar
/opt/cloudera/parcels/CDH/jars/jersey-core-1.9.jar
/opt/cloudera/parcels/CDH/jars/jersey-guice-1.9.jar
/opt/cloudera/parcels/CDH/jars/jersey-server-1.9.jar
/opt/cloudera/parcels/CDH/jars/jersey-json-1.9.jar
%file ls /
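The file interpreter talks to the WebHDFS REST API, so hdfs.url can be verified with curl before wiring it into Zeppelin. A small sketch that builds the LISTSTATUS URL for such a check (`webhdfs_url` is a hypothetical helper, not part of Zeppelin or Hadoop):

```shell
#!/bin/sh
# webhdfs_url BASE PATH [USER]
# Build the WebHDFS LISTSTATUS URL for BASE (hdfs.url) and PATH,
# optionally adding user.name=USER (hdfs.user).
webhdfs_url() {
  base=${1%/}; path=$2; user=$3
  url="${base}${path}?op=LISTSTATUS"
  if [ -n "$user" ]; then
    url="${url}&user.name=${user}"
  fi
  echo "$url"
}

# Example:
# curl -s "$(webhdfs_url http://master:50070/webhdfs/v1 / hdfs)"
```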
flink interpreter:
host localhost
port 6123
%flink
val text = benv.fromElements(
  "In the time of chimpanzees, I was a monkey", // some lines of text to analyze
  "Butane in my veins and I'm out to cut the junkie",
  "With the plastic eyeballs, spray paint the vegetables",
  "Dog food stalls with the beefcake pantyhose",
  "Kill the headlights and put it in neutral",
  "Stock car flamin' with a loser in the cruise control",
  "Baby's in Reno with the Vitamin D",
  "Got a couple of couches, sleep on the love seat",
  "Someone came in sayin' I'm insane to complain",
  "About a shotgun wedding and a stain on my shirt",
  "Don't believe everything that you breathe",
  "You get a parking violation and a maggot on your sleeve",
  "So shave your face with some mace in the dark",
  "Savin' all your food stamps and burnin' down the trailer park",
  "Yo, cut it")
val counts = text.flatMap{ _.toLowerCase.split("\\W+") }.map { (_,1) }.groupBy(0).sum(1)
counts.collect().foreach(println(_))

// Streaming Example
// case class WordWithCount(word: String, count: Long)
// val text = env.socketTextStream(host, port, '\n')
// val windowCounts = text.flatMap { w => w.split("\\s") }
//   .map { w => WordWithCount(w, 1) }
//   .keyBy("word")
//   .timeWindow(Time.seconds(5))
//   .sum("count")
// windowCounts.print()

// Batch Example
// case class WordWithCount(word: String, count: Long)
// val text = env.readTextFile(path)
// val counts = text.flatMap { w => w.split("\\s") }
//   .map { w => WordWithCount(w, 1) }
//   .groupBy("word")
//   .sum("count")
// counts.writeAsCsv(outputPath)
=========spark-notebook============
spark-notebook is comparatively simple: download and unpack the build for Scala [2.10.5] Spark [1.6.0] Hadoop [2.6.0] {Hive ✓} {Parquet ✓}.
Link hive-site.xml the same way: ln -s /etc/hive/conf/hive-site.xml conf/
Change the port: vi conf/application.ini
After that it can be started completely untouched and configured later, but for convenient restarts I wrote a script:
bin/start.sh:

#!/bin/bash
export MASTER=local[2] #yarn-client
#export SCALA_HOME=/usr/share/scala
export SCALA_HOME=/opt/soft/scala-2.10.5
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
if [ -n "$HADOOP_HOME" ]; then
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi
export SPARK_CONF_DIR=/etc/spark/conf
export HIVE_CONF_DIR=/etc/hive/conf
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/etc/hive/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
  HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR

workdir=/opt/soft/spark-notebook
kill -9 `cat ${workdir}/RUNNING_PID`
rm -rf ${workdir}/derby.log ${workdir}/metastore_db ${workdir}/RUNNING_PID
${workdir}/bin/spark-notebook > snb.log 2>&1 &
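One rough edge in that script: the unconditional `kill -9` errors noisily on the first run, when no RUNNING_PID exists yet. A hypothetical gentler variant of the stop step (not part of spark-notebook):

```shell
#!/bin/sh
# stop_notebook WORKDIR
# Only signal the process if WORKDIR/RUNNING_PID exists,
# and remove the stale pid file afterwards.
stop_notebook() {
  pidfile="$1/RUNNING_PID"
  if [ ! -f "$pidfile" ]; then
    echo "not running"
    return 0
  fi
  kill "$(cat "$pidfile")" 2>/dev/null || true
  rm -f "$pidfile"
}

# Example: stop_notebook /opt/soft/spark-notebook
```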
At first it could not connect to Hive at all; it worked after setting the notebook metadata as below (for the metadata options see http://master151:9002/assets/docs/clusters_clouds.html):
{
  "name": "test",
  "user_save_timestamp": "1970-01-01T08:00:00.000Z",
  "auto_save_timestamp": "1970-01-01T08:00:00.000Z",
  "language_info": {
    "name": "scala",
    "file_extension": "scala",
    "codemirror_mode": "text/x-scala"
  },
  "trusted": true,
  "customLocalRepo": null,
  "customRepos": null,
  "customDeps": null,
  "customImports": [
    "import scala.util._",
    "import org.apache.spark.SparkContext._"
  ],
  "customArgs": null,
  "customSparkConf": {
    "spark.master": "local[2]",
    "hive.metastore.warehouse.dir": "/user/hive/warehouse",
    "hive.metastore.uris": "thrift://master:9083",
    "spark.sql.hive.metastore.version": "1.1.0",
    "spark.sql.hive.metastore.jars": "/opt/cloudera/parcels/CDH/lib/hadoop/../hive/lib/*",
    "hive.metastore.schema.verification": "false",
    "spark.jars": "/usr/share/java/mysql-connector-java.jar,/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly-1.6.0-cdh5.15.0-hadoop2.6.0-cdh5.15.0.jar",
    "spark.driver.extraClassPath": "/etc/spark/conf:/etc/spark/conf/yarn-conf:/etc/hadoop/conf:/etc/hive/conf:/opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-hdfs/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-mapreduce/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-yarn/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hive/lib/*:/opt/cloudera/parcels/CDH/jars/*:/opt/soft/spark-notebook/lib/*",
    "spark.executor.extraClassPath": "/etc/spark/conf:/etc/spark/conf/yarn-conf:/etc/hadoop/conf:/etc/hive/conf:/opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-hdfs/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-mapreduce/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-yarn/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hive/lib/*:/opt/cloudera/parcels/CDH/jars/*:/opt/soft/spark-notebook/lib/*"
  },
  "kernelspec": {
    "name": "spark",
    "display_name": "Scala [2.10.5] Spark [1.6.0] Hadoop [2.6.0] {Hive ✓} {Parquet ✓}"
  }
}
Error:

java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/AlreadyExistsException

It turned out spark-notebook is compiled against the Hive 1.2 metastore by default, while CDH ships 1.1; the versions are incompatible (hence the spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars settings above).