After packaging a Spark project with Maven and running the generated jar (e.g. java -jar DataAnalygis.jar hdfs://server1:8020/tasks/files), the following exception is thrown at runtime.
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: file
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:657)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:391)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:391)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:111)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:111)
at scala.Option.map(Option.scala:145)
Solutions:
1) Check the META-INF/services/org.apache.hadoop.fs.FileSystem file inside the generated jar: it must list the FileSystem implementations, in particular
org.apache.hadoop.fs.LocalFileSystem    # the class that handles the local file scheme
One way to keep this file intact when building an uber jar is shown in the pom sketch after this list.
2) Another possibility is that hadoop-hdfs.jar is missing from the classpath, but this is unlikely: when the project is packaged with Maven, a correct hadoop-client dependency (see the second sketch below) is normally configured already, so this error is usually not caused by that.
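The usual reason the services file ends up incomplete is that an uber-jar build lets the META-INF/services/org.apache.hadoop.fs.FileSystem entries from hadoop-common and hadoop-hdfs overwrite each other, so only one set of implementations survives. If the jar is built with the maven-shade-plugin, its ServicesResourceTransformer merges these files instead of overwriting them; a minimal sketch (the plugin version is only an example):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version> <!-- example version -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- concatenates the META-INF/services files from all dependencies
               instead of letting them overwrite one another -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

After repackaging, the merged file can be inspected with, for example, unzip -p DataAnalygis.jar META-INF/services/org.apache.hadoop.fs.FileSystem, which should now list both the LocalFileSystem and DistributedFileSystem entries.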
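For 2), the dependency that pulls in both hadoop-common and hadoop-hdfs (and with them the FileSystem service entries) is hadoop-client; a sketch of the declaration, with the version number only illustrative:

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- example version; use the one matching your cluster -->
  <version>2.2.0</version>
</dependency>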
In addition, this fix also applies when running a Hadoop jar that reports the same error.