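How to resolve the "Cannot load main class from JAR" error when submitting a Spark application in local mode
I am trying to submit a Spark application locally, but I get the following error.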
Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR file:
at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:657)
at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:221)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:116)
at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$1.<init>(SparkSubmit.scala:907)
at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:907)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
This is the command I use to submit the Spark application:
spark-submit word_count.py
I am not sure what I am missing. Any help would be greatly appreciated. Here is the script:
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split


def main():
    sparkSession = SparkSession.builder.appName("Word Count").getOrCreate()
    sparkSession.sparkContext.setLogLevel("ERROR")

    # NOTE: `path` is not defined anywhere in the script as posted; it must be
    # set to the input directory before this line runs.
    readStream = sparkSession.readStream.format('text').load(path)
    print("-------------------------------------------------")
    print("Streaming source ready: ", readStream.isStreaming)
    readStream.printSchema()

    # Split each line on spaces and emit one row per word.
    words = readStream.select(explode(split(readStream.value, ' ')).alias('word'))
    wordCounts = words.groupBy('word').count().orderBy('count')

    # start() returns the StreamingQuery; awaitTermination() then blocks on it.
    query = wordCounts.writeStream.outputMode('complete').format('console').start()
    query.awaitTermination()


if __name__ == '__main__':
    main()