
Error running Spark 3.1.1 as the execution engine for Hive 3.1.2: java.lang.NoClassDefFoundError: org/apache/spark/unsafe/array/ByteArrayMethods

How do I fix the error java.lang.NoClassDefFoundError: org/apache/spark/unsafe/array/ByteArrayMethods when running Spark 3.1.1 as the execution engine for Hive 3.1.2?

I am running Spark on YARN on Ubuntu 20.04. Cluster versions:

  • Hadoop 3.2.2
  • Hive 3.1.2
  • Spark 3.1.1

I have created symlinks from Spark's jars into Hive's lib directory:

sudo ln -s $SPARK_HOME/jars/spark-network-common_2.12-3.1.1.jar $HIVE_HOME/lib/spark-network-common_2.12-3.1.1.jar
sudo ln -s $SPARK_HOME/jars/spark-core_2.12-3.1.1.jar $HIVE_HOME/lib/spark-core_2.12-3.1.1.jar
sudo ln -s $SPARK_HOME/jars/scala-library-2.12.10.jar $HIVE_HOME/lib/scala-library-2.12.10.jar
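
Note that the class named in the stack trace below, org.apache.spark.unsafe.array.ByteArrayMethods, lives in Spark's spark-unsafe module, which is not among the three jars linked above. A minimal sketch for checking which jar actually ships the class, assuming the same $SPARK_HOME/$HIVE_HOME layout as the commands above:

ls -l $HIVE_HOME/lib | grep spark    # Spark jars visible to Hive

# Search the distribution's jars for the missing class; in a stock
# Spark 3.1.1 prebuilt it is expected to be in spark-unsafe_2.12-3.1.1.jar
for j in $SPARK_HOME/jars/spark-*.jar; do
  unzip -l "$j" | grep -q 'org/apache/spark/unsafe/array/ByteArrayMethods' \
    && echo "found in $j"
done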

When I run Hive with Spark set as the execution engine, I get the following error:

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session 57f08f6b-02b7-4c3d-bf8c-4ec351a5fd34)'
2021-05-31T12:31:58,949 ERROR [a69d446a-f1a0-45d9-8dbc-c0fccbf718b3 main] spark.SparkTask: Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session 57f08f6b-02b7-4c3d-bf8c-4ec351a5fd34)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session 57f08f6b-02b7-4c3d-bf8c-4ec351a5fd34
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getHiveException(SparkSessionImpl.java:221)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:92)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:115)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/unsafe/array/ByteArrayMethods
        at org.apache.spark.internal.config.package$.<init>(package.scala:1095)
        at org.apache.spark.internal.config.package$.<clinit>(package.scala)
        at org.apache.spark.SparkConf$.<init>(SparkConf.scala:654)
        at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
        at org.apache.spark.SparkConf.set(SparkConf.scala:94)
        at org.apache.spark.SparkConf.set(SparkConf.scala:83)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.generateSparkConf(HiveSparkClientFactory.java:265)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:98)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:76)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:87)
        ... 24 more
Caused by: java.lang.ClassNotFoundException: org.apache.spark.unsafe.array.ByteArrayMethods
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 34 more

2021-05-31T12:31:58,950 ERROR [a69d446a-f1a0-45d9-8dbc-c0fccbf718b3 main] spark.SparkTask: Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session 57f08f6b-02b7-4c3d-bf8c-4ec351a5fd34)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session 57f08f6b-02b7-4c3d-bf8c-4ec351a5fd34
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getHiveException(SparkSessionImpl.java:221) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:92) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:115) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) ~[hive-cli-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) ~[hive-cli-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) ~[hive-cli-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) ~[hive-cli-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) ~[hive-cli-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) ~[hive-cli-3.1.2.jar:3.1.2]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323) ~[hadoop-common-3.2.2.jar:?]
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236) ~[hadoop-common-3.2.2.jar:?]
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/unsafe/array/ByteArrayMethods
        at org.apache.spark.internal.config.package$.<init>(package.scala:1095) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.spark.internal.config.package$.<clinit>(package.scala) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.spark.SparkConf$.<init>(SparkConf.scala:654) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.spark.SparkConf.set(SparkConf.scala:94) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.spark.SparkConf.set(SparkConf.scala:83) ~[spark-core_2.12-3.1.1.jar:3.1.1]
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.generateSparkConf(HiveSparkClientFactory.java:265) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:98) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:76) ~[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:87) ~[hive-exec-3.1.2.jar:3.1.2]
        ... 24 more

I downloaded the Spark build pre-built for Hadoop 3.2.0 and later, whose jars directory contains Hive 2.3.0 jars, while my Hive is 3.1.2, whose lib directory contains the 3.1.2 jars.
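
This mismatch can be checked directly from the two directories (a quick sketch, assuming the default layout from above):

ls $SPARK_HOME/jars | grep hive      # Hive jars bundled with the Spark prebuilt (2.3.x line)
ls $HIVE_HOME/lib  | grep hive-exec  # Hive's own jars (3.1.2)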

Solution

Hive on Spark is only tested with specific versions of Spark, so a given version of Hive is only guaranteed to work with a specific version of Spark. Other versions of Spark may work with a given version of Hive, but that is not guaranteed. Below is the list of Hive versions and their corresponding compatible Spark versions.

Please find the pairing in the Hive Spark Compatibility Chart. (The original answer embedded the chart as an image; the same table appears in the Hive on Spark: Getting Started wiki.)

Please use Spark 2.3.0, and note Spark 2.3.0's pom.xml, which contains the following dependency (spark-unsafe is the module that provides the missing org.apache.spark.unsafe.array.ByteArrayMethods class):

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-unsafe_2.11</artifactId>
  <version>2.3.0</version>
  <scope>compile</scope>
</dependency>
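
In practice this means installing a Spark 2.3.0 distribution alongside Hive 3.1.2 and re-pointing the symlinks at its jars. A rough sketch only: the URL follows the standard Apache archive layout, the Hadoop 2.7 prebuilt flavor and the /opt install path are assumptions not stated in the original answer, and Spark 2.3.0 is built against Scala 2.11, so the jar names change accordingly.

wget https://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
sudo tar -xzf spark-2.3.0-bin-hadoop2.7.tgz -C /opt
export SPARK_HOME=/opt/spark-2.3.0-bin-hadoop2.7

# Recreate the symlinks from the question against the 2.3.0 jars
# (check $SPARK_HOME/jars for the exact file names in your build)
sudo ln -sf $SPARK_HOME/jars/spark-network-common_2.11-2.3.0.jar $HIVE_HOME/lib/
sudo ln -sf $SPARK_HOME/jars/spark-core_2.11-2.3.0.jar $HIVE_HOME/lib/
sudo ln -sf $SPARK_HOME/jars/scala-library-*.jar $HIVE_HOME/lib/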
