How to fix a PySpark client in Jupyter that cannot connect to a standalone Spark cluster
After starting an Apache Spark cluster in standalone mode with the following commands:
node 1 (192.168.1.10):
./sbin/start-master.sh -h 192.168.1.10
node 2 (192.168.1.11):
./sbin/start-slave.sh spark://192.168.1.10:7077 -m 2g
node 3 (192.168.1.12):
./sbin/start-slave.sh spark://192.168.1.10:7077 -m 2g
I then created a PySpark session in a local Jupyter notebook to run some test jobs:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("pyspark-notebook") \
    .config('spark.app.name', 'my_app') \
    .master("spark://192.168.1.10:7077") \
    .getOrCreate()
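My understanding is that the executors need to connect back to the driver running inside the notebook, so I suspect the driver host may need to be set explicitly. Here is a sketch of what I mean (assuming the notebook machine's LAN IP is 192.168.1.20, which is only an example — I have not confirmed this is the right fix):

```python
from pyspark.sql import SparkSession

# Sketch: pin the address the driver advertises to the workers,
# so executors do not try to reach it at "localhost".
# 192.168.1.20 is a placeholder for the notebook machine's IP.
spark = (
    SparkSession.builder
    .appName("my_app")
    .master("spark://192.168.1.10:7077")
    .config("spark.driver.host", "192.168.1.20")      # IP reachable from the workers (assumption)
    .config("spark.driver.bindAddress", "0.0.0.0")    # listen on all interfaces
    .getOrCreate()
)
```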
After running this, my worker nodes show these logs:
21/06/16 16:50:28 INFO Worker: Executor app-20210616161818-0000/646 finished with state EXITED message Command exited with code 1 exitStatus 1
21/06/16 16:50:28 INFO ExternalShuffleBlockResolver: Clean up non-shuffle and non-RDD files associated with the finished executor 646
21/06/16 16:50:28 INFO ExternalShuffleBlockResolver: Executor is not registered (appId=app-20210616161818-0000,execId=646)
21/06/16 16:50:28 INFO Worker: Asked to launch executor app-20210616161818-0000/648 for my_app
21/06/16 16:50:28 INFO SecurityManager: Changing view acls to: root
21/06/16 16:50:28 INFO SecurityManager: Changing modify acls to: root
21/06/16 16:50:28 INFO SecurityManager: Changing view acls groups to:
21/06/16 16:50:28 INFO SecurityManager: Changing modify acls groups to:
21/06/16 16:50:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
21/06/16 16:50:28 INFO ExecutorRunner: Launch command: "/usr/local/openjdk-8/bin/java" "-cp" "/usr/bin/spark-3.0.0-bin-hadoop3.2/conf/:/usr/bin/spark-3.0.0-bin-hadoop3.2/jars/*" "-Xmx1024M" "-Dspark.driver.port=43016" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@192.168.1.10:43016" "--executor-id" "648" "--hostname" "192.168.1.12" "--cores" "1" "--app-id" "app-20210616161818-0000" "--worker-url" "spark://Worker@192.168.1.12:33518"
and these errors in the worker UI:
Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:44023
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:287)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
at org.apache.spark.rpc.netty.OutBox$$anon$1.call(OutBox.scala:202)
at org.apache.spark.rpc.netty.OutBox$$anon$1.call(OutBox.scala:198)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Is there a configuration error here?