I am running a Bash script on a Mac. The script invokes a Spark method written in Scala many times; currently I am trying to call this Spark method 100,000 times in a for loop.
After a small number of iterations (around 3,000), the code exits with the following exception.
```
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:518)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:547)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1877)
    at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:547)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
Exception in thread "dag-scheduler-event-loop"
16/11/22 13:37:32 WARN NioEventLoop: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
    at io.netty.util.internal.MpscLinkedQueue.offer(MpscLinkedQueue.java:126)
    at io.netty.util.internal.MpscLinkedQueue.add(MpscLinkedQueue.java:221)
    at io.netty.util.concurrent.SingleThreadEventExecutor.fetchFromScheduledTaskQueue(SingleThreadEventExecutor.java:259)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:346)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
java.lang.OutOfMemoryError: Java heap space
    at java.util.regex.Pattern.compile(Pattern.java:1047)
    at java.lang.String.replace(String.java:2180)
    at org.apache.spark.util.Utils$.getFormattedClassName(Utils.scala:1728)
    at org.apache.spark.storage.RDDInfo$$anonfun$1.apply(RDDInfo.scala:57)
    at org.apache.spark.storage.RDDInfo$$anonfun$1.apply(RDDInfo.scala:57)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.storage.RDDInfo$.fromRdd(RDDInfo.scala:57)
    at org.apache.spark.scheduler.StageInfo$$anonfun$1.apply(StageInfo.scala:87)
```
Workaround
Since this is an RpcTimeoutException, spark.network.timeout (which also governs spark.rpc.askTimeout) can be raised above its default to handle complex workloads. You can start from the values below and adjust them to fit your workload.
See the latest Spark configuration documentation (https://spark.apache.org/docs/latest/configuration.html):
spark.network.timeout (default: 120s): Default timeout for all network interactions. This config will be used in place of spark.core.connection.ack.wait.timeout, spark.storage.blockManagerSlaveTimeoutMs, spark.shuffle.io.connectionTimeout, spark.rpc.askTimeout or spark.rpc.lookupTimeout if they are not configured.
Also consider increasing executor memory (spark.executor.memory), and, most importantly, review your code to see whether it is a candidate for further optimization.
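As a minimal sketch of wiring these settings together when you build the context yourself (the app name, memory size, and heartbeat value here are illustrative assumptions, not tuned numbers):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch; all values are illustrative starting points.
val conf = new SparkConf()
  .setAppName("long-running-loop")                // hypothetical app name
  .set("spark.network.timeout", "600s")           // also covers spark.rpc.askTimeout etc. when they are unset
  .set("spark.executor.heartbeatInterval", "60s") // keep this well below spark.network.timeout
  .set("spark.executor.memory", "4g")             // hypothetical size; tune to your workload

val sc = new SparkContext(conf)
```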
Solution: the value 600s used below was chosen for this workload's requirements; pick a value that fits yours. It can be set in any of the following ways:
Set by SparkConf: conf.set("spark.network.timeout", "600s")
Set by spark-defaults.conf: spark.network.timeout 600s
Set when calling spark-submit: --conf spark.network.timeout=600s
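Whichever route you choose, it can help to confirm the setting actually took effect at runtime. A small check, assuming a live SparkContext named sc:

```scala
// Prints the effective value, or the noted default if the key was never set.
println(sc.getConf.get("spark.network.timeout", "120s (default)"))
```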