How to fix sparklyr failing to start on AWS EMR
I am getting this error in an AWS EMR Spark step:
Attaching package: ‘sparklyr’
The following object is masked from ‘package:stats’:
filter
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running `options(sparklyr.log.console = TRUE)` followed by `sc <- spark_connect(...)` for more debugging info.
Calls: spark_connect ... start_shell -> withCallingHandlers -> spark_connect_gateway
Execution halted
Command exiting with ret '1'
The code I am running is:
library(sparklyr)
Sys.setenv(SPARK_HOME="/usr/lib/spark/")
config <- spark_config()
sc <- spark_connect(master = "yarn", config = config, version = '1.6.2')
I have also tried using yarn-client instead of yarn, and changing the version, but I always get the same error. This is the EMR step configuration:
"Steps": [
  {
    "Name": "Pyramid cee",
    "ActionOnFailure": "TERMINATE_CLUSTER",
    "HadoopJarStep": {
      "Jar": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar",
      "Args": [
        "s3://bucket/emr/runner.sh"
      ]
    }
  }
]
Also, this is the content of runner.sh:
#!/bin/bash
aws s3 cp s3://bucket/emr/sparklyr.R /tmp/sparklyr.R
Rscript /tmp/sparklyr.R
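To narrow this down, runner.sh could be made more verbose before it calls Rscript. The following is only a diagnostic sketch, not a confirmed fix. It assumes that the SPARK_HOME from the R script (/usr/lib/spark/) should contain an executable bin/spark-submit, and it turns on the console logging that the error message itself suggests. The function names check_spark_home and run_with_console_log are hypothetical helpers introduced for illustration.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if DIR looks like a Spark install,
# i.e. it contains an executable bin/spark-submit.
check_spark_home() {
  local dir="${1%/}"   # drop a trailing slash, as in "/usr/lib/spark/"
  if [ -x "$dir/bin/spark-submit" ]; then
    echo "ok: $dir/bin/spark-submit found"
  else
    echo "missing: $dir/bin/spark-submit" >&2
    return 1
  fi
}

# Hypothetical helper: run an R script with sparklyr console logging enabled,
# as suggested by the error message. Both -e expressions are evaluated in the
# same R session, so the option is set before the script is sourced.
run_with_console_log() {
  Rscript -e 'options(sparklyr.log.console = TRUE)' -e "source('$1')"
}
```

In runner.sh these might be used as `check_spark_home /usr/lib/spark/ || exit 1` followed by `run_with_console_log /tmp/sparklyr.R` in place of the plain Rscript line, so that the step log shows whether Spark is where the script expects it and what the gateway printed before it stopped responding.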