py4j.protocol.Py4JNetworkError: Answer from Java side is empty

How to resolve py4j.protocol.Py4JNetworkError: Answer from Java side is empty

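This is the code I am running on Google Colab. It keeps getting stuck at the model.fit step and then throws this exception. I have not been able to find a solution anywhere. Colab's memory usage also climbs very high, and I am starting to think there is a memory leak in the Spark NLP library.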

import sparknlp
spark = sparknlp.start()

data = spark.read.csv("60days-ofdata.csv",header=True)

from sparknlp.pretrained import PretrainedPipeline
from sparknlp import Finisher
from pyspark.ml import Pipeline
from pyspark.ml.feature import StopWordsRemover, CountVectorizer

finisher = Finisher().setInputCols(["token","lemmas","pos"])
explain_pipeline_model = PretrainedPipeline("explain_document_ml").model

pipeline = Pipeline() \
    .setStages([
        explain_pipeline_model,finisher
        ])

model = pipeline.fit(data.select('text'))
annotations_finished_df = model.transform(data.select('text'))

remover = StopWordsRemover(inputCol="finished_lemmas",outputCol="filtered")
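# text_lemmas is never defined in this snippet; the Finisher output above is assumed to be the intended input
filtered_df = remover.transform(annotations_finished_df)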
filtered_df.show()

cv = CountVectorizer(inputCol="filtered",outputCol="features")
model = cv.fit(filtered_df.select('filtered'))  # <-- the error is thrown here
result = model.transform(filtered_df.select('filtered'))

Error

INFO:py4j.java_gateway:Error while receiving.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py",line 1207,in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py",in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception,another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py",line 1033,in send_command
    response = connection.send_command(command)
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py",line 1212,in send_command
    "Error while receiving",e,proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<ipython-input-8-0caf2f9be8f3> in <module>()
      5 
      6 cv = CountVectorizer(inputCol="filtered",outputCol="features")
----> 7 model = cv.fit(filtered_df.select('filtered'))
      8 result = model.transform(filtered_df.select('filtered'))
      9 result.show()

5 frames
/usr/local/lib/python3.7/dist-packages/py4j/protocol.py in get_return_value(answer,gateway_client,target_id,name)
    334             raise Py4JError(
    335                 "An error occurred while calling {0}{1}{2}".
--> 336                 format(target_id,".",name))
    337     else:
    338         type = answer[1]

Py4JError: An error occurred while calling o538.fit

Solution

mck gave a good answer. I will add that, starting with spark-nlp 3.0.0, you can address this by passing the memory parameter to the start() function,

import sparknlp
spark = sparknlp.start(memory="16G")

to get 16 GB of RAM for the driver. This will probably fix the problem.
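For spark-nlp versions where start() does not accept a memory argument, one possible workaround is to build the session by hand and set the driver memory yourself. The sketch below is only an illustration under stated assumptions, not necessarily what mck's answer proposed: build_session_conf and start_session are hypothetical helper names, and the Maven coordinate of the Spark NLP jar is left as a parameter because it has to match the version you installed.

```python
def build_session_conf(spark_nlp_package, driver_memory="16G"):
    """Collect Spark settings in a plain dict so they can be checked before use.

    spark_nlp_package is the Maven coordinate of the Spark NLP jar matching the
    installed Python package; it is deliberately not hard-coded here.
    """
    return {
        "spark.driver.memory": driver_memory,
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.jars.packages": spark_nlp_package,
    }


def start_session(conf, app_name="Spark NLP"):
    """Create a local SparkSession from the settings in conf."""
    # Imported lazily so build_session_conf works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master("local[*]")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Note that spark.driver.memory only takes effect if it is set before the driver JVM starts. If a session is already running in the Colab runtime, getOrCreate() returns that existing session, so restart the runtime before trying a new memory value.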
