TypeError: __dask_distributed_pack__() takes 3 positional arguments but 4 were given


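I have some code that converts a pandas DataFrame into a dask DataFrame and applies some operations to the rows. The code used to work fine, but it now crashes with what looks like an internal dask error. Does anyone know what the problem is?

Example: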

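import numpy as np
import pandas as pd
import dask.dataframe as dd

# Build a 4x2 pandas DataFrame, split it into 2 dask partitions, then compute it
x = pd.DataFrame(np.ones((4, 2)), columns=['a', 'b'])
df = dd.from_pandas(x, npartitions=2)
df.compute()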

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-2d44a675e56f> in <module>
      3 x = pd.DataFrame(np.ones((4,2)),columns=['a','b'])
      4 df = dd.from_pandas(x,npartitions=2)
----> 5 df.compute()

~/miniconda3/envs/research_env/lib/python3.9/site-packages/dask/base.py in compute(self,**kwargs)
    282         dask.base.compute
    283         """
--> 284         (result,) = compute(self,traverse=False,**kwargs)
    285         return result
    286 

~/miniconda3/envs/research_env/lib/python3.9/site-packages/dask/base.py in compute(*args,**kwargs)
    564         postcomputes.append(x.__dask_postcompute__())
    565 
--> 566     results = schedule(dsk,keys,**kwargs)
    567     return repack([f(r,*a) for r,(f,a) in zip(results,postcomputes)])
    568 

~/miniconda3/envs/research_env/lib/python3.9/site-packages/distributed/client.py in get(self,dsk,workers,allow_other_workers,resources,sync,asynchronous,direct,retries,priority,fifo_timeout,actors,**kwargs)
   2652         Client.compute : Compute asynchronous collections
   2653         """
-> 2654         futures = self._graph_to_futures(
   2655             dsk,
   2656             keys=set(flatten([keys])),

~/miniconda3/envs/research_env/lib/python3.9/site-packages/distributed/client.py in _graph_to_futures(self,user_priority,actors)
   2579             # Pack the high level graph before sending it to the scheduler
   2580             keyset = set(keys)
-> 2581             dsk = dsk.__dask_distributed_pack__(self,keyset,annotations)
   2582 
   2583             # Create futures before sending graph (helps avoid contention)

TypeError: __dask_distributed_pack__() takes 3 positional arguments but 4 were given

Answers

The code runs fine on my machine (macOS Big Sur):

python=3.8.10
pandas=1.2.4
dask=2021.5.0
distributed=2021.5.0
numpy=1.20.3

This is the exact code I ran on my machine:

import pandas as pd
import numpy as np
import dask.dataframe as dd
x = pd.DataFrame(np.ones((4,2)),columns=['a','b'])
df = dd.from_pandas(x,npartitions=2)
df.compute()
#      a    b
# 0  1.0  1.0
# 1  1.0  1.0
# 2  1.0  1.0
# 3  1.0  1.0

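Reinstalling the modules may help. Sometimes recreating the environment from scratch helps, because distributed is (sometimes) not updated at the same time as dask.

As @SultanOrazbayev mentioned, distributed had not been updated. Running the following line after installing all the packages fixed the problem.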
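A quick way to confirm this kind of mismatch is to compare the installed versions of the two packages. The snippet below is only a minimal sketch: the helper names installed_versions and versions_match are hypothetical, and it assumes dask and distributed are meant to carry the same version number, as the versions quoted in this thread suggest.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("dask", "distributed")):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def versions_match(found):
    """True only if every package is installed and all versions are identical.

    Assumption: dask and distributed should share one version number.
    """
    values = list(found.values())
    return None not in values and len(set(values)) == 1


if __name__ == "__main__":
    found = installed_versions()
    print(found)
    if not versions_match(found):
        print("dask and distributed differ; consider upgrading them together")
```

If the two versions differ, upgrading or reinstalling both packages together is the first thing to try.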

python -m pip install "dask[distributed]" --upgrade

The same thing happened with a conda env I created recently. I found that it had been created with distributed=2021.5.0 and dask=2021.4.0. I downgraded distributed in the environment.

conda install distributed=2021.4.0

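No more errors. :)

I hit this error after creating a new Python environment and installing dask with conda install dask. I then uninstalled it and reinstalled with conda install dask distributed -c conda-forge (as described in the dask docs: https://distributed.dask.org/en/latest/install.html). After that, the error went away.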
