How do I specify metadata in dask.dataframe.where?

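I am trying to solve a simple problem, but I am stuck on this metadata issue. The data I am working with grew to more than 30 GB after applying feature tools. I want to perform exactly the operation shown below, but I cannot get past the error.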

Please explain what this metadata actually is, why it is needed, and how to specify it. Please help me specify it for the problem below.

>>> df
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
>>> import numpy as np
>>> import dask.dataframe as dd
>>> cor = dd.from_pandas(df.corr(), npartitions=3)

>>> matrix = np.triu(np.ones((cor.shape[1], cor.shape[1])), k=1).astype(bool)
>>> matrix
array([[False,  True,  True],
       [False, False,  True],
       [False, False, False]])
>>> cor.where(matrix)

Traceback (most recent call last):
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\utils.py", line 174, in raise_on_meta_error
    yield
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\core.py", line 5139, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\utils.py", line 895, in __call__
    return getattr(obj, self.method)(*args, **kwargs)
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\pandas\core\generic.py", line 8919, in where
    cond, other, inplace, axis, level, errors=errors, try_cast=try_cast
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\pandas\core\generic.py", line 8661, in _where
    raise ValueError("Array conditional must be same shape as self")
ValueError: Array conditional must be same shape as self

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\core.py", line 2348, in where
    return map_partitions(M.where, self, cond, enforce_metadata=False)
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\core.py", line 5189, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\core.py", line 5139, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\utils.py", line 195, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `where`.

You have supplied a custom function and dask is unable to
determine the type of output that that function returns.

To resolve this please provide a meta= keyword.
The docstring of the dask function you ran should have more information.

Original error is below:
------------------------
ValueError('Array conditional must be same shape as self')

Traceback:
---------
  File "C:\Users\Na462\AppData\Local\Programs\Python\python37\lib\site-packages\dask\dataframe\utils.py",in _where
    raise ValueError("Array conditional must be same shape as self")

>>>
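
For context on the error above: in dask, `meta` is an empty pandas object carrying the column names and dtypes of the expected result, and when it is not given dask tries to infer it by running the operation on a small stand-in frame. The original error in the traceback suggests that the inference fails here because a raw NumPy array must match the frame's shape exactly, which a stand-in frame (or a single partition) does not. The following is only a sketch under that assumption, not a confirmed fix: the 2-row `stand_in` frame and the helper `upper_triangle_with_dask` are hypothetical names introduced for illustration, and the dask part assumes that `map_partitions` with an explicit `meta=` behaves as documented.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 7], "B": [2, 5, 8], "C": [3, 6, 9]})
corr = df.corr()

# Boolean mask that keeps only the strict upper triangle.
mask = np.triu(np.ones(corr.shape), k=1).astype(bool)

# Hypothetical stand-in for the small frame dask uses during metadata
# inference: same columns as corr, but a different number of rows.
stand_in = corr.iloc[:2]
error_message = None
try:
    stand_in.where(mask)  # a raw array must have exactly the frame's shape
except ValueError as exc:
    error_message = str(exc)

# A labelled mask is aligned on index and columns instead of on shape,
# so it works for the full frame and for any subset of its rows.
mask_df = pd.DataFrame(mask, index=corr.index, columns=corr.columns)
upper = corr.where(mask_df)
stand_in_result = stand_in.where(mask_df)


def upper_triangle_with_dask(corr, mask_df, npartitions=3):
    """Hypothetical helper: apply the labelled mask partition by partition."""
    import dask.dataframe as dd  # imported here so the pandas part runs alone

    cor = dd.from_pandas(corr, npartitions=npartitions)
    # meta: an empty frame with the column names and dtypes of the result.
    meta = corr.iloc[:0]
    result = cor.map_partitions(lambda part: part.where(mask_df), meta=meta)
    return result.compute()
```

With dask installed, `upper_triangle_with_dask(corr, mask_df)` is expected to return the same frame as `upper`. Passing `meta=` explicitly skips the inference step, so dask no longer has to run `where` on a stand-in frame to learn what the output looks like.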
