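How to convert a DataArray to a DataFrame
Is there a simple way to convert an xarray DataArray to a pandas DataFrame where I can specify which dimensions become the index and which become the columns? For example, suppose I have a DataArray: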
import xarray as xr
weather = xr.DataArray(
    name='weather',
    data=[['Sunny', 'Windy'], ['Rainy', 'Foggy']],
    dims=['date', 'time'],
    coords={
        'date': ['Thursday', 'Friday'],
        'time': ['Morning', 'Afternoon'],
    }
)
This gives:
<xarray.DataArray 'weather' (date: 2, time: 2)>
array([['Sunny', 'Windy'],
       ['Rainy', 'Foggy']], dtype='<U5')
Coordinates:
  * date     (date) <U8 'Thursday' 'Friday'
  * time     (time) <U9 'Morning' 'Afternoon'
Suppose I now want to move this into a pandas DataFrame indexed by date, with time as the columns. I can do this by calling .to_dataframe() and then .unstack() on the resulting DataFrame:
>>> weather.to_dataframe().unstack()
           weather
time     Afternoon Morning
date
Friday       Foggy   Rainy
Thursday     Windy   Sunny
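However, pandas sorts the labels, so instead of Morning followed by Afternoon I get Afternoon followed by Morning (and Friday before Thursday). I was hoping there would be an API like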
weather.to_dataframe(index_dims=[...],column_dims=[...])
that would do this reshaping for me, without my having to reorder the index and columns afterwards.
Solution
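For reference, the "reorder afterwards" step described above can be sketched as follows. This is only an illustration, assuming the same `weather` array as in the question; the variable name `wide` is made up here. It reindexes the unstacked frame against the DataArray's own coordinate indexes, so the original label order is restored regardless of how `unstack()` ordered them.

```python
import xarray as xr

weather = xr.DataArray(
    name="weather",
    data=[["Sunny", "Windy"], ["Rainy", "Foggy"]],
    dims=["date", "time"],
    coords={
        "date": ["Thursday", "Friday"],
        "time": ["Morning", "Afternoon"],
    },
)

# Select the "weather" column first so the unstacked columns are a plain
# Index of time labels rather than a (weather, time) MultiIndex.
wide = weather.to_dataframe()["weather"].unstack("time")

# unstack() may sort the labels; reindex against the DataArray's own
# coordinate indexes to restore the original order.
wide = wide.reindex(
    index=weather.indexes["date"],
    columns=weather.indexes["time"],
)
print(wide)
```

The same pattern should extend to more dimensions by unstacking the levels you want as columns and reindexing each axis, although that case is not shown here.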
In xarray 0.16.1, dim_order was added to .to_dataframe. Does this fit your needs?
xr.DataArray.to_dataframe(
    self,
    name: Hashable = None,
    dim_order: List[Hashable] = None,
) -> pandas.core.frame.DataFrame
Docstring:
Convert this array and its coordinates into a tidy pandas.DataFrame.
The DataFrame is indexed by the Cartesian product of index coordinates
(in the form of a :py:class:`pandas.MultiIndex`).
Other coordinates are included as columns in the DataFrame.
Parameters
----------
name
Name to give to this array (required if unnamed).
dim_order
Hierarchical dimension order for the resulting dataframe.
Array content is transposed to this order and then written out as flat
vectors in contiguous order, so the last dimension in this list
will be contiguous in the resulting DataFrame. This has a major
influence on which operations are efficient on the resulting
dataframe.
If provided, must include all dimensions of this DataArray. By default,
dimensions are sorted according to the DataArray dimensions order.
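A short sketch of what dim_order does in practice, assuming xarray 0.16.1 or later and the `weather` array from the question (the names `long_df` and `wide_df` are illustrative). Note that, as the docstring says, the result is still a tidy (long) DataFrame with a MultiIndex; dim_order only controls the order of the index levels, it does not pick which dimension becomes the columns. For a 2-D array like this one, `DataArray.to_pandas()` is a possible alternative that returns a wide DataFrame with the first dimension as the index and the second as the columns.

```python
import xarray as xr

weather = xr.DataArray(
    name="weather",
    data=[["Sunny", "Windy"], ["Rainy", "Foggy"]],
    dims=["date", "time"],
    coords={
        "date": ["Thursday", "Friday"],
        "time": ["Morning", "Afternoon"],
    },
)

# Long format: one row per (time, date) pair, with "time" as the outer
# index level because it comes first in dim_order.
long_df = weather.to_dataframe(dim_order=["time", "date"])
print(long_df)

# Wide format for a 2-D array: rows follow the first dimension ("date"),
# columns follow the second ("time"), both in coordinate order.
wide_df = weather.to_pandas()
print(wide_df)

# To swap rows and columns, transpose before converting.
wide_by_time = weather.transpose("time", "date").to_pandas()
```

Because `to_pandas()` builds the frame directly from the array and its coordinate indexes, no unstacking is involved, which is why the labels keep their original order in this sketch.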