
How do I properly time-index a NetCDF file using xarray or a pandas DataFrame index?

I am trying to time-index some buoy data so that I can resample it every 3 hours, reindex it onto a daily time range, and interpolate the data as needed. I had to np.squeeze the variables to drop the latitude and longitude dimensions, since I believe that is required to index properly by time (latitude and longitude are constant because this is data from a single buoy in the ocean). I have tried converting to a pandas DataFrame for the index, and the same with numpy and xarray. I believe this is a simple fix. My code is below:

import io
import urllib.request

import numpy as np
import pandas as pd
import xarray as xr

url = 'https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/46077/46077h2021.nc'
reqSpectra = urllib.request.Request(url)
with urllib.request.urlopen(reqSpectra) as respS:
    curFile = xr.open_dataset(io.BytesIO(respS.read()))

# Squeeze variables to remove latitude and longitude for indexing 
tpObs = curFile.variables['dominant_wpd'][:]
tpObs = np.squeeze(tpObs)
hsObs = curFile.variables['wave_height'][:]
hsObs = np.squeeze(hsObs)
mwdobs = curFile.variables['mean_wave_dir'][:]
mwdobs = np.squeeze(mwdobs)

# Set time up for indexing
tt = np.array(curFile['time'][:]).astype('datetime64[s]')

# Set Variables up to a data frame with time index
df_tp = pd.DataFrame(tpObs[:],index=tt)
df_hs = pd.DataFrame(hsObs[:],index=tt)
df_mwd = pd.DataFrame(mwdobs[:],index=tt)

# Resample on time interval of interest - i.e. every three hours ="3H"
res_tp = df_tp.set_index(df_tp.index).resample("3H").mean()      
res_hs = df_hs.set_index(df_hs.index).resample("3H").mean()
res_mwd = df_mwd.set_index(df_mwd.index).resample("3H").mean() 

# Reindex on the time range determined by max interval from model vs obs above
rind_tp = res_tp.reindex(timeRange)
rind_hs = res_hs.reindex(timeRange)
rind_mwd = res_mwd.reindex(timeRange)

# Interpolate
interp_tp = rind_tp.interpolate(limit_direction='both')   
interp_hs = rind_hs.interpolate(limit_direction='both')
interp_mwd = rind_mwd.interpolate(limit_direction='both')

It fails on the res_tp line - the resample to 3 hours. The full error is below:

---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-23-87e9174f7e31> in <module>
      1 # Resample on time interval of interest - i.e. every ten minutes='10T'
----> 2 res_tp = df_tp.set_index(df_tp.index).resample("3H").mean()
      3 res_hs = df_hs.set_index(df_hs.index).resample("3H").mean()
      4 res_mwd = df_mwd.set_index(df_mwd.index).resample("3H").mean()
      5 # Reindex on the time range determined by max interval from model vs obs above

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/resample.py in g(self,_method,*args,**kwargs)
    935     def g(self,_method=method,**kwargs):
    936         nv.validate_resampler_func(_method,args,kwargs)
--> 937         return self._downsample(_method)
    938 
    939     g.__doc__ = getattr(GroupBy,method).__doc__

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/resample.py in _downsample(self,how,**kwargs)
   1041         # we are downsampling
   1042         # we want to call the actual grouper method here
-> 1043         result = obj.groupby(self.grouper,axis=self.axis).aggregate(how,**kwargs)
   1044 
   1045         result = self._apply_loffset(result)

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/groupby/generic.py in aggregate(self,func,engine,engine_kwargs,**kwargs)
    949         func = maybe_mangle_lambdas(func)
    950 
--> 951         result,how = self._aggregate(func,**kwargs)
    952         if how is None:
    953             return result

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/base.py in _aggregate(self,arg,**kwargs)
    305 
    306         if isinstance(arg,str):
--> 307             return self._try_aggregate_string_function(arg,**kwargs),None
    308 
    309         if isinstance(arg,dict):

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/base.py in _try_aggregate_string_function(self,**kwargs)
    261         if f is not None:
    262             if callable(f):
--> 263                 return f(*args,**kwargs)
    264 
    265             # people may try to aggregate on a non-callable attribute

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/groupby/groupby.py in mean(self,numeric_only)
   1396             "mean",
   1397             alt=lambda x, axis: Series(x).mean(numeric_only=numeric_only),
-> 1398             numeric_only=numeric_only,
   1399         )
   1400

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/groupby/generic.py in _cython_agg_general(self,alt,numeric_only,min_count)
   1020     ) -> DataFrame:
   1021         agg_blocks,agg_items = self._cython_agg_blocks(
-> 1022             how,alt=alt,numeric_only=numeric_only,min_count=min_count
   1023         )
   1024         return self._wrap_agged_blocks(agg_blocks,items=agg_items)

~/anaconda3/envs/aoes/lib/python3.6/site-packages/pandas/core/groupby/generic.py in _cython_agg_blocks(self,min_count)
   1128 
   1129         if not (agg_blocks or split_frames):
-> 1130             raise DataError("No numeric types to aggregate")
   1131 
   1132         if split_items:

DataError: No numeric types to aggregate
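
For context, a plausible cause (not confirmed by the post) is that one or more DataFrame columns end up with a non-numeric dtype - for example, dominant_wpd carries a "seconds" units attribute, so xarray may decode it as timedelta64, which this pandas version cannot aggregate with mean(). A minimal diagnostic/workaround sketch, assuming the df_tp built above is in scope and that coercing the values to plain numbers is acceptable:

# Hypothetical check, not part of the original post: see what dtypes pandas
# actually holds; "No numeric types to aggregate" typically means object or
# timedelta64 columns.
print(df_tp.dtypes)

# Coerce the column(s) to numeric before resampling; values that cannot be
# converted become NaN, and timedelta64 values turn into integer nanoseconds.
df_tp_num = df_tp.apply(pd.to_numeric, errors='coerce')
res_tp = df_tp_num.resample("3H").mean()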
