
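Removing specific data from a data array

How to remove specific data from a data array

I am working with a data array that has time, latitude, and longitude dimensions. The data array looks like this: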

print (data)
<xarray.DataArray (lon: 2,lat: 2,time: 48)>
 array([[[9.38898492,6.65535271,3.92192596,1.83168364,9.91812091,9.72198563,0.23416978,............],.......

    [0.38138545,8.66420929,4.62462928,7.95165651,2.06577888,6.0229346,8.26839182,.........]])

 Coordinates:
    * lon      (lon) float64 -99.83 -99.32
    * lat      (lat) float64 42.25 42.21
    * time     (time) datetime64[ns] 2017-06-01 ... 2017-06-01T23:30:00

Each hour has two records, one at minute 00 and one at minute 30, so the time dimension looks like this:

<xarray.DataArray 'time' (time: 48)>
 array(['2017-06-01T00:00:00.000000000','2017-06-01T00:30:00.000000000','2017-06-01T01:00:00.000000000','2017-06-01T01:30:00.000000000','2017-06-01T02:00:00.000000000','2017-06-01T02:30:00.000000000','2017-06-01T03:00:00.000000000','2017-06-01T03:30:00.000000000','2017-06-01T04:00:00.000000000','2017-06-01T04:30:00.000000000','2017-06-01T05:00:00.000000000','2017-06-01T05:30:00.000000000','2017-06-01T06:00:00.000000000','2017-06-01T06:30:00.000000000','2017-06-01T07:00:00.000000000','2017-06-01T07:30:00.000000000','2017-06-01T08:00:00.000000000','2017-06-01T08:30:00.000000000','2017-06-01T09:00:00.000000000','2017-06-01T09:30:00.000000000','2017-06-01T10:00:00.000000000','2017-06-01T10:30:00.000000000','2017-06-01T11:00:00.000000000','2017-06-01T11:30:00.000000000','2017-06-01T12:00:00.000000000','2017-06-01T12:30:00.000000000','2017-06-01T13:00:00.000000000','2017-06-01T13:30:00.000000000','2017-06-01T14:00:00.000000000','2017-06-01T14:30:00.000000000','2017-06-01T15:00:00.000000000','2017-06-01T15:30:00.000000000','2017-06-01T16:00:00.000000000','2017-06-01T16:30:00.000000000','2017-06-01T17:00:00.000000000','2017-06-01T17:30:00.000000000','2017-06-01T18:00:00.000000000','2017-06-01T18:30:00.000000000','2017-06-01T19:00:00.000000000','2017-06-01T19:30:00.000000000','2017-06-01T20:00:00.000000000','2017-06-01T20:30:00.000000000','2017-06-01T21:00:00.000000000','2017-06-01T21:30:00.000000000','2017-06-01T22:00:00.000000000','2017-06-01T22:30:00.000000000','2017-06-01T23:00:00.000000000','2017-06-01T23:30:00.000000000'],dtype='datetime64[ns]')

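I want to keep only the data recorded at minute 00 of each hour and drop the data recorded at minute 30, so that the data looks like this:

print (data2)
<xarray.DataArray (lon: 2,lat: 2,time: 24)>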
array([[[9.38898492,.........]])

 Coordinates:
       * lon      (lon) float64 -99.83 -99.32
       * lat      (lat) float64 42.25 42.21
       * time     (time) datetime64[ns] 2017-06-01 ... 2017-06-01T23:00:00

The time dimension of the new data array (data2) would then look something like this:

array(['2017-06-01T00:00:00.000000000',...,'2017-06-01T23:00:00.000000000'],dtype='datetime64[ns]')

Is there a way to do this?

Here is code that reproduces the original data:

import numpy as np
from datetime import timedelta
import xarray as xr

# 48 half-hourly values per grid point, matching the 48 entries of the time coordinate
precipitation = 10 * np.random.rand(2, 2, 48)
lon = [-99.83, -99.32]
lat = [42.25, 42.21]
time = np.arange('2017-06-01', '2017-06-02', timedelta(minutes=30), dtype='datetime64[ns]')

data = xr.DataArray(
    data=precipitation, dims=["lon", "lat", "time"], coords=[lon, lat, time]
)

Thanks!

Solution

You can do this easily with the datetime components of the time values:

data2 = data.sel(time=data.time.dt.minute==0)

print(data2.time)

#<xarray.DataArray 'time' (time: 24)>
#array(['2017-06-01T00:00:00.000000000', '2017-06-01T01:00:00.000000000',
#       '2017-06-01T02:00:00.000000000', '2017-06-01T03:00:00.000000000',
#       '2017-06-01T04:00:00.000000000', '2017-06-01T05:00:00.000000000',
#       '2017-06-01T06:00:00.000000000', '2017-06-01T07:00:00.000000000',
#       '2017-06-01T08:00:00.000000000', '2017-06-01T09:00:00.000000000',
#       '2017-06-01T10:00:00.000000000', '2017-06-01T11:00:00.000000000',
#       '2017-06-01T12:00:00.000000000', '2017-06-01T13:00:00.000000000',
#       '2017-06-01T14:00:00.000000000', '2017-06-01T15:00:00.000000000',
#       '2017-06-01T16:00:00.000000000', '2017-06-01T17:00:00.000000000',
#       '2017-06-01T18:00:00.000000000', '2017-06-01T19:00:00.000000000',
#       '2017-06-01T20:00:00.000000000', '2017-06-01T21:00:00.000000000',
#       '2017-06-01T22:00:00.000000000', '2017-06-01T23:00:00.000000000'],
#      dtype='datetime64[ns]')
#Coordinates:
#  * time     (time) datetime64[ns] 2017-06-01 ... 2017-06-01T23:00:00
#
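To check the mask-based selection end to end, here is a minimal, self-contained sketch. It rebuilds the sample array with 48 half-hourly steps, using pd.date_range and a fixed random seed purely for convenience (neither appears in the question), and the names on_the_hour and every_other are illustrative. It also shows a positional alternative that only works under the assumption that the records are evenly spaced and start exactly on the hour.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Rebuild the sample data: 2 x 2 grid points, 48 half-hourly time steps.
time = pd.date_range("2017-06-01", periods=48, freq="30min")
data = xr.DataArray(
    10 * np.random.default_rng(0).random((2, 2, 48)),  # seed is arbitrary
    dims=["lon", "lat", "time"],
    coords={"lon": [-99.83, -99.32], "lat": [42.25, 42.21], "time": time},
)

# Boolean mask along time: True where the timestamp falls on a full hour.
on_the_hour = data.time.dt.minute == 0
data2 = data.sel(time=on_the_hour)

# Positional alternative (assumption: regular 30-minute spacing, first
# record on the hour): keep every second time step.
every_other = data.isel(time=slice(None, None, 2))

print(data2.sizes["time"])
print(data2.equals(every_other))
```

The mask version does not depend on the spacing of the records, so it is probably the safer choice if some half-hourly records might be missing.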

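You can also use resample. resample returns a resample object, on which you then call the pad method. Because every full hour already has a record, the forward fill picks up the value recorded at each hour mark:

data.resample(time='1H').pad()

Output: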
<xarray.DataArray (lon: 2,lat: 2,time: 24)>
array([[[0.93092321,8.9256469,2.0902752,1.46022299,9.63865453,3.06746535,2.84095699,9.4583144,4.81973945,1.85398961,5.6259217,0.73004426,8.48781372,8.67918668,7.19521316,6.67589949,2.07546901,1.4322415,2.13495418,4.37055217,8.85306247,4.43165936,4.0294716,1.69092842],[0.52261575,5.21821873,1.32905263,8.92984526,1.81558321,3.89992125,1.8788682,7.3124596,2.5068265,9.73076981,0.4511222,9.09497158,0.89253979,9.53972274,7.15277816,0.08596348,2.24376496,2.06680292,4.03876723,5.55558076,8.26049985,3.91292107,8.43491467,5.48503772]],[[8.34117163,1.44051784,2.78164548,8.55049381,9.43753831,7.35745785,1.22652596,9.55220335,0.99754358,9.3994966,7.92541645,2.68894144,9.61408994,7.34960423,2.74209431,4.19041801,8.92849725,9.98010787,9.16994776,4.75409515,3.10524118,5.12308453,8.61494954,1.63399851],[1.02355383,5.64350097,5.76928407,2.76870009,6.86109118,9.1430836,1.81166855,3.19906641,2.28457262,5.30030649,2.86022039,5.46551606,0.62270996,7.86203301,3.38400052,5.22623667,5.49521413,6.26552406,0.93926924,7.98750356,6.72156675,9.5673477,3.03319399,9.71812105]]])
Coordinates:
  * time     (time) datetime64[ns] 2017-06-01 ... 2017-06-01T23:00:00
  * lon      (lon) float64 -99.83 -99.32
  * lat      (lat) float64 42.25 42.21
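As a rough cross-check that the resample approach and the minute mask agree, the sketch below compares the two on rebuilt sample data. It makes a few assumptions that are not in the answer: it calls ffill(), which as far as I know is another name for the same forward fill as pad() on xarray resample objects; it spells the frequency "1h", since recent pandas versions warn about the uppercase "1H" alias; and it uses pd.date_range and a fixed seed for convenience.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same sample layout as in the question: 48 half-hourly time steps.
time = pd.date_range("2017-06-01", periods=48, freq="30min")
data = xr.DataArray(
    10 * np.random.default_rng(1).random((2, 2, 48)),  # seed is arbitrary
    dims=["lon", "lat", "time"],
    coords={"lon": [-99.83, -99.32], "lat": [42.25, 42.21], "time": time},
)

# Resample onto an hourly grid and forward-fill onto it.
# ffill() is assumed here to behave like pad() from the answer.
hourly = data.resample(time="1h").ffill()

# Reference result: keep only timestamps whose minute component is 0.
masked = data.sel(time=data.time.dt.minute == 0)

# Align dimension order before comparing raw values.
hourly = hourly.transpose(*masked.dims)
same_values = np.allclose(hourly.values, masked.values)
same_times = bool((hourly.time.values == masked.time.values).all())
print(same_values, same_times)
```

The two results only coincide because every hour mark has its own record. If a full-hour record were missing, the forward fill would silently reuse the preceding half-hour value, whereas the mask would simply have no entry for that hour.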
