What is the correct way to reproject a raster from one CRS to another with Python?

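I have a land cover raster (specifically /eodata/auxdata/S2GLC/2017/S2GLC_T32TMS_2017 from https://finder.creodias.eu) that uses 'epsg:32632' as its CRS. I want to reproject this raster to 'epsg:21781'. This is what the raster looks like when I open it with xarray: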

fn = 'data/S2GLC_T32TMS_2017/S2GLC_T32TMS_2017.tif'
da = xr.open_rasterio(fn).sel(band=1,drop=True)
da
<xarray.DataArray (y: 10980,x: 10980)>
[120560400 values with dtype=uint8]
Coordinates:
  * y        (y) float64 5.2e+06 5.2e+06 5.2e+06 ... 5.09e+06 5.09e+06 5.09e+06
  * x        (x) float64 4e+05 4e+05 4e+05 ... 5.097e+05 5.097e+05 5.098e+05
Attributes:
    transform:      (10.0,0.0,399960.0,-10.0,5200020.0)
    crs:            +init=epsg:32632
    res:            (10.0,10.0)
    is_tiled:       0
    nodatavals:     (nan,)
    scales:         (1.0,)
    offsets:        (0.0,)
    AREA_OR_POINT:  Area
    INTERLEAVE:     BAND

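My usual workflow is to transform all the point coordinates, create a destination grid, and interpolate with nearest neighbors. It looks something like this: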

import numpy as np
import xarray as xr
import pyproj
from scipy.interpolate import griddata

y = da.y.values
x = da.x.values
xx, yy = np.meshgrid(x, y)
# (n, 2) point coordinates in the original CRS
src_coords = np.column_stack([xx.flatten(), yy.flatten()])

transformer = pyproj.transformer.Transformer.from_crs('epsg:32632', 'epsg:21781')
xx, yy = transformer.transform(src_coords[:, 0], src_coords[:, 1])
# (n, 2) point coordinates in the destination CRS, which are not on a regular grid
dst_coords = np.column_stack([xx.flatten(), yy.flatten()])

# I define my destination **regular** grid coordinates
x = np.linspace(620005, 719995, 10)
y = np.linspace(199995, 100005, 10)
xx, yy = np.meshgrid(x, y)
dst_shape = xx.shape
dst_grid = np.column_stack([xx.flatten(), yy.flatten()])

# I interpolate onto the grid: the known points are the source pixels expressed
# in the destination CRS, and the query points are the regular grid nodes
reprojected_array = griddata(
    dst_coords, da.values.flatten(), dst_grid, method='nearest'
).reshape(dst_shape)

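Although this method is fairly transparent and (apparently) error-free, it can take a very long time when dealing with billions of points. Recently I discovered rasterio's reproject function, and I was stunned by how fast it is. This is how I implemented it: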

from rasterio.warp import reproject, Resampling

source = da.values
destination = np.zeros(dst_shape, np.int16)

res, aff = reproject(
    source, destination,
    src_transform=src_transform,  # affine transformation from original data
    src_crs=src_crs,
    dst_transform=dst_transform,  # affine transformation that corresponds to the grid defined in the other approach
    dst_crs=dst_crs,
    resampling=Resampling.nearest,  # using nearest neighbors just like with scipy's griddata
)
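Whether dst_transform really "corresponds to the grid defined in the other approach" depends on a half-pixel convention: the regular grid above lists pixel centers, while an affine transform in the (a, b, c, d, e, f) ordering used by rasterio maps (col, row) = (0, 0) to the outer corner of the first pixel. The following is only a sketch in plain Python, assuming the grid coordinates are pixel centers; the helper names transform_from_centers and apply_affine and the small 10 m grid are hypothetical and not part of the original code.

```python
def transform_from_centers(xs, ys):
    """Affine coefficients (a, b, c, d, e, f) for a regular grid, given the
    x and y coordinates of its pixel CENTERS.

    Convention assumed here:
        x = a * col + b * row + c
        y = d * col + e * row + f
    with (col, row) = (0, 0) at the upper-left CORNER of the first pixel.
    """
    dx = (xs[-1] - xs[0]) / (len(xs) - 1)  # pixel width
    dy = (ys[-1] - ys[0]) / (len(ys) - 1)  # negative when y decreases downwards
    # Move half a pixel from the first center out to the corner.
    c = xs[0] - dx / 2
    f = ys[0] - dy / 2
    return (dx, 0.0, c, 0.0, dy, f)


def apply_affine(coeffs, col, row):
    """Map fractional (col, row) indices to (x, y) coordinates."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Hypothetical 10 m grid: centers from 620005 to 620095 in x
# and from 199995 down to 199905 in y.
xs = [620005 + 10 * i for i in range(10)]
ys = [199995 - 10 * j for j in range(10)]
coeffs = transform_from_centers(xs, ys)

corner = apply_affine(coeffs, 0, 0)      # upper-left corner of the first pixel
center = apply_affine(coeffs, 0.5, 0.5)  # center of the first pixel
```

For this toy grid the corner lands on (620000, 200000), which should match what rasterio.transform.from_origin(620000, 200000, 10, 10) describes. If the transform were built from the first center (620005, 199995) instead, the two approaches would be comparing grids offset by half a pixel.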

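Naturally, I wanted to compare the results, expecting them to be identical, but they are not, as the figure shows.

The resolution is 10 m, so the difference is small, but after carefully comparing against accurate satellite data in 'epsg:21781' coordinates, it looks like the old method produces better results.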

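So my questions are:

Why do these results differ?
Is one method better than the other? Are there specific conditions under which one should be chosen over the other?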

Solution

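Not a complete answer (I haven't run your big .tif), but some possible clues:

Ask on gis.stackexchange.
A smaller test case is needed: do the two methods differ on a 20 x 20 subset of what you plotted, or on 20 x 1000?
What if some destination points fall exactly midway between two source points, like (0.5, 0.5) or (0.5, 0.5 + epsilon) on a unit grid? Then the two methods may pick different "nearest" points. (I don't know what epsilon or nearest-point engine rasterio uses.)
To see exactly which points are "nearest", encode the row/column as
da.values = np.arange( nr * nc ).reshape( nr,nc ) .
Added 11 Aug: are pixel centers or pixel corners being transformed? The docs for rasterio.transform xy() say that it returns the center of the pixel by default but can return a corner; the docs for rasterio.warp.reproject don't mention pixel size.
rasterio has an open issue, "warp.reproject() generate the wrong result, different from GDAL"; I don't know whether it is related.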
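The index-encoding and tie-breaking clues above can be tried on a toy grid. This is a minimal sketch, assuming a 4 x 5 source grid with unit spacing; the brute-force search and the nearest_label helper are hypothetical stand-ins and do not reproduce how griddata or rasterio resolve ties internally.

```python
import numpy as np

nr, nc = 4, 5  # tiny hypothetical source grid
# Each cell stores its own flat index, so any resampled value
# reveals which source pixel it was taken from.
labels = np.arange(nr * nc).reshape(nr, nc)

# Source pixel centers on a unit grid: x = 0..nc-1, y = 0..nr-1.
xx, yy = np.meshgrid(np.arange(nc), np.arange(nr))
src = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)


def nearest_label(point, order):
    """Brute-force nearest neighbour. `order` is the sequence in which the
    source points are scanned, so a tie goes to the first point scanned."""
    d2 = ((src[order] - point) ** 2).sum(axis=1)
    return labels.ravel()[order][np.argmin(d2)]


forward = np.arange(nr * nc)
backward = forward[::-1]

# A destination point exactly between four source centers: a four-way tie.
tie = np.array([0.5, 0.5])
picked_forward = nearest_label(tie, forward)
picked_backward = nearest_label(tie, backward)

# Decode the labels back into (row, col) of the chosen source pixel.
rc_forward = np.unravel_index(picked_forward, labels.shape)
rc_backward = np.unravel_index(picked_backward, labels.shape)

# A point that is clearly closest to one center is unaffected by scan order.
clear = np.array([2.1, 1.2])
picked_clear = nearest_label(clear, forward)
```

Here the two scan orders pick different source pixels for the tied point and the same pixel for the unambiguous one. Running both reprojection methods on a raster filled with such labels would show, pixel by pixel, where they disagree and by how many rows or columns.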
