Rasterio Affine Transforms: Principles and Practical GIS Data Processing

蕙风如薰

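1. Understanding Affine Transforms in Rasterio

In GIS and remote sensing, an affine transform is the key mathematical tool for mapping pixel coordinates (row, column) to georeferenced coordinates (x, y). Rasterio, a widely used Python library for raster data, represents it with the Affine class from the affine package, and its rasterio.transform module adds helper functions for building and applying these transforms.

Note: affine transforms are not specific to geographic data. The same principle applies anywhere image coordinates need to be mapped into another coordinate system.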

1.1 The Math Behind Affine Transforms

An affine transform is written as:

x = a * col + b * row + c
y = d * col + e * row + f

In matrix form:

[ x ]   [ a  b  c ]   [ col ]
[ y ] = [ d  e  f ] * [ row ]
                      [  1  ]

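The six parameters mean the following:

a: pixel width, i.e. the x step per column (east-west resolution)
b: row rotation/shear term, i.e. the change in x per row (usually 0)
c: x coordinate of the upper-left corner of the upper-left pixel
d: column rotation/shear term, i.e. the change in y per column (usually 0)
e: pixel height, i.e. the y step per row (north-south resolution, usually negative)
f: y coordinate of the upper-left corner of the upper-left pixel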

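In practice, most rasters are "north-up", in which case b and d are 0 and e is negative. For example, a standard UTM raster with 30 m resolution might have the following transform: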

from affine import Affine
transform = Affine(30.0, 0.0, 500000.0,
                   0.0, -30.0, 4500000.0)
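
To make the roles of the six coefficients concrete, here is a minimal pure-Python sketch that applies them by hand to the example transform above. The helper name pixel_to_world is my own for illustration and is not part of Rasterio or affine; it only mirrors the two equations.

```python
def pixel_to_world(coeffs, col, row):
    """Apply x = a*col + b*row + c and y = d*col + e*row + f."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Same coefficients as the 30 m UTM example, in (a, b, c, d, e, f) order
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)

corner = pixel_to_world(coeffs, 0, 0)      # upper-left corner of the first pixel
center = pixel_to_world(coeffs, 0.5, 0.5)  # center of the first pixel
print(corner, center)
```

Integer pixel coordinates land on pixel corners, so adding 0.5 to both col and row gives the pixel center.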

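1.2 Why Do We Need Affine Transforms?

Affine transforms solve three core problems:

Positioning: knowing the real-world location of every pixel in the image
Data alignment: bringing rasters from different sources and resolutions onto the same coordinate system
Spatial analysis: supporting spatial operations such as distance and area measurement

In one real project, I needed to align high-resolution drone imagery with satellite imagery. By understanding and applying affine transforms correctly, we achieved sub-meter alignment accuracy.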

2. Transform Operations in Rasterio

2.1 Basic Operations

Reading and inspecting the transform

import rasterio

with rasterio.open('data.tif') as src:
    print(src.transform)
    # Example output: | 30.00, 0.00, 500000.00|
    #                 | 0.00,-30.00, 4500000.00|
    #                 | 0.00, 0.00, 1.00|

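Converting coordinates

# Pixel coordinates -> georeferenced coordinates
# (integer col/row gives the pixel's upper-left corner; add 0.5 for its center)
col, row = 100, 200
x, y = transform * (col, row)

# Georeferenced coordinates -> pixel coordinates
x, y = 500100.0, 4499800.0
col, row = ~transform * (x, y)  # inverse transform; the result is fractional

Tip: when converting many coordinates, compute the inverse (~transform) once and reuse it instead of inverting on every call.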
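
The inverse mapping returns fractional pixel coordinates, so an extra step is needed to get the integer index of the pixel that contains a point. The sketch below does the inversion by hand in pure Python; world_to_pixel and pixel_index are illustrative names of my own, not library functions. Rasterio datasets offer src.index(x, y) for the same purpose.

```python
import math


def world_to_pixel(coeffs, x, y):
    """Invert the affine mapping and return fractional (col, row)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    dx, dy = x - c, y - f
    col = (e * dx - b * dy) / det
    row = (a * dy - d * dx) / det
    return col, row


def pixel_index(coeffs, x, y):
    """Integer (row, col) of the pixel that contains the point (x, y)."""
    col, row = world_to_pixel(coeffs, x, y)
    # Floor rather than round: a pixel covers [n, n + 1) in pixel space
    return math.floor(row), math.floor(col)


coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)
print(pixel_index(coeffs, 500100.0, 4499800.0))
```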

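2.2 Several Ways to Create a Transform

Method 1: specify the parameters directly

from affine import Affine
transform = Affine(30.0, 0.0, 500000.0,
                   0.0, -30.0, 4500000.0)

Method 2: build from a GDAL geotransform

# GDAL orders the coefficients as (c, a, b, f, d, e)
transform = Affine.from_gdal(500000.0, 30.0, 0.0,
                             4500000.0, 0.0, -30.0)

Method 3: use Rasterio's convenience function

from rasterio.transform import from_origin

# Known upper-left corner (west, north) and pixel size (xsize, ysize)
transform = from_origin(500000.0, 4500000.0, 30.0, 30.0)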
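
Mixing up the GDAL coefficient order and the Affine order is an easy mistake, so a small sketch of the reordering may help. The two helper names below are hypothetical and written in pure Python; they only shuffle the six numbers.

```python
def gdal_to_affine(geotransform):
    """Reorder a GDAL geotransform (c, a, b, f, d, e) into (a, b, c, d, e, f)."""
    c, a, b, f, d, e = geotransform
    return (a, b, c, d, e, f)


def affine_to_gdal(coeffs):
    """Reorder Affine-style coefficients (a, b, c, d, e, f) into GDAL order."""
    a, b, c, d, e, f = coeffs
    return (c, a, b, f, d, e)


gdal_gt = (500000.0, 30.0, 0.0, 4500000.0, 0.0, -30.0)
print(gdal_to_affine(gdal_gt))
```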

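2.3 Modifying and Combining Transforms

Affine transforms can be combined by matrix multiplication. An operation multiplied on the right is applied in pixel space, before the original transform:

from affine import Affine

# Double the pixel size (halves the resolution; the origin stays put)
new_transform = transform * Affine.scale(2, 2)

# Shift the origin by 100 columns and -100 rows (offsets are in pixel units)
new_transform = transform * Affine.translation(100, -100)

# Rotate by 45 degrees about the grid origin (use with care)
new_transform = transform * Affine.rotation(45)

Warning: rotation introduces non-zero b and d parameters, which can complicate some raster operations. In real projects I usually avoid rotating the raster directly and instead apply the rotation at display time or in later processing.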
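
Because multiplication order is easy to get wrong, here is a pure-Python sketch of what composing on the right does to the six coefficients. The compose, scale and translation helpers are stand-ins I wrote for illustration, not the affine package's implementation.

```python
def compose(m, n):
    """Coefficients of m * n: n is applied first (pixel space), then m."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * d2, a1 * b2 + b1 * e2, a1 * c2 + b1 * f2 + c1,
        d1 * a2 + e1 * d2, d1 * b2 + e1 * e2, d1 * c2 + e1 * f2 + f1,
    )


def scale(sx, sy):
    return (sx, 0.0, 0.0, 0.0, sy, 0.0)


def translation(tx, ty):
    return (1.0, 0.0, tx, 0.0, 1.0, ty)


base = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)
coarser = compose(base, scale(2, 2))             # pixel size doubles, origin unchanged
shifted = compose(base, translation(100, -100))  # origin moves by whole pixels
print(coarser)
print(shifted)
```

With 30 m pixels, a translation of (100, -100) pixels moves the origin 3000 m east and 3000 m north, which is quite different from a shift of 100 map units.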

3. Advanced Use Cases

3.1 Windows and Transforms

When working with large rasters, we often read only part of the data through a window:

import rasterio
from rasterio.windows import Window

with rasterio.open('large.tif') as src:
    # Window(col_off, row_off, width, height)
    window = Window(1000, 1000, 500, 500)
    win_transform = src.window_transform(window)
    
    # Save the clipped data with the window's transform
    profile = src.profile.copy()
    profile.update({
        'height': window.height,
        'width': window.width,
        'transform': win_transform
    })
    
    with rasterio.open('clip.tif', 'w', **profile) as dst:
        dst.write(src.read(window=window))
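
Conceptually, a window's transform keeps the pixel size and only moves the origin to the window's upper-left corner. The pure-Python sketch below shows that arithmetic; window_transform here is my own illustrative function, not the Rasterio method of the same name.

```python
def window_transform(coeffs, col_off, row_off):
    """Transform of a window whose upper-left pixel is (col_off, row_off)."""
    a, b, c, d, e, f = coeffs
    # The new origin is the georeferenced position of the window's corner
    new_c = a * col_off + b * row_off + c
    new_f = d * col_off + e * row_off + f
    return (a, b, new_c, d, e, new_f)


base = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)
print(window_transform(base, 1000, 1000))
```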

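3.2 Resampling and Updating the Transform

When the resolution changes, the transform must be updated to match:

import rasterio
from affine import Affine
from rasterio.enums import Resampling

def resample_raster(input_path, output_path, scale_factor):
    with rasterio.open(input_path) as src:
        # Compute the new size (int() truncates, so it may not be an exact multiple)
        new_height = int(src.height * scale_factor)
        new_width = int(src.width * scale_factor)
        
        # Scale the transform by the actual size ratio so the extent is preserved
        new_transform = src.transform * Affine.scale(
            src.width / new_width, src.height / new_height)
        
        # Update the metadata
        profile = src.profile.copy()
        profile.update({
            'height': new_height,
            'width': new_width,
            'transform': new_transform
        })
        
        # Resample while reading
        data = src.read(
            out_shape=(src.count, new_height, new_width),
            resampling=Resampling.bilinear
        )
        
        with rasterio.open(output_path, 'w', **profile) as dst:
            dst.write(data)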
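
One subtle point when updating the transform: int() truncates the new dimensions, so the effective scale can differ slightly from the nominal factor. Deriving the scale from the actual output shape keeps the raster's extent unchanged. A pure-Python sketch of that bookkeeping, with rescaled_grid as a hypothetical helper name:

```python
def rescaled_grid(coeffs, width, height, scale_factor):
    """Return (new_coeffs, new_width, new_height) covering the same extent."""
    a, b, c, d, e, f = coeffs
    new_width = int(width * scale_factor)
    new_height = int(height * scale_factor)
    if new_width < 1 or new_height < 1:
        raise ValueError("scale_factor is too small for this raster")
    # Ratio of old to new pixel counts, not 1 / scale_factor
    sx = width / new_width
    sy = height / new_height
    # Same result as multiplying the transform by scale(sx, sy) on the right
    return (a * sx, b * sy, c, d * sx, e * sy, f), new_width, new_height


base = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)
new_coeffs, new_w, new_h = rescaled_grid(base, 1001, 801, 0.5)
print(new_coeffs, new_w, new_h)
```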

3.3 Raster Alignment in Practice

Aligning a raster onto the grid of a reference raster with a different resolution:

import numpy as np
import rasterio
from rasterio.warp import reproject, Resampling

def align_to_reference(target_path, reference_path, output_path):
    with rasterio.open(reference_path) as ref:
        with rasterio.open(target_path) as src:
            # Prepare the output profile on the reference grid
            profile = src.profile.copy()
            profile.update({
                'height': ref.height,
                'width': ref.width,
                'transform': ref.transform,
                'crs': ref.crs
            })
            
            # Allocate the output array
            destination = np.zeros((src.count, ref.height, ref.width), 
                                 dtype=src.dtypes[0])
            
            # Reproject and resample onto the reference grid
            reproject(
                source=src.read(),
                destination=destination,
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=ref.transform,
                dst_crs=ref.crs,
                resampling=Resampling.bilinear
            )
            
            with rasterio.open(output_path, 'w', **profile) as dst:
                dst.write(destination)
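
After alignment it can be worth checking that two grids really share the same grid lines. The following is a minimal sketch of such a check for north-up grids only, written in pure Python; is_aligned and its tolerance are assumptions of mine rather than anything Rasterio provides.

```python
def is_aligned(coeffs, ref_coeffs, tol=1e-6):
    """True if a north-up grid has the reference pixel size and grid lines."""
    a, b, c, d, e, f = coeffs
    ra, rb, rc, rd, re_, rf = ref_coeffs
    if any(abs(v) > tol for v in (b, d, rb, rd)):
        raise ValueError("only north-up grids are handled in this sketch")
    same_size = abs(a - ra) <= tol and abs(e - re_) <= tol
    # The origin offset, measured in reference pixels, must be a whole number
    col_shift = (c - rc) / ra
    row_shift = (f - rf) / re_
    whole_shift = (abs(col_shift - round(col_shift)) <= tol
                   and abs(row_shift - round(row_shift)) <= tol)
    return same_size and whole_shift


ref = (30.0, 0.0, 500000.0, 0.0, -30.0, 4500000.0)
print(is_aligned((30.0, 0.0, 500300.0, 0.0, -30.0, 4499700.0), ref))
```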

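4. Field Experience and Troubleshooting

4.1 Common Problems and Solutions

Problem 1: implausible transform parameters

Symptom: coordinate conversions give obviously wrong results, or the raster shows up in the wrong place.

Checklist:

Confirm that a and e have the right signs (usually a > 0 and e < 0)
Check that b and d are close to 0 (unless rotation is intended)
Verify that the upper-left corner coordinates are plausible

def validate_transform(transform):
    # North-up rasters have a positive pixel width and a negative pixel height
    if transform.a <= 0 or transform.e >= 0:
        raise ValueError("Unexpected resolution parameters: a should be positive and e negative")
    # Non-zero b or d means the grid is rotated or sheared
    if abs(transform.b) > 1e-6 or abs(transform.d) > 1e-6:
        print("Warning: non-zero rotation parameters detected")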

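Problem 2: position offset after resampling

Cause: the updated transform does not match the resampled data, typically because the output dimensions were rounded while the transform was scaled by the nominal factor.

Solutions:

Derive the new transform from the actual output dimensions, then resample to exactly that shape
For integer-multiple resampling, use exact scale factors

Problem 3: aligning data across UTM zones

Challenge: each UTM zone has its own coordinate reference system.

Solutions:

Reproject everything to a single CRS
Use rasterio.warp.calculate_default_transform to compute the output transform and dimensions in the target CRS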
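
When data spans two zones, one common choice for the single target CRS is the UTM zone that contains the center of the study area. The pure-Python sketch below computes that zone and its WGS 84 EPSG code from a longitude and latitude. The function names are hypothetical, and the special zone boundaries around Norway and Svalbard are deliberately ignored.

```python
import math


def utm_zone(lon):
    """Standard 6-degree UTM zone number (1 to 60) for a longitude in degrees."""
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    return min(zone, 60)  # lon == 180 belongs to zone 60


def utm_epsg(lon, lat):
    """EPSG code of the WGS 84 / UTM zone: 326xx north, 327xx south."""
    base = 32600 if lat >= 0 else 32700
    return base + utm_zone(lon)


print(utm_epsg(116.4, 39.9))
```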

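4.2 Performance Tips

Batch coordinate conversion: for many coordinates, use array operations instead of loops:

import numpy as np

cols = np.array([100, 101, 102])
rows = np.array([200, 201, 202])
xs, ys = transform * (cols, rows)

Cache the inverse transform: if an inverse is used repeatedly, compute it once up front:

inv_transform = ~transform
col, row = inv_transform * (x, y)

Windowed reads: when processing large files, read by window to reduce memory use.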
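
For windowed reads, the raster first has to be split into tiles. A minimal sketch of that tiling logic in pure Python follows; iter_tiles and the 512-pixel default are assumptions for illustration.

```python
def iter_tiles(width, height, tile_size=512):
    """Yield (col_off, row_off, tile_width, tile_height) covering the raster."""
    for row_off in range(0, height, tile_size):
        # Edge tiles are clipped to the raster size
        tile_height = min(tile_size, height - row_off)
        for col_off in range(0, width, tile_size):
            tile_width = min(tile_size, width - col_off)
            yield col_off, row_off, tile_width, tile_height


tiles = list(iter_tiles(1000, 600, tile_size=512))
print(tiles)
```

Each tuple follows the Window(col_off, row_off, width, height) argument order used in section 3.1. Rasterio can also iterate over a file's native blocks with src.block_windows(), which may read more efficiently for tiled files.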

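4.3 Lessons from a Real Project

In a recent ecological monitoring project, we needed to integrate:

Landsat data at 30 m resolution
Sentinel-2 data at 10 m resolution
Drone imagery at 0.5 m resolution

With a carefully designed chain of transforms, we achieved:

All data aligned to the same grid
Geometric accuracy of the original data preserved
Support for multi-scale analysis

The key steps were:

Choosing a suitable base resolution (10 m)
Using bilinear resampling for smooth transitions
Building an exact transform chain for each data source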
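
One way to define such a shared grid is to snap the study-area bounds outward to whole multiples of the base resolution, so every source can be resampled onto identical grid lines. The following pure-Python sketch shows that idea; it is an assumption about how the common grid might be set up, not the project's actual code, and the helper names are mine.

```python
import math


def snap_bounds(left, bottom, right, top, resolution):
    """Expand bounds outward to the nearest multiples of the resolution."""
    return (
        math.floor(left / resolution) * resolution,
        math.floor(bottom / resolution) * resolution,
        math.ceil(right / resolution) * resolution,
        math.ceil(top / resolution) * resolution,
    )


def grid_shape(left, bottom, right, top, resolution):
    """Pixel counts (width, height) of a grid covering the snapped bounds."""
    width = round((right - left) / resolution)
    height = round((top - bottom) / resolution)
    return width, height


snapped = snap_bounds(500003, 4499012, 500998, 4499996, 10)
print(snapped, grid_shape(*snapped, 10))
```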

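5. Extension: Creating a Regular Grid

Sometimes we need to create a regular georeferenced grid from scratch:

from affine import Affine

def create_grid(left, top, dx, dy, nx, ny):
    """Create a regular georeferenced grid.
    
    Parameters:
        left: x coordinate of the upper-left corner
        top: y coordinate of the upper-left corner
        dx: resolution in the x direction
        dy: resolution in the y direction (pass a positive number; the sign is handled internally)
        nx: number of pixels in the x direction
        ny: number of pixels in the y direction
    
    Returns:
        transform: the affine transform
        bounds: dictionary with the grid bounds and size
    """
    transform = Affine(dx, 0.0, left,
                       0.0, -dy, top)  # note that dy is negated
    
    bounds = {
        'left': left,
        'right': left + nx * dx,
        'bottom': top - ny * dy,
        'top': top,
        'width': nx,
        'height': ny
    }
    
    return transform, bounds

Usage example:

# Create a 1 km grid
transform, bounds = create_grid(
    left=500000, top=4500000,
    dx=1000, dy=1000,
    nx=100, ny=80
)

print(f"Grid extent: {bounds}")
print(f"Transform matrix:\n{transform}")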

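6. Working with Ground Control Points (GCPs)

For non-regular transforms or unrectified imagery, ground control points can be used:

import rasterio
from rasterio.control import GroundControlPoint

# Build the list of GCPs
gcps = [
    GroundControlPoint(row=0, col=0, x=500000, y=4500000, z=0),
    GroundControlPoint(row=0, col=1000, x=503000, y=4500000, z=0),
    GroundControlPoint(row=800, col=0, x=500000, y=4497600, z=0),
    GroundControlPoint(row=800, col=1000, x=503000, y=4497600, z=0)
]

# Write a copy of the data georeferenced by GCPs
with rasterio.open('ortho.tif') as src:
    profile = src.profile.copy()
    # Georeference by GCPs instead of an affine transform
    profile.pop('transform', None)
    
    with rasterio.open('with_gcps.tif', 'w', **profile) as dst:
        # The second element is the CRS of the GCP coordinates, not a description
        # (assumes the GCPs are expressed in the source dataset's CRS)
        dst.gcps = (gcps, src.crs)
        dst.write(src.read())

Pro tip: when working with GCPs, consider using rasterio.warp.reproject for precise rectification. It usually handles complex distortion better than a simple affine transform.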
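
When the image has little distortion, a plain affine transform fitted to the GCPs by least squares can still be a useful first check. Rasterio provides rasterio.transform.from_gcps for this; the sketch below is an independent pure-Python version, with fit_affine as a hypothetical name, that takes GCPs as (col, row, x, y) tuples.

```python
def fit_affine(gcps):
    """Least-squares affine fit. gcps: list of (col, row, x, y) tuples."""
    n = len(gcps)
    if n < 3:
        raise ValueError("at least three GCPs are required")
    mean_col = sum(g[0] for g in gcps) / n
    mean_row = sum(g[1] for g in gcps) / n
    # Centered sums for the 2x2 normal equations
    scc = sum((g[0] - mean_col) ** 2 for g in gcps)
    srr = sum((g[1] - mean_row) ** 2 for g in gcps)
    scr = sum((g[0] - mean_col) * (g[1] - mean_row) for g in gcps)
    det = scc * srr - scr * scr
    if det == 0:
        raise ValueError("GCPs are collinear in pixel space")

    def solve(values):
        mean_v = sum(values) / n
        scv = sum((g[0] - mean_col) * (v - mean_v) for g, v in zip(gcps, values))
        srv = sum((g[1] - mean_row) * (v - mean_v) for g, v in zip(gcps, values))
        per_col = (scv * srr - srv * scr) / det
        per_row = (srv * scc - scv * scr) / det
        offset = mean_v - per_col * mean_col - per_row * mean_row
        return per_col, per_row, offset

    a, b, c = solve([g[2] for g in gcps])
    d, e, f = solve([g[3] for g in gcps])
    return (a, b, c, d, e, f)


# The four GCPs from the example above, as (col, row, x, y)
points = [
    (0, 0, 500000, 4500000),
    (1000, 0, 503000, 4500000),
    (0, 800, 500000, 4497600),
    (1000, 800, 503000, 4497600),
]
print(fit_affine(points))
```

Large residuals between the fitted transform and the GCPs are a sign that the distortion is not affine and that a GCP-based warp is the better option.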