Fast and robust image stitching algorithm for many images in Python?

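I have a stationary camera that rapidly takes photos of a continuously moving product, always from the same fixed position and angle (translation perspective). I need to stitch all the images into one panoramic picture. I have tried the Stitcher class. It worked, but the computation took a long time. I also tried another approach: using the SIFT detector with FlannBasedMatcher, finding the homography, and then warping the images. This works fine if I use only two images. For multiple images it still fails to stitch them properly. Does anyone know the best and fastest image stitching algorithm for this case?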

Here is my code, which uses the Stitcher class.

import time
import cv2
import os
import numpy as np
import sys

def main():
    # read input images
    imgs = []
    path = 'pics_rotated/'
    i = 0
    for (root,dirs,files) in os.walk(path):
        images = [f for f in files]
        print(images)
        for i in range(0,len(images)):
            curImg = cv2.imread(path + images[i])
            imgs.append(curImg)

    stitcher = cv2.Stitcher.create(mode= 0)
    status,result = stitcher.stitch(imgs)
    if status != cv2.Stitcher_OK:
        print("Can't stitch images,error code = %d" % status)
        sys.exit(-1)
    cv2.imwrite("imagesout/output.jpg",result)
    cv2.waitKey(0)


if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print("Time --->>>>>",end - start)
    cv2.destroyAllWindows()

Solution

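Briefing

Although the OpenCV Stitcher class provides lots of methods and options to perform stitching, I find it hard to use because of its complexity. Therefore, I will try to provide the minimal and fastest way to perform stitching. If you are wondering about more sophisticated approaches such as exposure compensation, I highly recommend that you look at the detailed sample code. As a side note, I would be grateful if someone could convert the following functions to use the Stitcher class.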

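Introduction

In order to combine multiple images into the same perspective, the following operations are needed:

Detect and match features.
Compute the homography (the perspective transform between frames).
Warp one image onto the other perspective.
Combine the base and warped images while keeping track of the shift in origin.
Given the combination pattern, stitch multiple images.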

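Feature detection and matching

What are features? They are distinguishable parts, like the corners of a square, that are preserved across images. Various algorithms have been proposed to obtain these feature points, such as Harris, ORB, SIFT and SURF. See cv::Feature2D for the full list. I will use SIFT because it is accurate and sufficiently fast.

A feature consists of a KeyPoint, which is the location in the image, and a descriptor, which is a set of numbers (e.g. a 128-dimensional vector) that represents the properties of the feature.

After finding distinct points in the images, we need to match the corresponding point pairs. See cv::DescriptorMatcher. I will use the Flann-based descriptor matcher.

First, we initialize the descriptor and matcher classes.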

import cv2 as cv
import numpy as np
descriptor = cv.SIFT_create()
matcher = cv.DescriptorMatcher_create(cv.DescriptorMatcher_FLANNBASED)

Then, we find the features in each image.

(kps,desc) = descriptor.detectAndCompute(image,mask=None)

Now we find the corresponding point pairs.

rawMatch = []
if (desc1 is not None and desc2 is not None and len(desc1) >= 2 and len(desc2) >= 2):
    # query descriptors come from image 2, train descriptors from image 1
    rawMatch = matcher.knnMatch(desc2, desc1, k=2)
matches = []
# ensure the distance is within a certain ratio of each other (i.e. Lowe's ratio test)
ratio = 0.75
for m in rawMatch:
    if len(m) == 2 and m[0].distance < m[1].distance * ratio:
        # (keypoint index in image 1, keypoint index in image 2)
        matches.append((m[0].trainIdx, m[0].queryIdx))
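To make the ratio test concrete without OpenCV, here is a minimal NumPy sketch. It uses brute-force search instead of FLANN's approximate search, and the function name ratio_test_matches and the toy 2-D descriptors are assumptions made up for illustration (real SIFT descriptors are 128-dimensional).

```python
import numpy as np

def ratio_test_matches(desc_query, desc_train, ratio=0.75):
    """Brute-force 2-nearest-neighbour matching followed by Lowe's ratio test.

    Returns (trainIdx, queryIdx) pairs, the same order used in the loop above.
    """
    matches = []
    if len(desc_query) == 0 or len(desc_train) < 2:
        return matches
    # pairwise Euclidean distances, shape (n_query, n_train)
    diff = desc_query[:, None, :] - desc_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    for q, row in enumerate(dist):
        best, second = np.argsort(row)[:2]
        # keep the match only if the best neighbour is clearly closer than the runner-up
        if row[best] < ratio * row[second]:
            matches.append((int(best), q))
    return matches

# toy descriptors (hypothetical data)
train = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
query = np.array([[0.1, 0.0],   # unambiguous: very close to train[0]
                  [5.0, 0.0]])  # ambiguous: equally far from train[0] and train[1]
matches = ratio_test_matches(query, train)
print(matches)
```

The second query point is rejected because its two nearest neighbours are equally distant, which is exactly the kind of ambiguous match the ratio test is meant to drop.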

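Homography calculation

A homography is the perspective transformation from one view to another. Parallel lines in one view may not be parallel in the other, like a road running toward the sunset. We need at least 4 corresponding point pairs. Having more gives redundant data that must be resolved or eliminated.

The homography matrix transforms a point in the initial view to its warped position. It is a 3x3 matrix computed by the Direct Linear Transform algorithm. It has 8 DoF, and the last element of the matrix is 1.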

[pt2] = H * [pt1]
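The relation above is in homogeneous coordinates: the point is extended with a 1, multiplied by H, and the result is divided by its third component. A minimal NumPy sketch, where the helper name apply_homography and both example matrices are hypothetical:

```python
import numpy as np

def apply_homography(H, pt):
    """Map a 2-D point through a 3x3 homography using homogeneous coordinates."""
    x, y = pt
    xh, yh, w = H @ np.array([x, y, 1.0])
    # divide by the third coordinate to return to pixel coordinates
    return (xh / w, yh / w)

# hypothetical homography: pure translation by (30, -5)
H_shift = np.array([[1.0, 0.0, 30.0],
                    [0.0, 1.0, -5.0],
                    [0.0, 0.0, 1.0]])
# hypothetical homography with a perspective term in the last row
H_persp = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.001, 0.0, 1.0]])

p_shift = apply_homography(H_shift, (100.0, 50.0))
p_persp = apply_homography(H_persp, (100.0, 50.0))
print(p_shift, p_persp)
```

With the translation matrix the division is by 1 and has no effect. With a non-zero entry in the last row the divisor depends on the point, which is what makes parallel lines converge.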

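Now that we have the corresponding point matches, we compute the homography. The method we use to handle the redundant data is RANSAC, which randomly selects 4 point pairs and keeps the best-fitting result. See cv::findHomography for more options.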

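if len(matches) > 4:
    pts1 = np.float32([kps1[i].pt for (i, _) in matches])  # kps1: keypoints of image 1
    pts2 = np.float32([kps2[j].pt for (_, j) in matches])  # kps2: keypoints of image 2
    (H, status) = cv.findHomography(pts1, pts2, cv.RANSAC)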
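For readers curious about what the Direct Linear Transform does, here is a rough NumPy sketch. It is not what cv.findHomography runs internally: it has no RANSAC outlier rejection and no coordinate normalization, and the function name homography_dlt and the test matrix H_true are assumptions for illustration.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3) with dst ~ H * src from N >= 4 point pairs via plain DLT."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    assert src.shape == dst.shape and src.shape[0] >= 4
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # each pair contributes two linear equations in the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # the solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # fix the scale so that the last element is 1, leaving 8 degrees of freedom
    return H / H[2, 2]

# hypothetical ground-truth homography used to generate exact correspondences
H_true = np.array([[1.0, 0.1, 20.0],
                   [0.05, 1.0, -10.0],
                   [0.0005, 0.0, 1.0]])
src = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
src_h = np.hstack([src, np.ones((4, 1))]) @ H_true.T
dst = src_h[:, :2] / src_h[:, 2:3]

H_est = homography_dlt(src, dst)
print(np.round(H_est, 4))
```

With exactly 4 noise-free pairs the estimate reproduces the matrix that generated them. With real, noisy matches the outlier handling of RANSAC becomes essential.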

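Perspective warping

By calculating the homography, we know which point in the source image corresponds to which point in the destination image. In order not to lose information from the source image, we need to pad the destination image by the amount that the transformed points fall into the negative region. At the same time, we need to keep track of the shift of the origin for stitching multiple images.

Helper functions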

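# find the ROI of a transformation result
def warpRect(rect, H):
    x, y, w, h = rect
    # all four corners of the rectangle, shaped (N, 1, 2) as OpenCV expects
    corners = np.float32([[x, y], [x, y + h - 1], [x + w - 1, y], [x + w - 1, y + h - 1]]).reshape(-1, 1, 2)
    # perspectiveTransform applies H and divides by the homogeneous coordinate
    extremum = cv.perspectiveTransform(corners, np.asarray(H, dtype=np.float64)).reshape(-1, 2)
    minx, miny = np.min(extremum[:, 0]), np.min(extremum[:, 1])
    maxx, maxy = np.max(extremum[:, 0]), np.max(extremum[:, 1])
    xo = int(np.floor(minx))
    yo = int(np.floor(miny))
    wo = int(np.ceil(maxx - minx))
    ho = int(np.ceil(maxy - miny))
    outrect = (xo, yo, wo, ho)
    return outrect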

# homography matrix is translated to fit in the screen
def coverH(rect, H):
    # obtain bounding box of the result
    x, y, _, _ = warpRect(rect, H)
    # shift amount to the first quadrant
    xpos = int(-x if x < 0 else 0)
    ypos = int(-y if y < 0 else 0)
    # correct the homography matrix so that no point is thrown out
    T = np.array([[1, 0, xpos], [0, 1, ypos], [0, 0, 1]])
    H_corr = T.dot(H)
    return (H_corr, (xpos, ypos))

# pad image to cover ROI, return the shift amount of origin
def addBorder(img, rect):
    x, y, w, h = rect
    tl = (x, y)
    br = (x + w, y + h)
    top = int(-tl[1] if tl[1] < 0 else 0)
    bottom = int(br[1] - img.shape[0] if br[1] > img.shape[0] else 0)
    left = int(-tl[0] if tl[0] < 0 else 0)
    right = int(br[0] - img.shape[1] if br[0] > img.shape[1] else 0)
    img = cv.copyMakeBorder(img, top, bottom, left, right, cv.BORDER_CONSTANT, value=[0, 0, 0])
    orig = (left, top)
    return img, orig

# convert an image shape (height, width, ...) to a rect (x, y, w, h)
def size2rect(size):
    return (0, 0, size[1], size[0])
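The idea behind warpRect and coverH can be checked with plain NumPy: project the image corners, see how far they fall into the negative region, and prepend a translation that moves them back. The sketch below is a simplified stand-in, not the helpers themselves. The names warped_bounds and shift_into_view and the example homography are hypothetical.

```python
import numpy as np

def warped_bounds(w, h, H):
    """Axis-aligned bounds (minx, miny, maxx, maxy) of a w x h image warped by H."""
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [0, h - 1, 1], [w - 1, h - 1, 1]], dtype=float)
    proj = corners @ H.T
    # back from homogeneous to pixel coordinates
    proj = proj[:, :2] / proj[:, 2:3]
    return (proj[:, 0].min(), proj[:, 1].min(), proj[:, 0].max(), proj[:, 1].max())

def shift_into_view(w, h, H):
    """Prepend a translation so that no warped corner has a negative coordinate."""
    minx, miny, _, _ = warped_bounds(w, h, H)
    xpos = int(-np.floor(minx)) if minx < 0 else 0
    ypos = int(-np.floor(miny)) if miny < 0 else 0
    T = np.array([[1.0, 0.0, xpos],
                  [0.0, 1.0, ypos],
                  [0.0, 0.0, 1.0]])
    return T @ H, (xpos, ypos)

# hypothetical homography: shift 40 px to the left and 25 px down
H_example = np.array([[1.0, 0.0, -40.0],
                      [0.0, 1.0, 25.0],
                      [0.0, 0.0, 1.0]])
H_corr, shift = shift_into_view(200, 100, H_example)
bounds = warped_bounds(200, 100, H_corr)
print(shift, bounds)
```

The returned shift is the amount by which the origin moved, which is the value that has to be carried along when further images are stitched.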

Warping function

def warpImage(img, H):
    # tweak the homography matrix to move the result to the first quadrant
    H_cover, pos = coverH(size2rect(img.shape), H)
    # find the bounding box of the output
    x, y, w, h = warpRect(size2rect(img.shape), H_cover)
    width, height = x + w, y + h
    # warp the image using the corrected homography matrix
    warped = cv.warpPerspective(img, H_cover, (width, height))
    # make the external boundary solid black, useful for masking
    warped = np.ascontiguousarray(warped, dtype=np.uint8)
    gray = cv.cvtColor(warped, cv.COLOR_RGB2GRAY)
    _, bw = cv.threshold(gray, 1, 255, cv.THRESH_BINARY)
    # findContours returns 3 values in OpenCV 3 and 2 values otherwise
    # https://stackoverflow.com/a/55806272/12447766
    major = cv.__version__.split('.')[0]
    if major == '3':
        _, cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    else:
        cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    # draw the first external contour in black (skipped if nothing was found)
    if len(cnts) > 0:
        warped = cv.drawContours(warped, cnts, 0, [0, 0, 0], lineType=cv.LINE_4)
    return (warped, pos)

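Combining the warped image and the destination image

This is the step where visual enhancements such as exposure compensation are involved. To keep things simple, we will use mean value blending. The easiest solution would be to overwrite the existing data in the destination image, but the averaging operation is not a burden for us.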

# only the non-zero pixels are weighted to the average
def mean_blend(img1,img2):
    assert(img1.shape == img2.shape)
    locs1 = np.where(cv.cvtColor(img1,cv.COLOR_RGB2GRAY) != 0)
    blended1 = np.copy(img2)
    blended1[locs1[0],locs1[1]] = img1[locs1[0],locs1[1]]
    locs2 = np.where(cv.cvtColor(img2,cv.COLOR_RGB2GRAY) != 0)
    blended2 = np.copy(img1)
    blended2[locs2[0],locs2[1]] = img2[locs2[0],locs2[1]]
    blended = cv.addWeighted(blended1, 0.5, blended2, 0.5, 0)
    return blended

def warpPano(prevPano,img,H,orig):
    # correct homography matrix
    T = np.array([[1, 0, -orig[0]], [0, 1, -orig[1]], [0, 0, 1]])
    H_corr = H.dot(T)
    # warp the image and obtain shift amount of origin
    result,pos = warpImage(prevPano,H_corr)
    xpos,ypos = pos
    # zero pad the result
    rect = (xpos,ypos,img.shape[1],img.shape[0])
    result,_ = addBorder(result,rect)
    # mean value blending
    idx = np.s_[ypos : ypos + img.shape[0],xpos : xpos + img.shape[1]]
    result[idx] = mean_blend(result[idx],img)
    # crop extra paddings
    x, y, w, h = cv.boundingRect(cv.cvtColor(result, cv.COLOR_RGB2GRAY))
    result = result[y : y + h,x : x + w]
    # return the resulting image with shift amount
    return (result,(xpos - x,ypos - y))
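The blending rule used by mean_blend is: where only one image has data, keep it, and where both have data, average them. The following NumPy-only sketch shows that rule on a tiny example. It is an approximation of the function above, not a drop-in replacement: it treats a pixel as filled when any channel is non-zero (instead of checking the grayscale value) and uses integer division for the average. The name mean_blend_np and the pixel values are assumptions.

```python
import numpy as np

def mean_blend_np(img1, img2):
    """Average two images where both have data, otherwise keep whichever has data."""
    assert img1.shape == img2.shape
    # a pixel counts as "filled" if any channel is non-zero
    m1 = img1.any(axis=2)
    m2 = img2.any(axis=2)
    out = np.zeros_like(img1)
    only1 = m1 & ~m2
    only2 = m2 & ~m1
    both = m1 & m2
    out[only1] = img1[only1]
    out[only2] = img2[only2]
    # average in a wider type to avoid uint8 overflow
    avg = (img1.astype(np.uint16) + img2.astype(np.uint16)) // 2
    out[both] = avg[both].astype(np.uint8)
    return out

# 1 x 3 pixel images with 3 channels (hypothetical values)
a = np.array([[[100] * 3, [0] * 3, [200] * 3]], dtype=np.uint8)
b = np.array([[[0] * 3, [50] * 3, [100] * 3]], dtype=np.uint8)
blended = mean_blend_np(a, b)
print(blended[0, :, 0])
```

Because black pixels are treated as empty, genuinely black image content is also treated as empty by this rule, which is the same trade-off the masking in the functions above makes.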

Stitching multiple images given a combination pattern

# base image is the last image in each iteration
def blend_multiple_images(images,homographies):
    N = len(images)
    assert(N >= 2)
    assert(len(homographies) == N - 1)
    pano = np.copy(images[0])
    pos = (0,0)
    for i in range(N - 1):
        img = images[i + 1]
        # get homography matrix
        H = homographies[i]
        # warp pano onto image
        pano, pos = warpPano(pano, img, H, pos)
    return (pano,pos)

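The method above successively warps the previously combined image, called the pano, onto the next image. A pattern, however, may have junction points for the best stitching view.

For example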
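Warping the pano onto each next image in turn is equivalent to composing the pairwise homographies. As a rough illustration, assuming homographies[i] maps image i into the frame of image i + 1, the sketch below computes the transform from every image into the last one. The names translation and chain_to_last and the shift values are hypothetical. Pure translations are used because they are easy to check by hand.

```python
import numpy as np

def translation(tx, ty):
    """3x3 homography for a pure shift by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def chain_to_last(homographies):
    """Given H_i mapping image i into image i+1, return the maps from every image into the last."""
    n = len(homographies) + 1
    to_last = [np.eye(3) for _ in range(n)]
    # walk backwards: the map for image i is the map for image i+1 applied after H_i
    for i in range(n - 2, -1, -1):
        to_last[i] = to_last[i + 1] @ homographies[i]
    return to_last

# hypothetical pairwise shifts between three consecutive frames
hs = [translation(100.0, 0.0), translation(95.0, 2.0)]
to_last = chain_to_last(hs)
print(to_last[0][:2, 2])
```

Small errors in each pairwise estimate accumulate along the chain, which is one reason the choice of combination pattern matters when there are many images.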

1 2 3
4 5 6

The best pattern to combine these images is

1 -> 2 <- 3
     |
     V
4 -> 5 <- 6

Therefore, we need one last function to combine 1 & 2 with 2 & 3 at node 2, or 1235 with 456 at node 5.

from operator import sub

# no warping here,useful for combining two different stitched images
# the image at given origin coordinates must be the same
def patchPano(img1,img2,orig1=(0,0),orig2=(0,0)):
    # bottom right points
    br1 = (img1.shape[1] - 1,img1.shape[0] - 1)
    br2 = (img2.shape[1] - 1,img2.shape[0] - 1)
    # distance from orig to br
    diag2 = tuple(map(sub,br2,orig2))
    # possible pano corner coordinates based on img1
    extremum = np.array([(0, 0), br1, tuple(map(sum, zip(orig1, diag2))), tuple(map(sub, orig1, orig2))], dtype=np.int32)
    bb = cv.boundingRect(extremum)
    # patch img1 to img2
    pano,shift = addBorder(img1,bb)
    orig = tuple(map(sum, zip(orig1, shift)))
    idx = np.s_[orig[1] : orig[1] + img2.shape[0] - orig2[1],orig[0] : orig[0] + img2.shape[1] - orig2[0]]
    subImg = img2[orig2[1] : img2.shape[0],orig2[0] : img2.shape[1]]
    pano[idx] = mean_blend(pano[idx],subImg)
    return (pano,orig)
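The geometry of joining two partial panoramas at a shared point can be worked out with a few lines of arithmetic. The sketch below is a simplified variant and not the same as patchPano: it places both images in full on a common canvas, whereas patchPano copies only the part of img2 that lies to the right of and below its origin. The function name common_canvas and the sizes are assumptions for illustration.

```python
def common_canvas(shape1, shape2, orig1, orig2):
    """Canvas size and image offsets that make a shared point coincide.

    shape = (height, width); orig = (x, y) of the shared point inside each image.
    Returns ((height, width), offset1, offset2) with offsets given as (x, y).
    """
    # canvas position of the shared point: far enough from the top-left for both images
    ox = max(orig1[0], orig2[0])
    oy = max(orig1[1], orig2[1])
    off1 = (ox - orig1[0], oy - orig1[1])
    off2 = (ox - orig2[0], oy - orig2[1])
    width = max(off1[0] + shape1[1], off2[0] + shape2[1])
    height = max(off1[1] + shape1[0], off2[1] + shape2[0])
    return (height, width), off1, off2

# two 300 x 100 strips; the shared point is at x=200 in the first and x=0 in the second
canvas, off1, off2 = common_canvas((100, 300), (100, 300), (200, 0), (0, 0))
print(canvas, off1, off2)
```

Once the offsets are known, each image can be copied into its slice of the canvas and the overlapping region blended with the same mean rule as before.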

For a quick demo, you can run the Python code on GitHub. If you want to use the methods above in C++, you can have a look at the Stitch library. Any PR or edit to this post is welcome.