微信公众号搜"智元新知"关注
微信扫一扫可直接关注哦!

从每个象限获得最近点的快速方法

如何解决从每个象限获得最近点的快速方法

我想快速(例如

每次搜索的输入:

  • 二维空间中的一组固定点(图像中的黑点)。这在搜索中保持不变,因此可以存储在有效的数据结构中。 1 到 1000 之间的点数。
  • 每次搜索的位置都不同的点(图像中的红点)。抽象地说,这将 2D 空间划分为 4 个象限(由图像中的红线分隔),就像笛卡尔坐标中的原点一样。

每次搜索输出

  • 来自每个红色象限中最接近红点的黑点(在图像中用蓝色圈出)。

Two searches of nearest neighbors per quadrant

输出通常应该是 4 个点,每个象限一个。但也有一些可能的边缘情况:

  • 一个没有黑点的象限;该象限不得分
  • 一个象限有多个联系;收集这个象限的所有这些点
  • 同一点位于象限的边界(红线)上:不要多次收集该点

我知道行不通的事情:

  • 仅沿一个轴的最近点,例如 x,可能非常远(图中绿色圆圈)
  • 一个象限中第二个最近的点(图中紫色圆圈)可能比其他象限中的点更近。对红点的 4 个最近邻点的简单搜索将收集到我不想要的这个点。

编辑:接受答案的代码版本和时间

def trial1(quadneighbori,printout = False,seedn = None,Npts = 1000,Niter = 1000):

    if seedn != None: np.random.seed(seedn) # random seed

    # Generate Npts (x,y) coordinates where x and y are standard normal
    dataset = np.random.randn(Npts,2)

    for n in range(Niter):
        # Generate random pixel (x,y) coordinates where x and y are standard normal
        red = np.random.randn(1,2)
        dst,i = quadneighbori(dataset,red)
        if printout: print(dst,i)

def quadneighbor1(dataset,red):
    dst = np.zeros(4)
    closest = np.zeros((4,2))

    # Work out a Boolean mask for the 4 quadrants
    right_exclu = dataset[:,0] > red[0,0]
    top_exclu =   dataset[:,1] > red[0,1]
    Q1 = np.logical_and( top_exclu,right_exclu)
    Q2 = np.logical_and(~top_exclu,right_exclu)
    Q3 = np.logical_and(~top_exclu,~right_exclu)
    Q4 = np.logical_and( top_exclu,~right_exclu)
    Qs = [Q1,Q2,Q3,Q4]

    for Qi in range(4):
        # Boolean mask to select points in this quadrant
        thisquad = dataset[Qs[Qi]]
        if len(thisquad)==0: continue # if no points,move on to next quadrant
        # Calculate distance of each point in dataset to red point
        distances = cdist(thisquad,red,'sqeuclidean')
        # Choose nearest
        idx = np.argmin(distances)
        dst[Qi] = distances[idx]
        closest[Qi] = thisquad[idx]

    return dst,closest

# numba turns 1.53s trial1 to 4.12ms trial1
@nb.jit(nopython=True,fastmath=True)
def quadneighbor2(dataset,red):
   redX,redY = red[0,0],red[0,1]
   # distance to and index of nearest point in each quadrant
   dst = np.zeros(4) + 1.0e308           # Start large! Update with something smaller later
   idx = np.zeros(4,dtype=np.uint32)
   for i in nb.prange(dataset.shape[0]):
      # Get this point's x,y
      x,y = dataset[i,dataset[i,1]
      # Get quadrant of this point (minus 1)
      if x>redX:
         if y>redY:
             Qi = 0
         else:
             Qi = 1
      else:
         if y>redY:
             Qi = 3
         else:
             Qi = 2

      # Get distance (squared) of this point - square root is slow and unnecessary
      d = (x-redX)*(x-redX) + (y-redY)*(y-redY)
      # Update if nearest
      if d<dst[Qi]:
         dst[Qi] = d
         idx[Qi] = i

   return dst,idx
%timeit trial1(quadneighbor1)
111 ms ± 3.21 ms per loop (mean ± std. dev. of 7 runs,10 loops each)

%timeit trial1(quadneighbor2)
4.12 ms ± 79.6 µs per loop (mean ± std. dev. of 7 runs,100 loops each)

PS:cdistnumba 不兼容,但调整该行以删除 cdist 并将 @nb.jit 添加quadneighbor1 可加快 trial1(quadneighbor1)从 111 毫秒到 30.1 毫秒。

解决方法

更新答案

我已将原始答案修改为在 numba 下运行。现在它可以在 3 微秒内处理 1,000 个点的数据集 - 与最初的 1 毫秒左右相比,速度要快得多。

#!/usr/bin/env python3

import random,sys
import numpy as np
import numba as nb

@nb.jit(nopython=True,fastmath=True)
def QuadNearestNeighbour(dataset,redX,redY):
   # Distance to and index of nearest point in each quadrant
   dst = np.zeros(4) + 1.0e308           # Start large! Update with something smaller later
   idx = np.zeros(4,dtype=np.uint32)
   for i in nb.prange(dataset.shape[0]):
      # Get this point's x,y
      x,y = dataset[i,0],dataset[i,1]
      # Get quadrant of this point (minus 1)
      if x>redX:
         if y>redY:
             Q = 0
         else:
             Q = 1
      else:
         if y>redY:
             Q = 3
         else:
             Q = 2

      # Get distance (squared) of this point - square root is slow and unnecessary
      d = (x-redX)*(x-redX) + (y-redY)*(y-redY)
      # Update if nearest
      if d<dst[Q]:
         dst[Q] = d
         idx[Q] = i

   return  dst,idx

# Number of points
Npts = 1000

# Generate the Npts random X,Y points
dataset = np.random.rand(Npts,2)

# Generate random red pixel (X,Y)
redX,redY = random.random(),random.random()

res = QuadNearestNeighbour(dataset,redY)
print(res)

原答案

对于 1,000 个点,这在 1 毫秒内运行 - 为红点在 1,000 多个不同位置计时。

这将生成一个包含 1,000 个点的单个数据集,然后生成 1,000 个红色中心并对每个中心进行 4 个象限。它在我的 MacBook 上运行不到一秒,所以每个红色中心为 1 毫秒。

我相信可以通过去掉无关的 print 语句来进一步优化它,而不是计算出仅用于说明的东西,也许使用 numba 但这足以一天。

我没有添加大量错误检查或边缘情况,这不是生产质量代码。

#!/usr/bin/env python3

import random
import numpy as np
from scipy.spatial.distance import cdist

# Let's get some repeatable randomness
np.random.seed(42)

# Number of points,number of iterations
Npts,Niter = 1000,1000

# Generate the Npts random X,2)

# Run lots of iterations for better timing
for n in range(Niter):

   # Generate random red pixel (X,Y)
   red = np.random.rand(1,2)

   # Work out a Boolean mask for the 4 quadrants
   above = dataset[:,0]>red[0,0]
   right = dataset[:,1]>red[0,1]
   Q1 = np.logical_and(above,right)       # top-right
   Q2 = np.logical_and(above,~right)      # top-left
   Q3 = np.logical_and(~above,~right)     # bottom-left
   Q4 = np.logical_and(~above,right)      # bottom-right

   Q = 1
   for quadrant in [Q1,Q2,Q3,Q4]:
       print(f'Quadrant {Q}: ')
       # Boolean mask to select points in this quadrant
       thisQuad = dataset[quadrant]
       l = len(thisQuad)
       if l==0:
           print('No points in quadrant')
           continue
       # print(f'nPoints in quadrant: {l}')
       # print(f'Points: {dataset[quadrant]}')
       # Calculate distance of each point in dataset to red point
       distances = cdist(thisQuad,red,'sqeuclidean')
       # Choose nearest
       idx = np.argmin(distances)
       print(f'Index: {idx},point: {thisQuad[idx]},distance: {distances[idx]}')
       Q += 1

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。