Suggestions for improving a function that parses a numpy array into a single value

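I wrote a function that aggregates a 2D integer numpy array of shape n x m into a 2D numpy array of shape 1 x 1, according to a selected method. How can I improve my function for speed/performance?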

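The methods are:

min: returns the minimum value
max: returns the maximum value
median: returns the most frequently occurring value (strictly speaking this is the mode, but the method is called 'median' in the code)
priority: returns the specified priority value if its number of occurrences in the array is at least the threshold th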
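
To make the 'median' and 'priority' rules concrete, here is a small sketch on a toy array. The array and the variable names (most_common, prio_count, result) are made up for illustration and are not part of my function.

```python
import numpy as np

# Toy 2 x 3 input; the values are chosen only for illustration.
arr = np.array([[1, 2, 2],
                [3, 2, 1]])

# Distinct values and how often each one occurs.
vals, counts = np.unique(arr, return_counts=True)

# 'median' in the sense used here: the most frequently occurring value.
most_common = vals[counts.argmax()]

# 'priority': prefer prio when it occurs at least th times,
# otherwise fall back to the most frequent value.
prio, th = 1, 2
prio_count = counts[vals == prio].sum()  # 0 if prio is absent
result = prio if prio_count >= th else most_common
```

Here the value 2 is the most frequent, but 1 occurs twice, so with th = 2 the priority rule picks 1.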

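Other requirements:

If all values in the input are the same, return that value
An ignore value can be provided; it is masked out for the methods, but not for the requirement above.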
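
These two requirements could be sketched as small helpers. This is only one possible approach; the names all_same and drop_ignored are hypothetical, and whether they are actually faster than a std() check or a masked array would need to be timed on real data.

```python
import numpy as np

def all_same(arr):
    # Hypothetical helper: min equals max only when every element is
    # identical, so no floating-point statistics are needed.
    return arr.min() == arr.max()

def drop_ignored(arr, ignore):
    # Hypothetical helper: boolean indexing keeps the values that differ
    # from `ignore` and returns them as a flat 1-D array.
    return arr[arr != ignore]
```

Note that the all-same check runs on the full array before any values are dropped, so it also applies when the repeated value equals the ignore value.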

My current implementation:

import numpy as np

def array2val(arr,method,dt,prio=None,th=None,ignore=None):
    """
    Parse a Numpy array to a single output value based on method. Useful for aggregation
    :param arr: 2D numpy array
    :param method: [sum,min,max,median,priority]. priority means to give priority to a value if it occurs >= a threshold
    :param dt: datatype of output array
    :param prio: the value to be prioritized if method == priority
    :param th: occurrence threshold for the priority value. Return median if threshold is not reached
    :param ignore: value to ignore in all methods
    :return: 2D numpy array with shape (1,1) with value following above,unless the input array has all same values,then return that value. This trumps ignore values
    """

    # All values are the same,return this value
    if arr.std() == 0:
        return np.array([[arr[0,0]]]).astype(dt)

    # Mask away ignored values if requested
    if ignore is not None:
        arr = np.ma.array(arr, mask=(arr == ignore))  # the comparison already yields a boolean mask
        v,c = np.unique(arr,return_counts=True)
        vals = v.data[~v.mask]  # Values with ignore value removed
        counts = c[~v.mask]     # Counts with ignore value removed
    else:
        vals,counts = np.unique(arr,return_counts=True)

    if method == 'median':
        out = vals[counts.argmax()]
        return np.array([[out]]).astype(dt)

    elif method == 'priority':
        # .sum() gives 0 when prio is absent, which avoids truth-testing an empty array
        if counts[vals == prio].sum() >= th:  # priority value is in the array and reaches the threshold
            return np.array([[prio]]).astype(dt)
        else:  # priority value does not reach the threshold or is not in the array at all
            out = vals[counts.argmax()]  # default to most occurring value
            return np.array([[out]]).astype(dt)

    elif method == 'sum':
        return np.array([[arr.sum()]]).astype(dt)

    elif method == 'min':
        return np.array([[arr.min()]]).astype(dt)

    elif method == 'max':
        return np.array([[arr.max()]]).astype(dt)

    else:
        raise Exception('Invalid method for aggregation')
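
One possible direction for a leaner version is sketched below. It is an untimed sketch, not a benchmarked answer: the name array2val_sketch is hypothetical, it assumes an integer input array, and it swaps the masked array for plain boolean indexing and the std() check for a direct equality test.

```python
import numpy as np

def array2val_sketch(arr, method, dt, prio=None, th=None, ignore=None):
    """Hypothetical rework of array2val: same arguments, same (1, 1) result."""
    first = arr.flat[0]
    # All values identical: return that value, even if it equals `ignore`.
    if (arr == first).all():
        return np.array([[first]]).astype(dt)

    # Drop ignored values with boolean indexing (result is 1-D).
    # At least two distinct values exist here, so `data` is never empty.
    data = arr[arr != ignore] if ignore is not None else arr.ravel()

    if method in ('median', 'priority'):
        # np.unique is only needed for the frequency-based methods.
        vals, counts = np.unique(data, return_counts=True)
        out = vals[counts.argmax()]  # most frequent value
        # .sum() is 0 when prio does not occur at all.
        if method == 'priority' and counts[vals == prio].sum() >= th:
            out = prio
    elif method == 'sum':
        out = data.sum()
    elif method == 'min':
        out = data.min()
    elif method == 'max':
        out = data.max()
    else:
        raise ValueError('Invalid method for aggregation')

    return np.array([[out]]).astype(dt)
```

For example, array2val_sketch(np.array([[1, 2, 2], [3, 2, 0]]), 'min', np.uint8, ignore=0) skips the 0 and returns the (1, 1) array holding 1. The main design choice is to call np.unique only for 'median' and 'priority', since sum, min and max do not need the value counts.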
