Using next in a list comprehension

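I'm trying to do something quite simple that I'm probably overcomplicating.

Here is the problem:

Say you live in a controlled economy where there is one baker in town who bakes a fixed number of loaves each day. The townspeople queue up to buy a loaf of bread (you can only buy one loaf).

There are more people in the queue than loaves. Everyone in the queue gets a ticket with their position in the queue to prevent queue jumping, and they line up in the same order every day (to keep it simple). The bread is ready at a different time each day, and some people in the queue need to get to work: if the bread isn't ready before they have to leave for work, they leave the queue and the next person in line takes their place. They still keep their original queue ticket, though. The values in the original list are the hour by which each person in the queue has to leave for work.

I want to know the number on the last ticket the baker is handed each day before he runs out of bread.

I can make my existing code work for a relatively small number of people, but with millions of people and lots of days (the planned economy plans five years ahead), you get the picture.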

def BakerQueue(loaves,people,bake_time):
    got_some_bread = []
    for b in bake_time:
        counter = 0
        for p in range(len(people)):
            if people[p] >= b:
                counter += 1
                if counter == loaves:
                    got_some_bread.append(p + 1)
                    counter = 0
                    break
                elif p == len(people) - 1:
                    got_some_bread.append(0)
                    break
            elif counter < loaves and p == len(people) - 1:
                got_some_bread.append(0)
                counter = 0
    return got_some_bread

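You can use this to run the code. In this example there are 3 loaves and 19 people in the list, with a different bake time for each day of the week, so on the first day tickets 1, 2 and 3 get bread, on the second day tickets 2, 3 and 4, and on the third day tickets 7, 9 and 15. I only care who gets the last loaf each day, which is what the function returns.

BakerQueue(3, [1, 4, 4, 3, 1, 2, 6, 1, 9, 4, 4, 3, 1, 2, 6, 9, 4, 5, 8], [1, 2, 5, 4, 5, 4, 7])

This returns, as expected:

[3, 4, 15, 7, 15, 7, 19]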

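Essentially, I want to prioritize a list by index and pop every value that is greater than some other value.

I have a list, my_list = [1, 4, 4, 3, 1, 2, 6], and I want to keep its index priority, so I enumerated both index and value into a new list:

my_list_of_tuples = [(i,j) for i,j in enumerate(my_list)]

This gives me: [(0, 1), (1, 4), (2, 4), (3, 3), (4, 1), (5, 2), (6, 6)]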

I then turn this into a heap:

heapq.heapify(my_list_of_tuples)

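Now I want to check whether the value at the top of the heap is greater than an iterated constant, taken from a separate list that I am iterating over. If it is, I want to pop it from the heap with heapq.heappop(my_list_of_tuples).

I thought the code for this would look like the following, but it doesn't work. How can I access the value at the top of the heap? I want to write something like this: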

    counter = 0
    while counter <= static_constant:
        if next([v[1] for v in my_list_of_tuples]) < iterated_constant:
            heapq.heappop(my_list_of_tuples)
        else:
            counter += 1

I'd appreciate any help on how to handle the list comprehension / generator. Thanks.

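Solution

I think I understand your problem.

Problem description

Given:

num_items - the number of available items
targets - a list of potential targets, each of which has a value
threshold - a cut-off limit

Task:

Select the first num_items elements of targets whose value is greater than or equal to threshold.
Return the array index (starting at 1) of the last selected element in targets, or 0 if there are not enough matching targets. (An odd decision; I would have gone with 0-based indices and returned len(targets) if nothing is found, but fine.)
Optimize for speed. targets and num_items are the same every time; threshold is the only value that changes.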

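Example

num_items = 3
targets = [5, 3, 4, 1, 3, 3, 7, 4]
threshold = 4

The selected targets are the ones at positions [0, 2, 6], with the values [5, 4, 7], because those are the first 3 values that are greater than or equal to threshold. We are only looking for the index of the last one, which in this case is 6 (0-based; in the 1-based convention of the task that is ticket 7).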
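The selection rule above can also be written with a generator expression and next, which is closer to what the question was reaching for. This is only a sketch: last_selected_index is a hypothetical name, the sample list is illustrative, and it uses the 0-based convention with len(targets) as the "not enough targets" value rather than the 1-based/0 convention of the question. It assumes num_items is at least 1.

```python
from itertools import islice


def last_selected_index(num_items, targets, threshold):
    """0-based index of the num_items-th target >= threshold, else len(targets)."""
    # Lazily yield the index of every target that meets the threshold.
    matching = (i for i, value in enumerate(targets) if value >= threshold)
    # Skip the first num_items - 1 matches and take the next one, if it exists.
    return next(islice(matching, num_items - 1, None), len(targets))


sample_targets = [5, 3, 4, 1, 3, 3, 7, 4]  # illustrative data
print(last_selected_index(3, sample_targets, 4))  # 6
print(last_selected_index(3, sample_targets, 6))  # 8: fewer than 3 targets are >= 6
```

Because the generator is lazy, the scan stops as soon as the num_items-th match is found, just like the explicit loop.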

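Approach

Your original idea was to iterate over all the people. That is very fast when the threshold is low, but gets very slow when the threshold is high, because we have to walk through a large part of the list before we find enough candidates.

Since I couldn't follow your code, I rewrote your original idea of iterating over everyone: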

def choose_first_n(num_items,targets,threshold):
    for target_id,target in enumerate(targets):
        if target >= threshold:
            num_items -= 1
            if num_items == 0:
                return target_id + 1
    return 0

def baker_queue(num_loaves_per_day,people_max_waiting_time,required_baking_times):
    results = []
    for today_baking_time in required_baking_times:
        results.append(choose_first_n(num_loaves_per_day, people_max_waiting_time, today_baking_time))
    return results

print(baker_queue(3, [1, 4, 4, 3, 1, 2, 6, 1, 9, 4, 4, 3, 1, 2, 6, 9, 4, 5, 8], [1, 2, 5, 4, 5, 4, 7]))
# Returns: [3, 4, 15, 7, 15, 7, 19], as in the original code.
# Also,please provide expected return values in future,like I did here.

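Using a heap is an interesting idea, but I don't think we benefit from it here. Heaps are only really fast at removing and inserting items, which we don't do. We just iterate over them.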
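As a side note on the snippet in the question: next() expects an iterator, so calling it on a list comprehension (which builds a list) raises TypeError; a generator expression or iter() is needed instead. And the smallest entry of a heapq heap is simply element 0, so next() isn't needed to peek at it. A minimal sketch, where queue and limit are made-up illustration values, and with the caveat that tuples are ordered by their first field, so the value has to come first if the heap should be ordered by value:

```python
import heapq

queue = [1, 4, 4, 3, 1, 2, 6]  # illustrative values
limit = 3                      # illustrative cut-off

# next() on a list fails; a generator expression works.
try:
    next([v for v in queue])
except TypeError as exc:
    print("a list is not an iterator:", exc)
first_big = next((v for v in queue if v >= limit), None)  # None if no match
print(first_big)  # 4

# Store (value, index) so the heap is ordered by value, then by index.
heap = [(value, index) for index, value in enumerate(queue)]
heapq.heapify(heap)

# heap[0] is always the smallest item: peek at it, pop while it is too small.
removed = []
while heap and heap[0][0] < limit:
    removed.append(heapq.heappop(heap))
print(removed)  # [(1, 0), (1, 4), (2, 5)]
print(heap[0])  # (3, 3)
```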

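The fastest approach I can think of is to preprocess the targets list into something more efficient to query, such as an "index" of the last selected item per threshold.

Demonstration: we take the previous code and look at the result as a function of the threshold: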

def choose_first_n(num_items,targets,threshold):
    for target_id,target in enumerate(targets):
        if target >= threshold:
            num_items -= 1
            if num_items == 0:
                return target_id + 1
    return 0

targets = [1, 4, 4, 3, 1, 2, 6, 1, 9, 4, 4, 3, 1, 2, 6, 9, 4, 5, 8]
num_items = 3

for threshold in range(10):
    result = choose_first_n(num_items, targets, threshold)
    print(f"Threshold: {threshold},Result: {result}")

Threshold: 0,Result: 3
Threshold: 1,Result: 3
Threshold: 2,Result: 4
Threshold: 3,Result: 4
Threshold: 4,Result: 7
Threshold: 5,Result: 15
Threshold: 6,Result: 15
Threshold: 7,Result: 19
Threshold: 8,Result: 19
Threshold: 9,Result: 0

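You can see that the result rises as the threshold rises. The relationship between threshold and result is monotonic: the result never decreases as the threshold increases (apart from the final 0, see below).

If we can compute the thresholds at which the result changes, we can find the result directly with a divide-and-conquer (binary) search, which is much faster than iterating over the list. (If you are familiar with Big-O notation: O(log n) instead of O(n).)

One thing to note is that the last result is 0, which breaks this scheme. This is why it would be beneficial to let indices start at 0 instead of 1 and to use len(targets) instead of 0 for the "error" case.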
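To make the idea concrete: the demonstration output changes only at a handful of thresholds, so it can be compressed into two short sorted lists and queried with bisect. The numbers below are read off the printed table, converted to 0-based indices, with 19 (one past the last index) standing for "not enough bread". The names breakpoints, last_index and lookup are hypothetical.

```python
import bisect

# Largest threshold for which each result still applies (sorted ascending).
breakpoints = [1, 3, 4, 6, 8]
# 0-based index of the last served person for thresholds up to that breakpoint.
last_index = [2, 3, 6, 14, 18]
NOT_ENOUGH = 19  # one past the last index, used as the failure value


def lookup(threshold):
    # First breakpoint that is >= threshold: O(log n) instead of a linear scan.
    pos = bisect.bisect_left(breakpoints, threshold)
    return last_index[pos] if pos < len(breakpoints) else NOT_ENOUGH


for threshold in range(10):
    print(threshold, lookup(threshold))
```

With the failure value placed above every valid index, the results stay sorted over the whole threshold range, which is what the binary search relies on.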

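Preprocessing

The hardest part is preprocessing the list into that mapping.

Let's look at it from a different angle.

For simplicity, say num_items is 3 and we have 10 targets. Will the selected targets lie within the first 5 targets?

The answer is: yes, if at least 3 of the first 5 targets are greater than or equal to the threshold. In other words, the third-largest number among the first 5 targets is the deciding factor. If the threshold is above that third-largest number, the selected targets will not all lie within the first 5 targets.

So for every item we need to compute the third-largest number among the items up to and including it. Funnily enough, this is where a heap actually comes in handy ;)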
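The "k-th largest so far" bookkeeping can be sketched on its own before it is embedded in the full preprocessing. The helper name running_kth_largest is hypothetical, and unlike the implementation in this answer it reports None until k values have been seen instead of the current heap minimum:

```python
import heapq


def running_kth_largest(values, k):
    """For each prefix of values, its k-th largest element (None if shorter than k)."""
    heap = []      # min-heap holding the k largest values seen so far
    result = []
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        else:
            # Push the new value and drop the smallest in a single step.
            heapq.heappushpop(heap, value)
        # Once the heap is full, its root is the k-th largest so far.
        result.append(heap[0] if len(heap) == k else None)
    return result


print(running_kth_largest([5, 3, 4, 1, 7], 3))  # [None, None, 3, 3, 4]
```

Each step costs O(log k), which is where the people * log(bread) term in the complexity comes from.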

Implementation

import heapq
import bisect

def preprocess(targets,num_items):
    # Our min-heap, will contain the num_items largest targets seen so far;
    # its root (index 0) is therefore the num_items-th largest value.
    largest_targets_heap = []

    # Our first preprocessing result, will contain the
    # third largest number between the first item and the current item,
    # for every item.
    third_largest_number_per_target = []

    # Compute the third largest previous value for every target
    for target in targets:
        heapq.heappush(largest_targets_heap,target)
        if len(largest_targets_heap) > num_items:
            heapq.heappop(largest_targets_heap)

        current_third_largest = largest_targets_heap[0]
        third_largest_number_per_target.append(current_third_largest)

    # We now have the third largest number for every target.
    # Now,consolidate that data into a lookup table,to prevent duplication.
    # Therefore,find the first occurrence of every number
    lookup_table_indices = []
    lookup_table_values = []
    current_value = third_largest_number_per_target[num_items - 1]

    # Push the (num_items-1)th value to account for the fact our heap wasn't filled up until the
    # first num_items were processed
    lookup_table_indices.append(num_items - 1)
    lookup_table_values.append(current_value)

    # Fill the rest of the lookup table
    for index,value in enumerate(third_largest_number_per_target):
        if index < num_items - 1:
            continue
        if value != current_value:
            lookup_table_indices.append(index)
            lookup_table_values.append(value)
            current_value = value

    # The lookup table we have,consisting of values,indices,a minimum and a maximum value
    lookup_table = (lookup_table_values,lookup_table_indices,num_items,len(targets))

    return lookup_table

def choose_first_n_preprocessed(lookup_table,threshold):
    (lookup_table_values, lookup_table_indices, min_value, max_value) = lookup_table

    # We need to find the first (value,index) pair in lookup table where value is larger or equal to threshold
    # We do this by using bisect,which is really fast. This is only possible because of our preprocessing.
    position = bisect.bisect_left(lookup_table_values,threshold)

    # If we didn't find a result in the preprocessed table, we return the max value, to indicate that the
    # threshold is too high.
    if position >= len(lookup_table_indices):
        return max_value

    # Read the result from the table of indices
    value = lookup_table_indices[position]
    return value

def baker_queue(num_loaves_per_day, people_max_waiting_time, required_baking_times):
    # Create the preprocessed lookup table
    lookup_table = preprocess(people_max_waiting_time,num_loaves_per_day)

    # For every day,compute the result
    results = []
    for today_baking_time in required_baking_times:
        # Use our fast lookup based algorithm now
        result = choose_first_n_preprocessed(lookup_table,today_baking_time)
        
        # Convert indices back to starting with 1,and 0 in error case,as
        # the original format was
        if result == len(people_max_waiting_time):
            results.append(0)
        else:
            results.append(result+1)
    return results

print(baker_queue(3, [1, 4, 4, 3, 1, 2, 6, 1, 9, 4, 4, 3, 1, 2, 6, 9, 4, 5, 8], [1, 2, 5, 4, 5, 4, 7]))
# [3, 4, 15, 7, 15, 7, 19]
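One way to gain confidence in a lookup-table approach like this is to compare it against the plain linear scan on random inputs. The block below is a self-contained sketch, not the code above: naive, build_table and fast are hypothetical names, and build_table is a compact variant that stores 1-based tickets directly and simply produces an empty table when there are fewer people than loaves.

```python
import bisect
import heapq
import random


def naive(num_items, targets, threshold):
    """Linear scan: 1-based ticket of the num_items-th match, or 0."""
    for position, value in enumerate(targets, start=1):
        if value >= threshold:
            num_items -= 1
            if num_items == 0:
                return position
    return 0


def build_table(targets, num_items):
    """Sorted breakpoint values and the matching 1-based tickets."""
    heap, values, tickets = [], [], []
    for position, value in enumerate(targets, start=1):
        if len(heap) < num_items:
            heapq.heappush(heap, value)
        else:
            heapq.heappushpop(heap, value)
        # Record an entry whenever the num_items-th largest value changes.
        if len(heap) == num_items and (not values or heap[0] != values[-1]):
            values.append(heap[0])
            tickets.append(position)
    return values, tickets


def fast(table, threshold):
    """Binary search in the table: same result as naive, in O(log n)."""
    values, tickets = table
    pos = bisect.bisect_left(values, threshold)
    return tickets[pos] if pos < len(tickets) else 0


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(200):
    people = [random.randint(0, 9) for _ in range(random.randint(0, 30))]
    loaves = random.randint(1, 5)
    table = build_table(people, loaves)
    for bake_time in range(11):
        assert fast(table, bake_time) == naive(loaves, people, bake_time)
print("preprocessed lookup agrees with the linear scan")
```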

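Theoretical analysis

This should now be a lot faster, especially for a large number of days, but also for a large number of people.

The complexity of the naive implementation is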

O(days * people)

The complexity of the preprocessed implementation is

O(people * log(bread) + days * log(people))

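This doesn't sound very different, but it is. It basically says that if the limiting factor is the number of people, the number of days hardly matters, and if the limiting factor is the number of days, the number of people hardly matters.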
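Plugging sample sizes into the two formulas gives a feel for the gap. The sizes below are assumptions chosen for illustration, and the counts ignore constant factors, so they indicate orders of magnitude rather than predict timings:

```python
import math

# Sample sizes, assumed for illustration only.
people, days, loaves = 10_000, 10_000, 900

# O(days * people): every day scans the queue.
naive_steps = days * people
# O(people * log(bread) + days * log(people)): one heap pass, then one bisect per day.
preprocessed_steps = people * math.log2(loaves) + days * math.log2(people)

print(f"naive:        ~{naive_steps:.1e} steps")
print(f"preprocessed: ~{preprocessed_steps:.1e} steps")
print(f"ratio:        ~{naive_steps / preprocessed_steps:.0f}x")
```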

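Benchmark results

The setup was:

900 loaves
10,000 people
10,000 days

Results:

Naive: 2.13 seconds
Preprocessed: 0.012 seconds

I then pushed the preprocessed algorithm until it also took about 2 seconds, and got these numbers:

90,000 loaves
1,000,000 people
1,000 days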

I didn't run these numbers on the naive algorithm, but the math says it would take about 2,000,000 seconds, or roughly 23 days.

This took a while; I hope it was worth it ;)

I think this is my longest post so far. It was a really interesting task!

I hope you appreciate it.

Regards