Algorithm to group people based on their available time slots/calendars


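I generated a sample data set of 1000 people containing their available time slots during the day. Each time slot is a 30-minute interval. 0 means they are free in that slot and 1 means they are busy.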

For example:

| Time        | Sally | Mark | Nish |
| ------------| ----- | ---- | ---- |
| 0900 - 0930 |   0   |  1   |   1  |
| 0930 - 1000 |   1   |  0   |   1  |
| 1000 - 1030 |   1   |  1   |   1  |
| 1030 - 1100 |   1   |  0   |   1  |
| 1100 - 1130 |   0   |  1   |   1  |
| 1200 - 1230 |   1   |  0   |   0  |

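I want to create the maximum number of groups of 5 people such that each group has at least one available time slot in common. The groups should be mutually exclusive (no person belongs to more than one group). I want to maximize the number of successful groups created.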
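To make the success condition concrete, here is a minimal sketch of the "common free slot" check on the three people from the table. The dictionary layout and the helper name `common_free_slots` are assumptions made for illustration, not part of the original data set.

```python
# 0 = free, 1 = busy; one entry per 30-minute row of the example table.
busy = {
    "Sally": [0, 1, 1, 1, 0, 1],
    "Mark":  [1, 0, 1, 0, 1, 0],
    "Nish":  [1, 1, 1, 1, 1, 0],
}

def common_free_slots(group, busy):
    """Return the indices of the slots in which every member of `group` is free."""
    n_slots = len(next(iter(busy.values())))
    return [s for s in range(n_slots) if all(busy[p][s] == 0 for p in group)]

print(common_free_slots(["Mark", "Nish"], busy))   # [5], the 1200 - 1230 row
print(common_free_slots(["Sally", "Mark"], busy))  # []
```

A group counts as successful when this list is non-empty.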

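At present I am using a fairly crude algorithm. I sample 5 people from the data set and check whether they have a common available time slot. If they do, I remove them from the data set and repeat the process. If they do not, I resample another 5 people and keep trying until I find 5 with a common time slot. If no such sample of 5 is found after 1000 resamples, the algorithm stops.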

This seems very inefficient to me, and I am wondering whether there is a better way to do this.

Solution

I would go along the lines of the following pseudocode:

For every timeslot, count the number of available people.
Sort the timeslots in ascending order of available people.

For every person, count the number of available timeslots.
Sort the people in ascending order of available timeslots.

While people are available:
    Enumerate the sorted list of timeslots:
        Repeat for the timeslot as long as 5+ people are available:
            Collect five people available for this timeslot.
            Remove them as one group from the list of people.
            Decrease the availability count of every timeslot those five people were free in.
    Break the while loop if no group was formed during the iteration.

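The rationale behind the sorting is to use the rare time slots first and to assign the people with few options first. This leaves the crowded slots and the people with many options for the final rounds, which increases the chance of forming more groups.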
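The pseudocode above can be sketched in Python roughly as follows. This is one possible reading, not a definitive implementation: the name `greedy_groups`, the dict-of-lists input, and the random demo data (16 slots, about 20% free) are assumptions. The sketch makes a single pass over the slots, because after one pass no slot has five free people left, so a second pass could not form a group. Being greedy, it is a heuristic and is not guaranteed to find the maximum number of groups.

```python
import random

def greedy_groups(busy, group_size=5):
    """busy maps person -> list of 0/1 flags (0 = free).

    Returns a list of (slot_index, members) with pairwise disjoint members.
    """
    n_slots = len(next(iter(busy.values())))
    # People still unassigned and free, per slot.
    free = {s: {p for p, row in busy.items() if row[s] == 0} for s in range(n_slots)}
    # Number of free slots per person: fewer means less flexible.
    flexibility = {p: row.count(0) for p, row in busy.items()}

    groups = []
    # Visit the rarest slots first (order fixed up front, as in the pseudocode).
    for s in sorted(free, key=lambda slot: len(free[slot])):
        while len(free[s]) >= group_size:
            # Take the least flexible people first; the name breaks ties.
            members = sorted(free[s], key=lambda p: (flexibility[p], p))[:group_size]
            groups.append((s, members))
            # Grouped people are no longer available in any slot.
            for candidates in free.values():
                candidates.difference_update(members)
    return groups

# Hypothetical demo data: 1000 people, 16 slots, each slot busy with probability 0.8.
random.seed(0)
people = {f"p{i:04d}": [int(random.random() < 0.8) for _ in range(16)]
          for i in range(1000)}
print(len(greedy_groups(people)), "groups formed")
```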

Yes, your Monte Carlo solution is very time-consuming. Depending on the density of available slots, it can be fatally flawed.

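Instead, build the feasible sets. For each slot, build the set of all people who are available. Now all you need to do is iterate over all 5-member subsets of that set. Repeat this for each time slot. You then have every group of five members that shares at least one time slot.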
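A minimal sketch of that enumeration, using `itertools.combinations`, might look like this. The generator name `feasible_groups` and the input layout are assumptions. Note that this only lists candidate groups; picking the largest set of disjoint groups from them is a separate step, and the number of candidates grows very quickly with the number of people free in a slot.

```python
from itertools import combinations
from math import comb

def feasible_groups(busy, group_size=5):
    """Yield (slot_index, group) for every group_size-subset of the people free in each slot."""
    n_slots = len(next(iter(busy.values())))
    for s in range(n_slots):
        free = sorted(p for p, row in busy.items() if row[s] == 0)
        for group in combinations(free, group_size):
            yield s, group

# Six people who are all free in a single slot give C(6, 5) = 6 candidate groups.
tiny = {name: [0] for name in "ABCDEF"}
print(sum(1 for _ in feasible_groups(tiny)))  # 6

# With 200 people free in one slot, that slot alone has this many candidates:
print(comb(200, 5))  # 2535650040
```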