Tic-tac-toe AI plays really badly! - Possible error in the minimax algorithm (CS50 AI)

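I'm currently taking CS50's Introduction to Artificial Intelligence course, and I need to implement several functions to get a tic-tac-toe game running. When I play it, though, the AI plays terribly and usually picks the top-left square. I'm fairly sure this has to do with my minimax function. Some debugging showed that the variables foo and bar (which hold the value of min_value(result(s, a)) or max_value(result(s, a)), so that the maximizing player can pick the highest value and the minimizing player the lowest) never change and stay at their initial values of negative infinity and infinity. I don't understand why that happens. The code is below; any help would be great!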

def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    #Checking if game is over
    if terminal(board):
        return None
    else:
        #Check whose turn it is
        turn = player(board)
        board_actions = actions(board)
        if turn == 'X':
            action_score_max = -math.inf
            return_value_min = board_actions[0]
            #return_value_max 
            for a in board_actions:
                foo = min_value(result(board,a))
                if foo > action_score_max:
                    action_score_max = foo
                    return_value_max = a
            
            return return_value_max

        else:
            action_score_min = math.inf
            return_value_min = board_actions[0]
            for a in board_actions:
                bar = max_value(result(board,a))
                if bar < action_score_min:
                    action_score_min = bar
                    return_value_min = a
            
            return return_value_min




def max_value(board):

    """
    Helper function for minimax (pick max value value of all routes)
    """

    v = -math.inf

    for action in actions(board):
        v = max(v,min_value(result(board,action)))
    
    return v



def min_value(board):

    """
    Helper function for minimax (pick min value value of all routes)
    """

    v = math.inf

    for action in actions(board):
        v = min(v,max_value(result(board,action)))

    return v

Solution

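As the docstring of minimax says, its job is to return the optimal move for the current player. The two helper functions, max_value and min_value, are where that logic belongs. In your version they never check terminal(board) or call utility(board), so the recursion only stops when no actions are left, and the only values that ever propagate back up are the initial math.inf and -math.inf. Have each helper return utility(board) on a terminal board, and have it return the best move along with its value.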
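To see why the scores in the question never move off infinity, here is a minimal sketch on a toy game tree. The tuple-based nodes and the `_no_base` / `_with_base` function names are made up for illustration and are not part of the CS50 code. Without a base case, the recursion bottoms out only where a node has no children, so nothing but the initial infinities can come back up.

```python
import math

# Hypothetical toy tree: a node is (utility, children). Leaves have no
# children, like a full tic-tac-toe board where actions(board) is empty.
WIN, DRAW, LOSS = (1, []), (0, []), (-1, [])
ROOT = (None, [(None, [WIN, DRAW]), (None, [LOSS, DRAW])])

def max_value_no_base(node):
    # Mirrors the question's helper: no terminal check, no utility lookup.
    v = -math.inf
    for child in node[1]:
        v = max(v, min_value_no_base(child))
    return v

def min_value_no_base(node):
    v = math.inf
    for child in node[1]:
        v = min(v, max_value_no_base(child))
    return v

def max_value_with_base(node):
    utility, children = node
    if not children:  # base case: a finished game has a real score
        return utility
    return max(min_value_with_base(child) for child in children)

def min_value_with_base(node):
    utility, children = node
    if not children:
        return utility
    return min(max_value_with_base(child) for child in children)

print(max_value_no_base(ROOT))    # -inf
print(max_value_with_base(ROOT))  # 0
```

The only difference between the two pairs is the base case, which is what lets real utilities (1, 0, -1) replace the starting infinities.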

You could do it like this:

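def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    if terminal(board):
        return None

    # O minimizes the score, X maximizes it. Each helper returns [value, move].
    # O is assumed to be the constant defined at the top of tictactoe.py.
    if player(board) == O:
        move = min_value(board)[1]
    else:
        move = max_value(board)[1]
    return move

def max_value(board):
    """
    Returns [value, best_move] for the maximizing player.
    """
    # Base case: a finished game has a real score and no move left to make.
    if terminal(board):
        return [utility(board), None]
    v = float('-inf')
    best_move = None
    for action in actions(board):
        # Value of this move if the opponent then replies optimally.
        hypothetical_value = min_value(result(board, action))[0]
        if hypothetical_value > v:
            v = hypothetical_value
            best_move = action
    return [v, best_move]


def min_value(board):
    """
    Returns [value, best_move] for the minimizing player.
    """
    if terminal(board):
        return [utility(board), None]
    v = float('inf')
    best_move = None
    for action in actions(board):
        hypothetical_value = max_value(result(board, action))[0]
        if hypothetical_value < v:
            v = hypothetical_value
            best_move = action
    return [v, best_move]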
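If you want to sanity-check this approach outside the course files, something like the following self-contained sketch should work. The helpers `player`, `actions`, `result`, `winner`, `terminal` and `utility` here are stand-ins I wrote for the example, not the CS50 distribution code, and `actions` returns a list only so the move order is deterministic. The minimax functions follow the same value-plus-move idea, written with tuples.

```python
import copy

X, O, EMPTY = "X", "O", None

def player(board):
    # Assumes X moves first, so X is to move whenever the counts are equal.
    xs = sum(row.count(X) for row in board)
    os_ = sum(row.count(O) for row in board)
    return X if xs == os_ else O

def actions(board):
    # All empty cells as (row, col) pairs.
    return [(i, j) for i in range(3) for j in range(3) if board[i][j] is EMPTY]

def result(board, action):
    # Board after the current player moves; the input board is not mutated.
    i, j = action
    new_board = copy.deepcopy(board)
    new_board[i][j] = player(board)
    return new_board

def winner(board):
    lines = [[(i, 0), (i, 1), (i, 2)] for i in range(3)]           # rows
    lines += [[(0, j), (1, j), (2, j)] for j in range(3)]          # columns
    lines += [[(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)]]  # diagonals
    for line in lines:
        marks = {board[i][j] for i, j in line}
        if len(marks) == 1 and EMPTY not in marks:
            return marks.pop()
    return None

def terminal(board):
    return winner(board) is not None or not actions(board)

def utility(board):
    return {X: 1, O: -1, None: 0}[winner(board)]

def max_value(board):
    if terminal(board):
        return utility(board), None
    best = (float("-inf"), None)
    for action in actions(board):
        value = min_value(result(board, action))[0]
        if value > best[0]:
            best = (value, action)
    return best

def min_value(board):
    if terminal(board):
        return utility(board), None
    best = (float("inf"), None)
    for action in actions(board):
        value = max_value(result(board, action))[0]
        if value < best[0]:
            best = (value, action)
    return best

def minimax(board):
    if terminal(board):
        return None
    helper = max_value if player(board) == X else min_value
    return helper(board)[1]

# X to move with two marks in the top row: the winning move is (0, 2).
x_can_win = [[X, X, EMPTY],
             [O, O, EMPTY],
             [EMPTY, EMPTY, EMPTY]]
print(minimax(x_can_win))  # prints (0, 2)
```

A position where the AI has an immediate win (or must block one) is a quick test, because the expected move is obvious and the search is small.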
