Reinforcement learning - PacMan only runs for one episode

How to fix: reinforcement learning - PacMan only runs for one episode

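I am trying to train an agent on Pac-Man, but the problem is that it only runs for one episode. Since one episode consists of three lives, I added the variable dead to check whether the episode is already over. Unfortunately, the agent stops training after one episode, and I can't work out why. This is the code of my main function, which should be sufficient: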

def main():
    if __name__ =='__main__':
        env = gym.make('MsPacman-v0')
        #env = gym.make('FrozenLake-v0')
        
        
        state_size = (88,80,1)
        action_size = env.action_space.n
        episodes = 1000
        batch_size = 32
        skip_start = 90
        total_time = 0
        all_reward = 0
        blend = 4 # Number of images to blend
        done = False
        gamma = 0.99
        
        agent = Agent(state_size,action_size,gamma,epsilon = 1.0,epsilon_min = 0.1,epsilon_decay = 0.995,update_rate = 50)
        
        
        for e in range(episodes):
            total_reward = 0
            game_score = 0
            scores = []
            tot_reward = []
            tot_episodes = []
            #state = env.reset()
            state = process_frame(env.reset())
            images = deque(maxlen = blend)
            images.append(state)
            dead = False
            lives = 2
            #for skip in range(skip_start):
            #    env.step(0)
            
            
                
            while not done:
                dead = False
                
                while not dead:
                    
                    env.render()
                    total_time += 1
                    
                    if total_time % agent.update_rate == 0:
                        agent.update_target_model()
                    
                    state = blend_images(images,blend)
                    
                    action = agent.epsilon_greedy(state)
                    next_state,reward,done,info = env.step(action)
                    
                    
                    game_score += reward
                    
                    next_state = process_frame(next_state)
                    images.append(next_state)
                    next_state = blend_images(images,blend)
                    
                    agent.remember(state,action,next_state,done)
                    
                    state = next_state
                    
                    dead = info['ale.lives']<lives
                    lives = info['ale.lives']
                    
                    print("episode: {}/{},game score: {},avg reward: {}"
                          .format(e+1,episodes,game_score,all_reward/(e+1)))
                    print(dead)
                    total_reward += game_score if not dead else -100
            if done:
                scores.append(game_score)
                tot_reward.append(total_reward)
                tot_episodes.append(e)
                #all_reward += game_score
                #print(total_reward)
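
Judging only from the code shown, one likely cause is that done is assigned False once, before the episode loop, and is never cleared afterwards. Once the first episode ends, done stays True, so `while not done` is skipped for every later episode. The sketch below reproduces that behaviour in isolation. It is a minimal illustration, not the original training code: FakeLivesEnv and run are hypothetical names, and the fake environment only mimics the old-style gym step return (observation, reward, done, info) with an 'ale.lives' entry.

```python
import random


class FakeLivesEnv:
    """Hypothetical stand-in for MsPacman-v0: 3 lives, old-style gym step API."""

    def reset(self):
        self.lives = 3
        return 0  # dummy observation

    def step(self, action):
        # Lose a life with 20% probability on each step.
        if random.random() < 0.2:
            self.lives -= 1
        done = self.lives == 0
        return 0, 1.0, done, {"ale.lives": self.lives}


def run(episodes, reset_done_each_episode):
    """Return how many steps were played in each episode."""
    env = FakeLivesEnv()
    done = False  # assigned once before the loop, as in the question
    steps_per_episode = []
    for _ in range(episodes):
        env.reset()
        if reset_done_each_episode:
            done = False  # proposed change: clear the flag for every episode
        lives = 3
        steps = 0
        while not done:
            _, _, done, info = env.step(0)
            steps += 1
            if info["ale.lives"] < lives:
                lives = info["ale.lives"]  # a life was lost, the episode goes on
        steps_per_episode.append(steps)
    return steps_per_episode


random.seed(0)
stale = run(5, reset_done_each_episode=False)
fresh = run(5, reset_done_each_episode=True)
print("done never reset:", stale)  # only the first entry is non-zero
print("done reset per episode:", fresh)  # every entry is non-zero
```

Applied to the code in the question, this would mean moving `done = False` to the top of the `for e in range(episodes):` body, next to `dead = False`. A single `while not done:` loop that tracks lives inside it, as in the sketch, also avoids the nested `while not dead:` loop.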
                                
