
High loss for neural-network sequence classification

How to fix a high loss when classifying sequences with a neural network

I am using a neural network to classify sequences of length 340 into 8 classes, with cross-entropy as the loss, and the loss values I get are very high. I would like to know whether I am making a mistake in how I compute the loss for each epoch, or whether I should be using a different loss function.

import torch
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
if CUDA:
    criterion = criterion.cuda()
optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=0.9)

for epoch in range(N_EPOCHES):
    running_loss = 0.0
    loss_values = []
    sum_acc = 0
    model.train()
    # Training
    for i, (seq_batch, stat_batch) in enumerate(training_generator):
        # Transfer to GPU
        seq_batch, stat_batch = seq_batch.to(device), stat_batch.to(device)
        optimizer.zero_grad()
        # Model computation
        seq_batch = seq_batch.unsqueeze(-1)
        outputs = model(seq_batch)
        # nn.CrossEntropyLoss expects raw logits and integer class indices;
        # taking argmax of the outputs here would break backpropagation.
        loss = criterion(outputs, stat_batch.argmax(1))
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item() * seq_batch.size(0)
        loss_values.append(running_loss / len(training_set))
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d,%5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 50000),
                  "acc", (outputs.argmax(1) == stat_batch.argmax(1)).float().mean())
            running_loss = 0.0
        sum_acc += (outputs.argmax(1) == stat_batch.argmax(1)).float().sum()
    print("epoch", epoch, sum_acc / len(training_generator))
print('Finished Training')
                                                                                                                                        
[1, 2000] loss: 14.205 acc tensor(0.5312, device='cuda:0')
[1, 4000] loss: 13.377 acc tensor(0.4922, device='cuda:0')
[1, 6000] loss: 13.159 acc tensor(0.5508, device='cuda:0')
[1, 8000] loss: 13.050 acc tensor(0.5547, device='cuda:0')
[1,10000] loss: 12.974 acc tensor(0.4883, device='cuda:0')
epoch 1 acc tensor(133.6352, device='cuda:0')
[2, 2000] loss: 12.833 acc tensor(0.5781, device='cuda:0')
[2, 4000] loss: 12.834 acc tensor(0.5391, device='cuda:0')
[2, 6000] loss: 12.782 acc tensor(0.5195, device='cuda:0')
[2, 8000] loss: 12.774 acc tensor(0.5508, device='cuda:0')
[2,10000] loss: 12.762 acc tensor(0.5156, device='cuda:0')
epoch 2 acc tensor(139.2496, device='cuda:0')
[3, 2000] loss: 12.636 acc tensor(0.5469, device='cuda:0')
[3, 4000] loss: 12.640 acc tensor(0.5469, device='cuda:0')
[3, 6000] loss: 12.648 acc tensor(0.5508, device='cuda:0')
[3, 8000] loss: 12.637 acc tensor(0.5586, device='cuda:0')
[3,10000] loss: 12.620 acc tensor(0.6016, device='cuda:0')
epoch 3 acc tensor(140.6962, device='cuda:0')
[4, 2000] loss: 12.520 acc tensor(0.5547, device='cuda:0')
[4, 4000] loss: 12.541 acc tensor(0.5664, device='cuda:0')
[4, 6000] loss: 12.538 acc tensor(0.5430, device='cuda:0')
[4, 8000] loss: 12.535 acc tensor(0.5547, device='cuda:0')
[4,10000] loss: 12.548 acc tensor(0.5820, device='cuda:0')
epoch 4 acc tensor(141.6522, device='cuda:0')

Solution

"My loss is very high"

What makes you think this is high? What are you comparing it against?

Yes, you should use nn.CrossEntropyLoss for a multi-class classification task, and your training loss looks fine to me. At initialization, you should expect loss = -log(1/8) ≈ 2.08.
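
That baseline is easy to sanity-check: an untrained model should spread probability roughly uniformly over the 8 classes, and the cross-entropy of a uniform prediction is -log(1/8) ≈ 2.079. A minimal standalone check (all names here are illustrative, not from the question):

import math
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# All-zero logits give a uniform softmax of 1/8 per class,
# so the loss should match -log(1/8) regardless of the labels.
logits = torch.zeros(16, 8)             # batch of 16, 8 classes
targets = torch.randint(0, 8, (16,))    # arbitrary class indices

print(criterion(logits, targets).item())  # ~2.0794
print(-math.log(1 / 8))                   # 2.0794...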
