
Linear classification from scratch with PyTorch

How do I fix my from-scratch linear classifier in PyTorch?

I am trying to implement a linear classifier in PyTorch: a single layer with tensors W and b and a softmax cross-entropy loss. For each batch I have to do the following (a small sketch of these steps is shown right after the list):

  1. compute the logits
  2. turn the logits into probabilities with softmax
  3. compute the most likely class
  4. compute the cross-entropy between the true and predicted classes
  5. update W and b with the optimizer
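
For reference, here is a minimal, self-contained sketch of those five steps on random data; all sizes and tensor names below are made up for illustration and are not from my actual setup:

import torch
import torch.nn.functional as F

# toy sizes, purely illustrative
n_samples, n_features, n_classes = 8, 4, 3
X = torch.randn(n_samples, n_features)
y = torch.randint(0, n_classes, (n_samples,))

W = torch.randn(n_features, n_classes, requires_grad=True)
b = torch.zeros(n_classes, requires_grad=True)
optimizer = torch.optim.SGD([W, b], lr=0.1)

logits = X @ W + b                      # 1. logits, shape [n_samples, n_classes]
probas = F.softmax(logits, dim=1)       # 2. probabilities
preds = probas.argmax(dim=1)            # 3. most likely class per sample
loss = F.cross_entropy(logits, y)       # 4. cross-entropy (log_softmax + NLL in one call)
optimizer.zero_grad()
loss.backward()
optimizer.step()                        # 5. update W and b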

So far I have this (flattened MNIST loaded with scikit-learn):

import torch
import torch.nn.functional as torch_f

# convert NumPy arrays to PyTorch tensors on the target device
input_X_train = torch.from_numpy(X_train_flat).float().to(device)
input_X_val = torch.from_numpy(X_val_flat).float().to(device)
input_X_test = torch.from_numpy(X_test_flat).float().to(device)

input_y_train = torch.from_numpy(y_train).long().to(device)
input_y_val = torch.from_numpy(y_val).long().to(device)
input_y_test = torch.from_numpy(y_test).long().to(device)

# model parameters: W and b
W = torch.randn(input_dim, output_dim, device=device, dtype=dtype, requires_grad=True)
b = torch.randn(1, requires_grad=True)

BATCH_SIZE = 512
EPOCHS = 40
LEARNING_RATE = 1e-6

# create torch.optim.Adam optimizer for loss function minimization
optimizer = torch.optim.Adam([W, b], lr=LEARNING_RATE)

# create negative log loss function object for loss function evaluation
# use mean loss value from all batch samples
loss_fn = torch.nn.NLLLoss(reduction="mean")

for t in range(EPOCHS):
    # logits for input_X, resulting shape should be [input_X.shape[0], 10]
    logits = torch.matmul(input_X_train, W) + b

    # apply torch.nn.functional.softmax (torch_f.softmax) to logits
    probas = torch_f.softmax(logits, dim=1)

    # apply torch.argmax to find the class index with the highest probability
    classes = torch.argmax(probas, dim=1)

    # loss should be a scalar number: average loss over all the objects with torch.mean()
    # PyTorch implements negative log loss (NLL) *without* the log - you have to first compute
    # the log of the softmax, then the negative log loss, which will swap the sign

    # Use torch.nn.functional.log_softmax (torch_f.log_softmax) on the logits, then NLLLoss with input_y.
    # It is identical to calculating cross-entropy (log and then NLL) on top of probas,
    # but is more numerically friendly (read the docs).
    log_probas = torch_f.log_softmax(logits, dim=1)
    loss = loss_fn(log_probas, input_y_train)

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because, by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()
    
    # calculate backward gradients for backpropagation
    loss.backward()
    
    # Calling the step function on an Optimizer makes an update to its parameters
    optimizer.step()
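
As a sanity check of the loss computation described in the comments above, this standalone sketch (with made-up random logits and labels) shows that log_softmax followed by NLLLoss gives the same value as torch.nn.functional.cross_entropy applied directly to the logits:

import torch
import torch.nn.functional as F

fake_logits = torch.randn(5, 10)             # 5 samples, 10 classes, made up
fake_targets = torch.randint(0, 10, (5,))    # made-up class labels

nll = torch.nn.NLLLoss(reduction="mean")
loss_a = nll(F.log_softmax(fake_logits, dim=1), fake_targets)  # log_softmax + NLL
loss_b = F.cross_entropy(fake_logits, fake_targets)            # fused equivalent

print(torch.allclose(loss_a, loss_b))        # expected: True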

For some reason W and b do not change. What am I doing wrong?

EDIT: In the code above I have already seen and tried, e.g., this minimal working example https://discuss.pytorch.org/t/minimal-working-example-of-optim-sgd/11623/2

EDIT 2: The gradient W.grad often comes out that way, and I don't think it should. The class probabilities are definitely right (so it is not like this example), since I have checked the sum of every row: for every sample the probabilities over all classes add up to 1.
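
For reference, the row-sum check mentioned above boils down to something like this (a standalone sketch with random logits, not my actual data):

import torch
import torch.nn.functional as F

fake_logits = torch.randn(7, 10)             # made-up logits just for the check
probas = F.softmax(fake_logits, dim=1)
row_sums = probas.sum(dim=1)                 # one sum per sample
print(torch.allclose(row_sums, torch.ones_like(row_sums)))  # expected: True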
