How do I fix this? I don't understand PyTorch
I'm new to neural networks, so I apologize if I describe anything incorrectly. I'm trying to build a model that generates text character by character with PyTorch, using a fixed window of 5 characters. The model itself looks okay to me, but after I compute the loss function and train the model, all I get is the same letter repeated over and over:
èJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
%333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333
őJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
I was expecting something sentence-like instead. I'm not yet comfortable with tensor dimensions, so I suspect I've permuted a tensor incorrectly somewhere.
So, my model is:
self.pad = nn.ZeroPad2d((5, 0))  # left-pad the last dimension with 5 zeros
self.emb = nn.Embedding(n_tokens, emb_size)  # max_norm=True
self.conv_1 = nn.Conv1d(emb_size, hid_size, kernel_size=6, stride=1, padding=0)
self.fc = nn.Linear(hid_size, n_tokens)
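To see what that padding layer actually does to a batch of token indices, here is a small sketch (the token values are made up). With a 2-tuple, the padding is applied only to the last dimension, so 5 zeros are prepended to each sequence:

```python
import torch
import torch.nn.functional as F

# F.pad with a 2-tuple (left, right) pads only the last dimension,
# which is what ZeroPad2d((5, 0)) ends up doing here.
tokens = torch.tensor([[3, 1, 4, 1, 5]])  # [batch=1, seq=5], illustrative values
padded = F.pad(tokens, (5, 0))            # [1, 10]
print(padded)  # tensor([[0, 0, 0, 0, 0, 3, 1, 4, 1, 5]])
```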
Here I permute it twice, because otherwise it doesn't work:
input = self.pad(input)          # [batch, seq] -> [batch, seq + 5]
input = self.emb(input)          # -> [batch, seq + 5, emb_size]
input = input.permute(0, 2, 1)   # -> [batch, emb_size, seq + 5]; Conv1d wants channels on dim 1
input = self.conv_1(input)       # -> [batch, hid_size, seq]
input = input.permute(0, 2, 1)   # -> [batch, seq, hid_size] (permute(0, 1) would error on a 3D tensor)
input = self.fc(input)           # -> [batch, seq, n_tokens]
As output I get a result of shape [batch_size, sequence_size, number_of_tokens].
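Putting the pieces together, a minimal self-contained sketch that reproduces these shapes could look like this (n_tokens=30, emb_size=16, hid_size=32 are illustrative assumptions, not values from the original code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of the model above with made-up sizes, to check the tensor shapes.
class CharConvModel(nn.Module):
    def __init__(self, n_tokens, emb_size=16, hid_size=32):
        super().__init__()
        self.emb = nn.Embedding(n_tokens, emb_size)
        self.conv_1 = nn.Conv1d(emb_size, hid_size, kernel_size=6, stride=1, padding=0)
        self.fc = nn.Linear(hid_size, n_tokens)

    def forward(self, x):               # x: [batch, seq]
        x = F.pad(x, (5, 0))            # left-pad the token indices: [batch, seq + 5]
        x = self.emb(x)                 # [batch, seq + 5, emb_size]
        x = x.permute(0, 2, 1)          # [batch, emb_size, seq + 5]
        x = self.conv_1(x)              # kernel_size=6 consumes the 5 pads: [batch, hid_size, seq]
        x = x.permute(0, 2, 1)          # [batch, seq, hid_size]
        return self.fc(x)               # [batch, seq, n_tokens]

model = CharConvModel(n_tokens=30)
out = model(torch.randint(0, 30, (4, 10)))
print(out.shape)  # torch.Size([4, 10, 30])
```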
When I compute the loss, I don't really understand what my tensor dimensions have to look like before going into the loss function. Right now it looks like this:
input = torch.as_tensor(input_ix, dtype=torch.int64)
logits = model(input)
reference_answers = input
mask = compute_mask(input).to(torch.int32)  # mask is needed because I pad every example to the same sequence length
criterion = nn.CrossEntropyLoss()
probs = nn.Softmax(dim=-1)  # note: nn.softmax does not exist, the module is nn.Softmax
softmax_output = probs(logits)
mask_ = mask.unsqueeze(-1).expand(softmax_output.size())
softmax_masked = softmax_output * mask_
softmax_masked = softmax_masked.permute(0, 2, 1)  # [batch, n_tokens, seq]; CrossEntropyLoss expects classes on dim 1
loss = criterion(softmax_masked, reference_answers)
return loss
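For reference, the usual pattern is to give CrossEntropyLoss the raw logits (it applies log-softmax internally, so applying softmax first is a mistake) and to handle padding with `ignore_index` rather than multiplying probabilities by a mask. A sketch with made-up shapes; `PAD_IX` is an assumed padding index, not something from the original code:

```python
import torch
import torch.nn as nn

PAD_IX = 0                                  # assumed index of the padding token
batch, seq, n_tokens = 4, 10, 30            # illustrative sizes

logits = torch.randn(batch, seq, n_tokens)  # raw model output, no softmax applied
targets = torch.randint(1, n_tokens, (batch, seq))
targets[:, -2:] = PAD_IX                    # pretend the last 2 positions are padding

# Flatten to [batch*seq, n_tokens] vs [batch*seq]; ignore_index skips padded positions,
# so no explicit mask multiplication is needed.
criterion = nn.CrossEntropyLoss(ignore_index=PAD_IX)
loss = criterion(logits.reshape(-1, n_tokens), targets.reshape(-1))
print(loss.item())
```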
I feel like I'm doing everything wrong here, and the loss is terrible. I'd just like to understand how this is supposed to work in practice, because I don't have an example to follow.