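How to fix a neural network that does not converge for a single input
I am trying to reproduce the neural network for shape and appearance disentangling. The network was written in TensorFlow, and I want to write it in PyTorch. The model looks like this: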
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    def __init__(self, parts=16, n_features=32):
        super(Model, self).__init__()
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        # Returns softmax-normalized maps and a feature stack (see E.forward)
        self.E_sigma = E(3, parts, residual_dim=64, sigma=True)
        # Note: as posted, residual_dim is not passed here although E.__init__ requires it
        self.E_alpha = E(1, n_features, sigma=False)
        self.decoder = Decoder(parts, n_features)

    def forward(self, x):
        sig, stack = self.E_sigma(x)
        f_xs = self.E_alpha(stack)
        alpha = get_local_part_appearances(f_xs, sig)
        mu, L_inv = get_mu_and_prec(sig, self.device)
        encoding = feat_mu_to_enc(alpha, mu, L_inv, self.device)
        reconstruction = self.decoder(encoding)
        return reconstruction
The model consists of three nn.Module components: E_sigma, E_alpha, and Decoder. The encoder E, for example, looks like this:
class E(nn.Module):
    def __init__(self, depth, n_out, residual_dim, sigma=True):
        super(E, self).__init__()
        self.sigma = sigma
        self.hg = Hourglass(depth, residual_dim)  # depth 4 has bottleneck of 4x4
        self.n_out = Conv(residual_dim, kernel_size=1, stride=1, bn=True, relu=True)
        if self.sigma:
            self.preprocess_1 = Conv(3, 64, kernel_size=6, stride=2, relu=True)  # transform to 64 x 64 for sigma
            self.preprocess_2 = Residual(64, residual_dim)
            self.map_transform = Conv(n_out, 1, 1)  # channels for addition must be increased

    def forward(self, x):
        if self.sigma:
            x = self.preprocess_1(x)
            x = self.preprocess_2(x)
        out = self.hg(x)
        map = self.n_out(out)
        if self.sigma:
            # Softmax over all spatial positions of each channel
            map_normalized = F.softmax(map.reshape(map.size(0), map.size(1), -1), dim=2).view_as(map)
            map_transform = self.map_transform(map_normalized)
            stack = map_transform + x  # Why not stack? x is much larger than map_transform, so it has almost no impact
            return map_normalized, stack
        else:
            return map
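To make the scale remark in the code comment concrete: the softmax in E.forward is taken over all spatial positions of each channel, so every map sums to 1 and a typical entry is on the order of 1 / (H * W). Below is a minimal plain-Python sketch of that normalization for a single map. spatial_softmax is a hypothetical helper written for illustration, and the 64 x 64 size is an assumption based on the comment next to preprocess_1.

```python
import math


def spatial_softmax(fmap):
    """Softmax over all positions of one H x W map given as a list of rows."""
    flat = [v for row in fmap for v in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)
    probs = [e / total for e in exps]
    width = len(fmap[0])
    # Restore the original H x W layout
    return [probs[i:i + width] for i in range(0, len(probs), width)]


# Hypothetical input: a 64 x 64 map of zeros, which normalizes to a uniform map.
size = 64
uniform = spatial_softmax([[0.0] * size for _ in range(size)])
total_mass = sum(sum(row) for row in uniform)
largest = max(max(row) for row in uniform)
print(total_mass)  # the whole map sums to (approximately) 1
print(largest)     # each entry is 1 / (64 * 64), about 0.00024
```

For a 64 x 64 map the entries average 1/4096, which would be consistent with map_transform being small relative to x, although the 1x1 convolution that follows can rescale it.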
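There are also three functions that transform the data. I am trying to test whether the model converges if I feed it just a single input and let it train for a while: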
def train():
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    net = Model().to(device)
    net.train()
    optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
    criterion = nn.MSELoss().to(device)
    # A single fixed random image serves as both input and target
    img = torch.randn(1, 3, 128, 128).to(device)
    for epoch in range(1000):
        optimizer.zero_grad()
        prediction = net(img)
        loss = criterion(prediction, img)
        loss.backward()
        optimizer.step()
        if epoch % 10 == 0:
            print(loss.item())
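The model has roughly 12 million parameters and, unfortunately, for this single input the loss always converges to about 0.5. Does this indicate a problem with my architecture? What could be the reason?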
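One thing that may be worth checking, as a guess rather than a diagnosis since Decoder is not shown: the target comes from torch.randn, so about half of its values are negative. If the decoder's last layer restricts the output range (for example a final ReLU or sigmoid, which is an assumption here), the negative targets can never be matched and the MSE has a nonzero floor. For a standard normal target and non-negative outputs that floor is E[min(t, 0)^2] = 0.5. The sketch below estimates such floors in plain Python. best_mse is a hypothetical helper, and random.gauss stands in for torch.randn.

```python
import random

random.seed(0)  # fixed seed so the estimate is repeatable

# Stand-in for the torch.randn target: samples from a standard normal.
targets = [random.gauss(0.0, 1.0) for _ in range(200_000)]


def best_mse(values, low=None, high=None):
    """Smallest MSE reachable when every output is limited to [low, high]."""
    total = 0.0
    for t in values:
        best = t  # an unconstrained output can match the target exactly
        if low is not None:
            best = max(best, low)
        if high is not None:
            best = min(best, high)
        total += (t - best) ** 2
    return total / len(values)


floor_unbounded = best_mse(targets)                # no range limit
floor_nonneg = best_mse(targets, low=0.0)          # e.g. a final ReLU (assumed)
floor_unit = best_mse(targets, low=0.0, high=1.0)  # e.g. a final sigmoid (assumed)
print(floor_unbounded, floor_nonneg, floor_unit)
```

The non-negative case comes out close to 0.5 and the [0, 1] case slightly higher, while the unconstrained case is 0. If the decoder does turn out to be range-limited, testing with a target inside that range, such as one drawn with torch.rand, would separate this effect from an actual architecture problem.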