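How does OpenPose implement its loss function when the output shape and the ground truth do not match?
I have recently been implementing a model based on OpenPose. OpenPose uses VGG as its backbone to extract feature maps, but VGG contains max-pooling layers, which shrink the output's height and width (to 1/8 of the input with the three stride-2 pooling layers shown below). Here is the structure of the OpenPose model: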
VGGOpenPose(
(model0): OpenPose_Feature(
(model): Sequential(
(0): Conv2d(3,64,kernel_size=(3,3),stride=(1,1),padding=(1,1))
(1): ReLU(inplace=True)
(2): Conv2d(64,1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2,stride=2,padding=0,dilation=1,ceil_mode=False)
(5): Conv2d(64,128,1))
(6): ReLU(inplace=True)
(7): Conv2d(128,1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2,ceil_mode=False)
(10): Conv2d(128,256,1))
(11): ReLU(inplace=True)
(12): Conv2d(256,1))
(13): ReLU(inplace=True)
(14): Conv2d(256,1))
(15): ReLU(inplace=True)
(16): Conv2d(256,1))
(17): ReLU(inplace=True)
(18): MaxPool2d(kernel_size=2,ceil_mode=False)
(19): Conv2d(256,512,1))
(20): ReLU(inplace=True)
(21): Conv2d(512,1))
(22): ReLU(inplace=True)
(23): Conv2d(512,1))
(24): ReLU(inplace=True)
(25): Conv2d(256,1))
(26): ReLU(inplace=True)
)
)
(model1_1): Sequential(
(0): Conv2d(128,1))
(1): ReLU(inplace=True)
(2): Conv2d(128,1))
(3): ReLU(inplace=True)
(4): Conv2d(128,1))
(5): ReLU(inplace=True)
(6): Conv2d(128,kernel_size=(1,1))
(7): ReLU(inplace=True)
(8): Conv2d(512,38,1))
)
(model2_1): Sequential(
(0): Conv2d(185,kernel_size=(7,7),padding=(3,3))
(1): ReLU(inplace=True)
(2): Conv2d(128,3))
(3): ReLU(inplace=True)
(4): Conv2d(128,3))
(5): ReLU(inplace=True)
(6): Conv2d(128,3))
(7): ReLU(inplace=True)
(8): Conv2d(128,3))
(9): ReLU(inplace=True)
(10): Conv2d(128,1))
(11): ReLU(inplace=True)
(12): Conv2d(128,1))
)
(model3_1): Sequential(
(0): Conv2d(185,1))
)
(model4_1): Sequential(
(0): Conv2d(185,1))
)
(model5_1): Sequential(
(0): Conv2d(185,1))
)
(model6_1): Sequential(
(0): Conv2d(185,1))
)
(model1_2): Sequential(
(0): Conv2d(128,19,1))
)
(model2_2): Sequential(
(0): Conv2d(185,1))
)
(model3_2): Sequential(
(0): Conv2d(185,1))
)
(model4_2): Sequential(
(0): Conv2d(185,1))
)
(model5_2): Sequential(
(0): Conv2d(185,1))
)
(model6_2): Sequential(
(0): Conv2d(185,1))
)
)
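The amount of downsampling can be read off the pooling layers in the printout above. The following is a small sketch of that arithmetic, not part of the original post: the helper names `pooled_size` and `backbone_output_size` and the 368x368 input size are illustrative assumptions, and it assumes every convolution in the backbone preserves height and width the way the first one (kernel 3, stride 1, padding 1) does.

```python
def pooled_size(size, kernel=2, stride=2, padding=0):
    """Output length of one MaxPool2d along one axis (dilation=1, ceil_mode=False)."""
    return (size + 2 * padding - kernel) // stride + 1


def backbone_output_size(height, width, num_pools=3):
    """Spatial size after the backbone, assuming only the pooling layers resize."""
    for _ in range(num_pools):
        height = pooled_size(height)
        width = pooled_size(width)
    return height, width


# Hypothetical 368x368 input: 368 -> 184 -> 92 -> 46 along each axis.
print(backbone_output_size(368, 368))  # (46, 46)
```

Under these assumptions the feature map is 1/8 of the input along each axis, so every later stage predicts on a 46x46 grid for a 368x368 image.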
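The original paper (OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields) says that the ground-truth heatmaps and PAFs have the same width and height as the input image.
I looked at several Python implementations of OpenPose. Most of them use an element-wise loss to compute the loss between the output and the ground-truth labels, the same as the loss function given in the paper.
What I want to know is this: if the output of OpenPose has a different size from the input image, how does OpenPose compute the loss between the output and the ground-truth heatmaps/PAFs?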
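For reference, the per-stage loss in the paper has the following form, written out here from the paper's definitions since the formula did not survive in this post. $\mathbf{S}_j^t$ and $\mathbf{L}_c^t$ are the predicted confidence map for part $j$ and PAF for limb $c$ at stage $t$, starred symbols are the ground truth, and $\mathbf{W}$ is a binary mask that is zero where annotations are missing.

```latex
\begin{align}
f_{\mathbf{S}}^{t} &= \sum_{j=1}^{J} \sum_{\mathbf{p}} \mathbf{W}(\mathbf{p}) \cdot
    \left\| \mathbf{S}_{j}^{t}(\mathbf{p}) - \mathbf{S}_{j}^{*}(\mathbf{p}) \right\|_{2}^{2} \\
f_{\mathbf{L}}^{t} &= \sum_{c=1}^{C} \sum_{\mathbf{p}} \mathbf{W}(\mathbf{p}) \cdot
    \left\| \mathbf{L}_{c}^{t}(\mathbf{p}) - \mathbf{L}_{c}^{*}(\mathbf{p}) \right\|_{2}^{2} \\
f &= \sum_{t=1}^{T} \left( f_{\mathbf{S}}^{t} + f_{\mathbf{L}}^{t} \right)
\end{align}
```

Both sums run over image locations $\mathbf{p}$, so the prediction and the ground truth must be defined on the same grid for the subtraction to make sense.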
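One way the shapes can be made to agree is to build the ground-truth maps directly on the network's output grid, so an element-wise loss applies without resizing. The sketch below illustrates that idea only; it is not taken from any particular implementation. The names `gaussian_heatmap` and `masked_l2_loss`, the stride of 8, `sigma=7.0`, the 368x368 input and the keypoint position are all assumptions made for the example.

```python
import numpy as np


def gaussian_heatmap(keypoint_xy, image_hw, stride=8, sigma=7.0):
    """Ground-truth confidence map for one keypoint, sampled on the output grid."""
    h, w = image_hw[0] // stride, image_hw[1] // stride
    # Centre of each output cell, expressed in input-image pixel coordinates.
    ys = np.arange(h) * stride + stride / 2 - 0.5
    xs = np.arange(w) * stride + stride / 2 - 0.5
    xx, yy = np.meshgrid(xs, ys)
    dist2 = (xx - keypoint_xy[0]) ** 2 + (yy - keypoint_xy[1]) ** 2
    return np.exp(-dist2 / sigma ** 2)


def masked_l2_loss(pred, target, mask):
    """Sum over channels and locations of mask * squared difference."""
    assert pred.shape == target.shape, "prediction and label must share a grid"
    return float(np.sum(mask * (pred - target) ** 2))


# Hypothetical example: one keypoint at (x=100, y=200) in a 368x368 image.
gt = np.stack([gaussian_heatmap((100, 200), (368, 368))])  # shape (1, 46, 46)
pred = np.zeros_like(gt)          # stand-in for the network output
mask = np.ones(gt.shape[1:])      # no unlabeled regions in this toy case
loss = masked_l2_loss(pred, gt, mask)
print(gt.shape, loss > 0)
```

Resizing the prediction up to the label's resolution before the subtraction is an equally plausible alternative; which one a given implementation uses has to be checked in its data-loading code.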