How are TensorFlow's TripletSemiHardLoss and TripletHardLoss implemented, and how can they be used with a Siamese Network?

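As far as I know, Triplet Loss is a loss function that decreases the distance between the anchor and the positive while increasing the distance between the anchor and the negative. A margin is also added.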

For example, let's assume a Siamese Network that produces these embeddings:

anchor_output = [1,2,3,4,5...] # embedding given by the CNN model
positive_output = [1,4...]
negative_output= [53,43,33,23,13...]

And I think I can compute the triplet loss like this (I think I would have to use a Lambda layer or similar to turn it into a loss):

# calculate triplet loss
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output),1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output),1)

loss = tf.maximum(0.,margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)

So what exactly are tfa.losses.TripletHardLoss and tfa.losses.TripletSemiHardLoss?

As far as I know, semi-hard and hard are types of data generation (triplet mining) techniques for Siamese networks that push the model to learn more.

My thinking: based on what I learned in this post, I think you can:

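Generate a batch of 3 images and form every 3-way combination of them, which gives 27 candidate triplets
Discard every invalid triplet (all of i, j, k should be unique). Call the remaining batch B
Get the embeddings for each triplet in batch B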
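The enumeration in the steps above can be sketched in plain Python. This only illustrates the idea from the question; it is not how tfa mines triplets. The helper name valid_triplets and the label list are hypothetical, and the extra label check (anchor and positive share a label, the negative does not) is an assumption that goes beyond the uniqueness rule stated above.

```python
from itertools import product

def valid_triplets(labels):
    """Return every valid (anchor, positive, negative) index triplet.

    A triplet (i, j, k) is kept when the three indices are distinct,
    labels[i] == labels[j] and labels[i] != labels[k].
    """
    n = len(labels)
    triplets = []
    for i, j, k in product(range(n), repeat=3):  # n**3 candidates
        if len({i, j, k}) < 3:
            continue  # indices must be unique
        if labels[i] == labels[j] and labels[i] != labels[k]:
            triplets.append((i, j, k))
    return triplets

labels = [0, 0, 1]  # hypothetical batch of 3 images from two classes
candidates = len(labels) ** 3
triplets = valid_triplets(labels)
print(candidates)  # 27
print(triplets)    # [(0, 1, 2), (1, 0, 2)]
```

Out of the 27 candidates, only two survive for this tiny batch, which is why explicit enumeration is rarely done by hand and the mining is left to the loss.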

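So I think TripletHardLoss only considers, per batch, the 3 images that give the largest anchor-positive distance and the smallest anchor-negative distance.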

For semi-hard, I think it discards all losses computed for image pairs whose distance is 0.

If that is wrong, please correct me and tell me how to use these losses. (I know we can pass them to model.compile(), but my question is a different one.)

Solution

What is `TripletHardLoss`?

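This loss follows the ordinary TripletLoss form, but when computing the loss it uses the maximum positive distance and the minimum negative distance within the batch, plus the margin constant, as we can see in the formula:

\mathcal{L} = \max\left(\max_{p} d(a, p) - \min_{n} d(a, n) + \text{margin},\ 0\right)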

Looking at the source code of tfa.losses.TripletHardLoss, we can see that it implements the formula above exactly:

# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels,tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)

adjacency_not = tf.cast(adjacency_not,dtype=tf.dtypes.float32)
# hard negatives: smallest D_an.
hard_negatives = _masked_minimum(pdist_matrix,adjacency_not)

batch_size = tf.size(labels)

adjacency = tf.cast(adjacency,dtype=tf.dtypes.float32)

mask_positives = tf.cast(adjacency,dtype=tf.dtypes.float32) - tf.linalg.diag(
    tf.ones([batch_size])
)

# hard positives: largest D_ap.
hard_positives = _masked_maximum(pdist_matrix,mask_positives)

if soft:
    triplet_loss = tf.math.log1p(tf.math.exp(hard_positives - hard_negatives))
else:
    triplet_loss = tf.maximum(hard_positives - hard_negatives + margin,0.0)

# Get final mean triplet loss
triplet_loss = tf.reduce_mean(triplet_loss)

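Note that the soft parameter of tfa.losses.TripletHardLoss does not switch the computation to the ordinary TripletLoss:

\mathcal{L} = \max\left(d(a, p) - d(a, n) + \text{margin},\ 0\right)

As we can see in the source code above, the loss still uses the maximum positive distance and the minimum negative distance. soft only decides whether the soft margin form log(1 + exp(D_ap - D_an)) is used instead of the hinge with a margin constant.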
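To make the batch-hard selection concrete, here is a minimal pure-Python sketch of the same computation on a tiny batch. It is an illustration under stated assumptions, not the tfa implementation: the function name batch_hard_triplet_loss and the 1-D toy embeddings are hypothetical, and every anchor is assumed to have at least one positive and one negative in the batch.

```python
import math

def batch_hard_triplet_loss(pdist, labels, margin=1.0, soft=False):
    """Mean per-anchor loss using the hardest positive and hardest negative.

    pdist is a square list of lists of pairwise distances; labels holds one
    integer class id per row. Assumes each anchor has a positive and a negative.
    """
    n = len(labels)
    losses = []
    for a in range(n):
        positives = [pdist[a][j] for j in range(n)
                     if j != a and labels[j] == labels[a]]
        negatives = [pdist[a][j] for j in range(n) if labels[j] != labels[a]]
        hard_positive = max(positives)  # largest D_ap
        hard_negative = min(negatives)  # smallest D_an
        if soft:
            losses.append(math.log1p(math.exp(hard_positive - hard_negative)))
        else:
            losses.append(max(hard_positive - hard_negative + margin, 0.0))
    return sum(losses) / n

# Hypothetical 1-D embeddings, so each distance is an absolute difference.
x = [0.0, 1.0, 3.0, 7.0]
labels = [0, 0, 1, 1]
pdist = [[abs(a - b) for b in x] for a in x]

print(batch_hard_triplet_loss(pdist, labels))  # 0.75
```

Only the third sample contributes here: its hardest positive is at distance 4 and its hardest negative at distance 2, giving 4 - 2 + 1 = 3, and the mean over four anchors is 0.75.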

What is `TripletSemiHardLoss`?

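This loss also follows the ordinary TripletLoss form. The positive distance is the same as in the ordinary TripletLoss, while the negative distance uses the semi-hard negative:

The minimum negative distance among those that are at least greater than the positive distance plus the margin constant; if no such negative exists, the largest negative distance is used instead.

That is, we first look for negative distances that satisfy the condition. In the source code below, the comparison itself is D_an > D_ap, and the margin is added afterwards when the loss is formed.

Here p is the positive and n is the negative. If we cannot find a negative distance that satisfies the condition, we use the largest negative distance instead.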

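As we can see in the source code of tfa.losses.TripletSemiHardLoss, the conditional procedure above is implemented clearly: negatives_outside holds the distances that satisfy the condition, and negatives_inside is the largest negative distance: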

# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels,tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)

batch_size = tf.size(labels)

# Compute the mask.
pdist_matrix_tile = tf.tile(pdist_matrix, [batch_size, 1])
mask = tf.math.logical_and(
    tf.tile(adjacency_not, [batch_size, 1]),
    tf.math.greater(
        pdist_matrix_tile, tf.reshape(tf.transpose(pdist_matrix), [-1, 1])
    ),
)
mask_final = tf.reshape(
    tf.math.greater(
        tf.math.reduce_sum(
            tf.cast(mask, dtype=tf.dtypes.float32), 1, keepdims=True
        ),
        0.0,
    ),
    [batch_size, batch_size],
)
mask_final = tf.transpose(mask_final)

adjacency_not = tf.cast(adjacency_not,dtype=tf.dtypes.float32)
mask = tf.cast(mask,dtype=tf.dtypes.float32)

# negatives_outside: smallest D_an where D_an > D_ap.
negatives_outside = tf.reshape(
    _masked_minimum(pdist_matrix_tile, mask), [batch_size, batch_size]
)
negatives_outside = tf.transpose(negatives_outside)

# negatives_inside: largest D_an.
negatives_inside = tf.tile(
    _masked_maximum(pdist_matrix,adjacency_not),[1,batch_size]
)
semi_hard_negatives = tf.where(mask_final,negatives_outside,negatives_inside)

loss_mat = tf.math.add(margin,pdist_matrix - semi_hard_negatives)

mask_positives = tf.cast(adjacency,dtype=tf.dtypes.float32) - tf.linalg.diag(
    tf.ones([batch_size])
)

# In lifted-struct,the authors multiply 0.5 for upper triangular
#   in semihard,they take all positive pairs except the diagonal.
num_positives = tf.math.reduce_sum(mask_positives)

triplet_loss = tf.math.truediv(
    tf.math.reduce_sum(
        tf.math.maximum(tf.math.multiply(loss_mat,mask_positives),0.0)
    ),num_positives,)
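The tiling and masking above can be hard to follow, so here is a minimal pure-Python sketch of the same selection rule written with explicit loops. It is an illustration under stated assumptions rather than the tfa code: the name semi_hard_triplet_loss and the 1-D toy embeddings are hypothetical, and the batch is assumed to contain at least one positive pair and at least one negative for every anchor.

```python
def semi_hard_triplet_loss(pdist, labels, margin=1.0):
    """Average hinge loss over all positive pairs, using semi-hard negatives.

    For each positive pair (a, p): take the smallest D_an with D_an > D_ap;
    if no negative qualifies, fall back to the largest D_an.
    """
    n = len(labels)
    total = 0.0
    num_positives = 0
    for a in range(n):
        negatives = [pdist[a][j] for j in range(n) if labels[j] != labels[a]]
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # keep positive pairs only, excluding the diagonal
            d_ap = pdist[a][p]
            outside = [d for d in negatives if d > d_ap]  # negatives_outside
            d_an = min(outside) if outside else max(negatives)  # else negatives_inside
            total += max(margin + d_ap - d_an, 0.0)
            num_positives += 1
    return total / num_positives

# Hypothetical 1-D embeddings, so each distance is an absolute difference.
x = [0.0, 1.0, 3.0, 7.0]
labels = [0, 0, 1, 1]
pdist = [[abs(a - b) for b in x] for a in x]

print(semi_hard_triplet_loss(pdist, labels))  # 0.5
```

For the pair (anchor 2, positive 3) no negative lies farther than D_ap = 4, so the fallback picks the largest negative distance, 3, and the pair contributes 1 + 4 - 3 = 2. The other three positive pairs contribute 0, giving 2 / 4 = 0.5.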

How do I use these losses?

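Both losses expect y_true to be provided as a 1-D integer Tensor of shape [batch_size] containing multiclass integer labels. The embeddings y_pred must be a 2-D float Tensor of L2-normalized embedding vectors.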
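As a small illustration of what "L2-normalized" means for y_pred, the sketch below rescales each embedding row to unit length in plain Python. The helper l2_normalize and the toy values are hypothetical. In a real model the same effect is usually obtained inside the network, for example by applying tf.math.l2_normalize along the embedding axis in the last layer.

```python
import math

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against zero norm)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

# Hypothetical batch: one embedding row and one integer label per sample.
embeddings = [[3.0, 4.0], [0.0, 2.0]]   # shape [batch_size, embedding_dim]
labels = [7, 2]                         # shape [batch_size]

normalized = [l2_normalize(row) for row in embeddings]
norms = [math.sqrt(sum(v * v for v in row)) for row in normalized]
print(normalized)
print(norms)
```

After normalization every row has length 1, so pairwise distances between embeddings stay in a bounded range.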

Example code for preparing the inputs and labels:

import tensorflow as tf
import tensorflow_addons as tfa
import tensorflow_datasets as tfds

def _normalize_img(img,label):
    img = tf.cast(img,tf.float32) / 255.
    return (img,label)

train_dataset,test_dataset = tfds.load(name="mnist",split=['train','test'],as_supervised=True)

# Build your input pipelines
train_dataset = train_dataset.shuffle(1024).batch(16)
train_dataset = train_dataset.map(_normalize_img)

# Take one batch of data
for data in train_dataset.take(1):
    print("Batch of images shape:\n{}\nBatch of labels:\n{}\n".format(data[0].shape,data[1]))

Output:

Batch of images shape:
(16, 28, 28, 1)
Batch of labels:
[8 4 0 3 2 4 5 1 0 5 7 0 2 6 4 9]

If you run into problems when using these losses, refer to this source code.