How to build a Siamese network with triplet loss whose inputs are added/concatenated vectors from GRU + ResNet outputs

I already have a working Siamese network that takes 3 inputs: the anchor branch takes an image, the positive branch takes text, and the negative branch takes text. I pass them to my model and the network trains without trouble, as shown below:

def model(vocab_size,lr=0.0001):
    input_1 = Input(shape=(None,None,3)) # Anchor takes image 
    input_2 = Input(shape=(None,)) # Positive takes text
    input_3 = Input(shape=(None,)) # Negative takes text

    base_model = ResNet50(weights='imagenet',include_top=False) # class name as exported by keras.applications
    x1 = base_model(input_1)
    x1 = GlobalMaxPool2D()(x1)
    dense_1 = Dense(vec_dim,activation="linear",name="dense_image_1")
    x1 = dense_1(x1)


    embed = Embedding(vocab_size,50,name="embed")
    gru = Bidirectional(GRU(256,return_sequences=True),name="gru_1")
    dense_2 = Dense(vec_dim,name="dense_text_1")
    x2 = embed(input_2)
    x2 = SpatialDropout1D(0.1)(x2)
    x2 = gru(x2)
    x2 = GlobalMaxPool1D()(x2)
    x2 = dense_2(x2)

    x3 = embed(input_3)
    x3 = SpatialDropout1D(0.1)(x3)
    x3 = gru(x3)
    x3 = GlobalMaxPool1D()(x3)
    x3 = dense_2(x3)

    _norm = Lambda(lambda x: K.l2_normalize(x,axis=-1)) # normalize here

    x1 = _norm(x1)
    x2 = _norm(x2)
    x3 = _norm(x3)


    model = Model([input_1,input_2,input_3],[x1,x2,x3]) # Loss function handles the 3 outputs 
    model.compile(loss=triplet_loss,optimizer=Adam(lr)) # triplet_loss handles the multi output
    return model
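
The triplet_loss passed to compile() is not shown in the question. As a rough illustration of the quantity such a loss usually computes, here is a plain-Python sketch for a single triplet. The function name triplet_loss_sketch, the margin of 0.2, and the choice of squared Euclidean distance are all assumptions made for this example, not the asker's actual code:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, like the final Lambda layer in the model."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss_sketch(anchor, positive, negative, margin=0.2):
    """Hypothetical triplet loss for one triplet: max(d(a,p) - d(a,n) + margin, 0)."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

anchor = l2_normalize([1.0, 0.0])
positive = l2_normalize([1.0, 0.0])
negative = l2_normalize([0.0, 1.0])

# Positive is closer than negative by more than the margin: no penalty.
easy = triplet_loss_sketch(anchor, positive, negative)
# Roles swapped: the "positive" is now the far vector, so the loss is positive.
hard = triplet_loss_sketch(anchor, negative, positive)
```

A Keras version would apply the same formula batch-wise to the three normalized outputs x1, x2, x3.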

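Now I want to try a setup where the following happens:

One text and one image go to the GRU and the ResNet respectively, and each produces a vector. We concatenate/add these vectors and pass the resulting vector to the anchor of the Siamese network; the same happens for every branch.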
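
To make the difference between the two fusion options concrete, here is a tiny plain-Python sketch. The helper names and the toy vectors are made up for illustration; in the real model these would be the outputs of the ResNet and GRU branches. The point is only that concatenation doubles the vector length, while addition keeps it but requires both vectors to have the same length:

```python
def concatenate(image_vec, text_vec):
    """Fuse by concatenation: output length is the sum of both lengths."""
    return list(image_vec) + list(text_vec)

def add(image_vec, text_vec):
    """Fuse by element-wise addition: both vectors must have the same length."""
    if len(image_vec) != len(text_vec):
        raise ValueError("element-wise addition needs vectors of equal length")
    return [i + t for i, t in zip(image_vec, text_vec)]

image_vec = [0.5, -1.0]   # stand-in for the ResNet branch output
text_vec = [2.0, 0.25]    # stand-in for the GRU branch output

fused_concat = concatenate(image_vec, text_vec)  # 4 values
fused_add = add(image_vec, text_vec)             # 2 values
```

Whichever option is used, every layer that consumes the fused vector has to be sized for the resulting length.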

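Solution

I'm not sure whether this is needed, but if you want to do something like this:

...the "extra network" in front of the anchor can be built along the following lines: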

import tensorflow as tf
from tensorflow import keras
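from tensorflow.keras.layers import * # stay within tensorflow.keras; mixing it with standalone keras can break
from tensorflow.keras import backend as K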
from tensorflow.keras.applications.resnet50 import ResNet50
import numpy as np

#Just for demonstrating...
vocab_size=10
lr=0.0001
vec_dim=2

# Build an extra network that takes one image and one text as input.
# (It uses the same ResNet50 architecture as the image base; decide whether you want it to share the base model with the main network or use a separate one.)

# Example of one extra image and one extra text (placeholders only, not fed to the model below)
one_image=np.random.random((64,64,3))
one_text='experience in ai'

input_for_one_image=Input(shape=(None,None,3),name='extra_image') # underscores: spaces in layer names can be rejected by TensorFlow
input_for_one_text=Input(shape=(None,),name='extra_text')

base_model = ResNet50(weights='imagenet',include_top=False)
base_model_extra=keras.models.Model(inputs=base_model.input,outputs=base_model.output,name='resnet50_extra')

dense_1 = Dense(vec_dim,activation="linear",name="dense_image_1")
embed = Embedding(vocab_size,50,name="embed")
gru = Bidirectional(GRU(256,return_sequences=True),name="gru_1")
dense_2 = Dense(vec_dim,name="dense_text_1")

x1_extra = base_model_extra(input_for_one_image)
x1_extra = GlobalMaxPool2D()(x1_extra)
dense_1_extra = Dense(vec_dim,name="dense_image_1_extra")
x1_extra = dense_1_extra(x1_extra)

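embed_extra = Embedding(vocab_size,50,name="embed_extra") # output_dim is required; 50 matches "embed"
gru_extra = Bidirectional(GRU(256,return_sequences=True),name="gru_1_extra") # GlobalMaxPool1D needs the full sequence
dense_2_extra = Dense(vec_dim,name="dense_text_1_extra")
x2_extra = embed_extra(input_for_one_text)
x2_extra = SpatialDropout1D(0.1)(x2_extra)
x2_extra = gru_extra(x2_extra) # use the extra network's own layers, not the main network's
x2_extra = GlobalMaxPool1D()(x2_extra)
x2_extra = dense_2_extra(x2_extra)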

#Concatenate
temp=Concatenate()([x1_extra,x2_extra])

extra_network=keras.models.Model(inputs=[input_for_one_image,input_for_one_text],outputs=temp)

# Optional: draw the architecture (requires pydot and graphviz to be installed)
tf.keras.utils.plot_model(extra_network,to_file='extra.png')

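...but if you apply it, remember to adapt the anchor branch accordingly so that it can accept an anchor that is "strictly not in image format".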
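
As a rough sketch of what that adaptation could look like: the anchor input becomes a plain vector input sized for the fused vector, instead of an image input. Everything here is an assumption made for illustration: the names fused_dim, shared_head and triplet_model, the use of a single shared Dense layer as a stand-in for the real branches, and the choice to feed all three branches fused vectors (as the question describes):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

vec_dim = 2              # same demo value as above
fused_dim = 2 * vec_dim  # width of the Concatenate() output: image part + text part

# One projection layer reused by all three branches (shared siamese weights).
shared_head = Dense(vec_dim, name="shared_projection")
normalize = Lambda(lambda t: tf.math.l2_normalize(t, axis=-1), name="l2_norm")

# The anchor (and here also positive/negative) takes a fused vector, not an image.
anchor_in = Input(shape=(fused_dim,), name="anchor_fused")
positive_in = Input(shape=(fused_dim,), name="positive_fused")
negative_in = Input(shape=(fused_dim,), name="negative_fused")

outputs = [normalize(shared_head(x)) for x in (anchor_in, positive_in, negative_in)]
triplet_model = Model([anchor_in, positive_in, negative_in], outputs, name="fused_triplet")

# Random stand-ins for three batches of fused vectors (batch size 3).
batch = [np.random.random((3, fused_dim)).astype("float32") for _ in range(3)]
embeddings = triplet_model.predict(batch, verbose=0)
```

In a full model, the fused vectors would come from extra_network (one instance per branch, or one shared instance called three times) rather than from random arrays, and the model would be compiled with the triplet loss as in the question.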