How do I create a joint loss using paired dataset samples in the TensorFlow Keras API?

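I am trying to train an autoencoder with a constraint that forces one or more hidden/encoding nodes/neurons to take on interpretable values. My training approach uses paired images (although after training the model should operate on a single image) and a joint loss function that includes (1) the reconstruction loss for each image and (2) a term computed from the hidden/encoded vectors of the two images.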

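To make this clearer, I created a simple, analogous toy problem and model. In the toy problem, the autoencoder is given a vector of length 3 as input. The encoder uses one dense layer to compute the mean (a scalar) and another dense layer to compute some other representation of the vector (given my construction, it will probably just learn an identity matrix, i.e. copy the input vector). See the figure below. The bottom node of the hidden layer is meant to compute the mean of the input vector. The remaining hidden nodes are unconstrained, apart from having to support a reconstruction that matches the input.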

[Figure: toy model]
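For intuition about the mean neuron: the mean of a length-3 vector is itself a linear function of the input, so a single dense unit with no activation can represent it exactly. The pure-Python sketch below uses hypothetical hand-picked weights (1/3 each, zero bias); it only shows that such a solution exists, not what training would actually learn.

```python
def dense_unit(x, weights, bias=0.0):
    """Output of a single linear (no activation) dense unit for one input vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# Hypothetical hand-picked parameters: equal weights of 1/n and zero bias.
x = [1.0, 2.0, 6.0]
mean_weights = [1.0 / len(x)] * len(x)

encoded_mean = dense_unit(x, mean_weights)
true_mean = sum(x) / len(x)
print(encoded_mean, true_mean)
```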

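The figure below shows how I would like to train the model using paired images. "MSE" is mean squared error, although the exact choice of function is not important for the question I am asking here. The loss function is the sum of the reconstruction loss and the mean-estimation loss.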

[Figure: toy model training]
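Written out, the joint loss described above might look like the following. The symbols are assumptions introduced here for illustration: x_1 and x_2 are the paired inputs, \hat{x}_1 and \hat{x}_2 their reconstructions, \hat{\mu}_1 and \hat{\mu}_2 the outputs of the mean neuron, and \Delta\mu the known difference of the means. The 1/2 weights follow the loss_total function in the code further down.

```latex
\mathcal{L}
  = \tfrac{1}{2}\,\mathrm{MSE}(x_1, \hat{x}_1)
  + \tfrac{1}{2}\,\mathrm{MSE}(x_2, \hat{x}_2)
  + \mathrm{MSE}\bigl(\Delta\mu,\ \hat{\mu}_2 - \hat{\mu}_1\bigr)
```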

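I have tried to create (1) a tf.data.Dataset that generates paired vectors, (2) a Keras model, and (3) a custom loss function. However, I cannot work out how to do this correctly for this particular situation.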

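I cannot get Model.fit() to run correctly, nor can I associate the model outputs with the dataset targets as intended. See the code and error below. Can anyone help? I have searched Google and Stack Overflow extensively and still do not understand how to implement this.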

import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

DTYPE = tf.dtypes.float32
N_VEC = 3

def my_generator(n):
    while True:
        # Create two vectors of length n that are identical except for
        # their means. An internal layer (single neuron) of the model should
        # predict the mean of the input vector. To train it to do so with
        # paired vector inputs, use a loss function that penalizes incorrect
        # predictions of the difference of the means of the two input vectors.
        input_vec1 = tf.random.normal((n,), dtype=DTYPE)
        target_mean_diff = tf.random.normal((1,), dtype=DTYPE)
        input_vec2 = input_vec1 + target_mean_diff
        
        # Model is a constrained autoencoder. Output targets are
        # identical to the input vectors. They are included as explicit
        # targets in this generator for generality.
        target_vec1 = tf.identity(input_vec1)
        target_vec2 = tf.identity(input_vec2)

        yield (
            {'input_vec1': input_vec1, 'input_vec2': input_vec2},
            {'target_vec1': target_vec1, 'target_vec2': target_vec2,
             'target_mean_diff': target_mean_diff},
        )

def my_dataset(n,batch_size=4):
    # The signature mirrors the (inputs, targets) dict pair yielded by my_generator.
    ds = tf.data.Dataset.from_generator(
        my_generator,
        output_signature=(
            {'input_vec1': tf.TensorSpec(shape=(n,), dtype=DTYPE),
             'input_vec2': tf.TensorSpec(shape=(n,), dtype=DTYPE)},
            {'target_vec1': tf.TensorSpec(shape=(n,), dtype=DTYPE),
             'target_vec2': tf.TensorSpec(shape=(n,), dtype=DTYPE),
             'target_mean_diff': tf.TensorSpec(shape=(1,), dtype=DTYPE)},
        ),
        args=(n,),
    )
    ds = ds.batch(batch_size)    
    return ds


## Do a brief test using the Dataset
ds = my_dataset(N_VEC,batch_size=4)
ds_iter = iter(ds)
dict_inputs,dict_targets = next(ds_iter)
print(dict_inputs)
print(dict_targets)


## Define the Model
layer_encode_vec = tf.keras.layers.Dense(N_VEC,activation=None,name='encode_vec')
layer_decode_vec = tf.keras.layers.Dense(N_VEC,name='decode_vec')
layer_encode_mean = tf.keras.layers.Dense(1,name='encode_mean')
layer_decode_mean = tf.keras.layers.Dense(N_VEC,name='decode_mean')

input1 = tf.keras.Input(shape=(N_VEC,), name='input_vec1')
input2 = tf.keras.Input(shape=(N_VEC,), name='input_vec2')
vec_encoded1 = layer_encode_vec(input1)
vec_encoded2 = layer_encode_vec(input2)
mean_encoded1 = layer_encode_mean(input1)
mean_encoded2 = layer_encode_mean(input2)
mean_diff = mean_encoded2 - mean_encoded1
pred_vec1 = layer_decode_vec(vec_encoded1) + layer_decode_mean(mean_encoded1)
pred_vec2 = layer_decode_vec(vec_encoded2) + layer_decode_mean(mean_encoded2)

model = tf.keras.Model(inputs=[input1,input2],outputs=[pred_vec1,pred_vec2,mean_diff])

print(model.summary())


## Define the joint loss function
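def loss_total(y_true, y_pred):
    # Intended joint loss: the two reconstruction terms are averaged and
    # added to the mean-difference term.
    loss_reconstruct = tf.reduce_mean(tf.keras.losses.MSE(y_true[0], y_pred[0])) / 2 + \
                       tf.reduce_mean(tf.keras.losses.MSE(y_true[1], y_pred[1])) / 2
    loss_mean = tf.reduce_mean(tf.keras.losses.MSE(y_true[2], y_pred[2]))
    return loss_reconstruct + loss_mean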


## Compile model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer,loss=loss_total)


## Train model
history = model.fit(x=ds,epochs=10,steps_per_epoch=10)

Output: example batch from the dataset:

{'input_vec1': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.53022575, -0.02389329,  0.32843253],
       [-0.61793506, -0.8276422 , -1.3469328 ],
       [-0.5401968 ,  0.3141346 , -1.3638284 ],
       [-1.2189807 ,  0.23848908,  0.75108534]], dtype=float32)>,
 'input_vec2': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.23415083,  0.27218163,  0.6245074 ],
       [-0.57636774, -0.7860749 , -1.3053654 ],
       [ 0.65463066,  1.508962  , -0.16900098],
       [-0.49326736,  0.9642024 ,  1.4767987 ]], dtype=float32)>}
{'target_vec1': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=...>,
 'target_vec2': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=...>,
 'target_mean_diff': <tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[0.29607493],
       [0.04156734],
       [1.1948274 ],
       [0.7257133 ]], dtype=float32)>}

Output: model summary:

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_vec1 (InputLayer)         [(None,3)]          0                                            
__________________________________________________________________________________________________
input_vec2 (InputLayer)         [(None,3)]          0                                            
__________________________________________________________________________________________________
encode_vec (Dense)              (None,3)            12          input_vec1[0][0]                 
                                                                 input_vec2[0][0]                 
__________________________________________________________________________________________________
encode_mean (Dense)             (None,1)            4           input_vec1[0][0]                 
                                                                 input_vec2[0][0]                 
__________________________________________________________________________________________________
decode_vec (Dense)              (None,3)            12          encode_vec[0][0]                 
                                                                 encode_vec[1][0]                 
__________________________________________________________________________________________________
decode_mean (Dense)             (None,3)            6           encode_mean[0][0]                
                                                                 encode_mean[1][0]                
__________________________________________________________________________________________________
tf.__operators__.add (TFOpLambd (None,3)            0           decode_vec[0][0]                 
                                                                 decode_mean[0][0]                
__________________________________________________________________________________________________
tf.__operators__.add_1 (TFOpLam (None,3)            0           decode_vec[1][0]                 
                                                                 decode_mean[1][0]                
__________________________________________________________________________________________________
tf.math.subtract (TFOpLambda)   (None,1)            0           encode_mean[1][0]                
                                                                 encode_mean[0][0]                
==================================================================================================
Total params: 34
Trainable params: 34
Non-trainable params: 0
__________________________________________________________________________________________________

Output: error message when calling model.fit():

Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

...

ValueError: Found unexpected keys that do not correspond to any
Model output: dict_keys(['target_vec1','target_vec2','target_mean_diff']).
Expected: ['tf.__operators__.add','tf.__operators__.add_1','tf.math.subtract']

Solution

You can pass a dict to Model for both inputs and outputs, like this:

model = tf.keras.Model(
    inputs={"input_vec1": input1, "input_vec2": input2},
    outputs={
        "target_vec1": pred_vec1,
        "target_vec2": pred_vec2,
        "target_mean_diff": mean_diff,
    },
)

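This avoids having to name the output layers, because the output keys now match the target keys produced by the dataset.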
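To illustrate how the keys line up, the sketch below builds a reduced version of the model (only the mean-difference branch, with made-up input values) and checks that calling it returns a dict keyed by the names given in outputs. It assumes a TensorFlow 2.x installation in which tf.keras.Model accepts dicts for inputs and outputs; it is not the full model from the question.

```python
import tensorflow as tf

N_VEC = 3

# Reduced model: one shared dense layer applied to both inputs.
layer_encode_mean = tf.keras.layers.Dense(1, name="encode_mean")
input1 = tf.keras.Input(shape=(N_VEC,), name="input_vec1")
input2 = tf.keras.Input(shape=(N_VEC,), name="input_vec2")
mean_diff = layer_encode_mean(input2) - layer_encode_mean(input1)

model = tf.keras.Model(
    inputs={"input_vec1": input1, "input_vec2": input2},
    outputs={"target_mean_diff": mean_diff},
)

# Made-up batch of 4 paired vectors.
batch = {
    "input_vec1": tf.zeros((4, N_VEC)),
    "input_vec2": tf.ones((4, N_VEC)),
}
outputs = model(batch)
print(sorted(outputs.keys()))
```

Because these keys are the same as the target keys yielded by the dataset, Keras can pair each prediction with its target by name.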

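As for the loss, Keras currently applies loss_total to each of the 3 outputs separately and sums the results to get the final loss, which is not what you want. Instead, you can specify each loss separately: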

model.compile(
    optimizer=optimizer,
    loss={"target_vec1": "mse", "target_vec2": "mse", "target_mean_diff": "mse"},
    loss_weights={"target_vec1": 0.5, "target_vec2": 0.5, "target_mean_diff": 1},
)
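As a sanity check on the arithmetic, the weighted sum of per-output MSE values should equal the joint loss from the question. The pure-Python sketch below uses made-up targets and predictions and a hypothetical mse helper that averages over all elements of a batch; it stands in for the Keras computation and does not call Keras.

```python
def mse(y_true, y_pred):
    """Mean squared error over every element of a batch of vectors (nested lists)."""
    squared = [
        (t - p) ** 2
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    ]
    return sum(squared) / len(squared)


# Made-up batch of 2 samples; the values are for illustration only.
y_true = {
    "target_vec1": [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]],
    "target_vec2": [[2.0, 3.0, 4.0], [1.0, 1.0, 1.0]],
    "target_mean_diff": [[1.0], [1.0]],
}
y_pred = {
    "target_vec1": [[1.0, 2.0, 2.0], [0.0, 1.0, 0.0]],
    "target_vec2": [[2.0, 3.0, 4.0], [1.0, 1.0, 3.0]],
    "target_mean_diff": [[0.5], [1.0]],
}
loss_weights = {"target_vec1": 0.5, "target_vec2": 0.5, "target_mean_diff": 1.0}

# Per-output losses combined with loss_weights, as in the compile() call.
per_output = {key: mse(y_true[key], y_pred[key]) for key in y_true}
weighted_total = sum(loss_weights[key] * per_output[key] for key in per_output)

# Joint loss written as a single expression, as in loss_total.
joint_total = (
    mse(y_true["target_vec1"], y_pred["target_vec1"]) / 2
    + mse(y_true["target_vec2"], y_pred["target_vec2"]) / 2
    + mse(y_true["target_mean_diff"], y_pred["target_mean_diff"])
)
print(weighted_total, joint_total)
```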

Alternatively, you can train the model manually with a modified loss function that takes dict inputs. Something like:

def loss_total(y_true,y_pred):
    loss_reconstruct = (
        tf.reduce_mean(tf.keras.losses.MSE(y_true["target_vec1"],y_pred["target_vec1"])) / 2
        + tf.reduce_mean(tf.keras.losses.MSE(y_true["target_vec2"],y_pred["target_vec2"])) / 2
    )
    loss_mean = tf.reduce_mean(tf.keras.losses.MSE(y_true["target_mean_diff"],y_pred["target_mean_diff"]))
    return loss_reconstruct + loss_mean

for epoch in range(10):
    for batch,(x,y) in zip(range(10),ds):
        with tf.GradientTape() as tape:
            outputs = model(x,training=True)
            loss = loss_total(y,outputs)

        trainable_vars = model.trainable_variables
        gradients = tape.gradient(loss,trainable_vars)
        optimizer.apply_gradients(zip(gradients,trainable_vars))
        print(f"Batch: {batch},loss: {loss.numpy()}")
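Since the question notes that the trained model should operate on single inputs, one possible way to reuse the shared layers afterwards is to wrap them in a separate single-input model. The sketch below is self-contained and untrained; single_input, encoder, and paired_model are hypothetical names, and the dense layer stands in for the encode_mean layer from the question.

```python
import tensorflow as tf

N_VEC = 3

# Shared layer, standing in for the trained encode_mean layer.
layer_encode_mean = tf.keras.layers.Dense(1, name="encode_mean")

# Paired-input graph used for training (only the mean-difference branch is shown).
input1 = tf.keras.Input(shape=(N_VEC,), name="input_vec1")
input2 = tf.keras.Input(shape=(N_VEC,), name="input_vec2")
mean_diff = layer_encode_mean(input2) - layer_encode_mean(input1)
paired_model = tf.keras.Model(
    inputs={"input_vec1": input1, "input_vec2": input2},
    outputs={"target_mean_diff": mean_diff},
)

# Single-input model that reuses the same layer object, and so the same weights.
single_input = tf.keras.Input(shape=(N_VEC,), name="single_vec")
encoder = tf.keras.Model(inputs=single_input, outputs=layer_encode_mean(single_input))

# Both routes should give the same predicted mean difference.
x1 = tf.random.normal((4, N_VEC))
x2 = x1 + 0.5
diff_from_pair = paired_model({"input_vec1": x1, "input_vec2": x2})["target_mean_diff"]
diff_from_encoder = encoder(x2) - encoder(x1)
max_abs_gap = float(tf.reduce_max(tf.abs(diff_from_pair - diff_from_encoder)))
print(max_abs_gap)
```

Because the layer object is shared, any weights learned through the paired model are the ones the single-input encoder uses.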
