
Logits and labels dimension error after inserting a layer into a Keras model


I built the following very simple model:

import tensorflow as tf

inp = tf.keras.layers.Input((32, 32, 3))
x = tf.keras.layers.Conv2D(filters=1, kernel_size=3, strides=2, padding='same')(inp)
x = tf.nn.relu(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(units=10, activation='linear')(x)
outp = x
model = tf.keras.models.Model(inp, outp)

I want to insert a dropout layer after the ReLU layer, so I followed the approach described in the answer to this post.

Here is my code:

import re
from tensorflow.keras.models import Model, load_model

def insert_layer_nonseq(model, layer_regex, insert_layer_factory,
                        insert_layer_name=None, position='after'):

    # Auxiliary dictionary to describe the network graph
    network_dict = {'input_layers_of': {}, 'new_output_tensor_of': {}}

    # Set the input layers of each layer
    for layer in model.layers:
        for node in layer._outbound_nodes:
            layer_name = node.outbound_layer.name
            if layer_name not in network_dict['input_layers_of']:
                network_dict['input_layers_of'].update(
                        {layer_name: [layer.name]})
            else:
                network_dict['input_layers_of'][layer_name].append(layer.name)

    # Set the output tensor of the input layer
    network_dict['new_output_tensor_of'].update(
            {model.layers[0].name: model.input})

    # Iterate over all layers after the input
    model_outputs = []
    count=0
    for layer in model.layers[1:]:
        count+=1

        # Determine input tensors
        layer_input = [network_dict['new_output_tensor_of'][layer_aux] 
                for layer_aux in network_dict['input_layers_of'][layer.name]]
        if len(layer_input) == 1:
            layer_input = layer_input[0]

        # Insert layer if name matches the regular expression
        if re.match(layer_regex, layer.name):
            if position == 'replace':
                x = layer_input
            elif position == 'after':
                x = layer(layer_input)
            elif position == 'before':
                # Feed the new layer from this layer's inputs; the matched
                # layer itself is applied after the new layer below.
                x = layer_input
            else:
                raise ValueError('position must be: before, after or replace')

            new_layer = insert_layer_factory()
            x = new_layer(x)
            print('New layer: {} Old layer: {} Type: {}'.format(new_layer.name, layer.name, position))
            if position == 'before':
                x = layer(x)
        else:
            x = layer(layer_input)

        # Set new output tensor (the original one, or the one of the inserted
        # layer)
        network_dict['new_output_tensor_of'].update({layer.name: x})

        # Save tensor in output list if it is output in initial model
        if layer_name in model.output_names:
            model_outputs.append(x)

    return Model(inputs=model.inputs, outputs=model_outputs)



clone_model = tf.keras.models.clone_model(model)

def dropout_layer_factory():
    return tf.keras.layers.Dropout(rate=0.2, name='dropout')

nm = insert_layer_nonseq(clone_model, '.*relu.*', dropout_layer_factory)

# Fix possible problems with new model
nm.save('temp.h5')
nm = load_model('temp.h5')

Here is the summary of the resulting model (nm):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_19 (InputLayer)        [(None, 32, 32, 3)]       0
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 16, 16, 1)         28
_________________________________________________________________
tf.nn.relu_25 (TFOpLambda)   (None, 16, 16, 1)         0
_________________________________________________________________
dropout (Dropout)            (None, 16, 16, 1)         0
_________________________________________________________________
global_average_pooling2d_25  (None, 1)                 0
_________________________________________________________________
dense_25 (Dense)             (None, 10)                20
=================================================================
Total params: 48
Trainable params: 48
Non-trainable params: 0
_________________________________________________________________
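
As a quick sanity check on the summary, the spatial size after the convolution and the parameter counts can be recomputed by hand from the model definition. The sketch below is plain Python with hypothetical helper names (`conv2d_same_output`, `conv2d_params`, `dense_params`); it uses the standard formulas for a `padding='same'` convolution and for layers with a bias term, and does not call Keras.

```python
import math

def conv2d_same_output(size, stride):
    # With padding='same', the spatial size is ceil(input / stride).
    return math.ceil(size / stride)

def conv2d_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block per filter, plus one bias each.
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_features, units):
    # One weight per (input, unit) pair, plus one bias per unit.
    return in_features * units + units

side = conv2d_same_output(32, stride=2)
conv = conv2d_params(kernel=3, in_channels=3, filters=1)
dense = dense_params(in_features=1, units=10)
print(side, conv, dense, conv + dense)  # 16 28 20 48
```

These values agree with the 28, 20 and 48 reported in the summary, so the layers themselves appear to be wired with the expected shapes.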

To me, everything looks fine. However, when I try to train the model, I get the following error:

InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [8192,1] and labels shape [32]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-123-9e6a1b98c0a1>:7) ]] [Op:__inference_train_function_111449]
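
The shapes in this message can be read back against the model. The sketch below assumes that the sparse cross-entropy op views an N-dimensional prediction as a 2-D `[rows, classes]` matrix, collapsing every axis except the last, and that the batch size is 32 (taken from `labels shape [32]`). The helper `sparse_ce_logits_shape` is a hypothetical name used only for illustration.

```python
from math import prod

def sparse_ce_logits_shape(output_shape):
    """Collapse every axis except the last (class) axis into one row axis.

    Assumption: this mirrors how the loss op flattens its logits before
    comparing their first dimension with the labels.
    """
    *leading, classes = output_shape
    return [prod(leading), classes]

batch = 32  # from "labels shape [32]" in the error message

# Output shapes of the layers in the model above, with the batch size filled in.
layer_outputs = {
    "conv2d": (batch, 16, 16, 1),
    "relu": (batch, 16, 16, 1),
    "dropout": (batch, 16, 16, 1),
    "global_average_pooling2d": (batch, 1),
    "dense": (batch, 10),
}

for name, shape in layer_outputs.items():
    print(name, sparse_ce_logits_shape(shape))
```

Under these assumptions only the 4-D tensors flatten to `[8192, 1]` (32 × 16 × 16 = 8192), while the `Dense` output would give `[32, 10]`. That suggests the loss is being evaluated on an intermediate feature map rather than on the final layer.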

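Unlike most logits/labels errors, the loss function is not the problem here. When I train the original model with exactly the same code, it works fine. Somehow, inserting the dropout layer introduces a bug that makes the new model untrainable.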

Does anyone understand why this happens? Thanks!
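
One possible explanation, offered as a guess rather than a confirmed diagnosis: the output check near the end of `insert_layer_nonseq` reads `layer_name`, a variable that is only assigned in the first loop, instead of `layer.name`. After the first loop it still holds the name of the last outbound layer, which here is the `Dense` layer, so the check would pass for every layer and every intermediate tensor would be appended to `model_outputs`. The sketch below reproduces only that control flow in plain Python, using hypothetical shortened layer names and no TensorFlow.

```python
# Layer names in model.layers order, and the edges of the graph between them.
layers = ["input", "conv2d", "relu", "global_average_pooling2d", "dense"]
outbound = {
    "input": ["conv2d"],
    "conv2d": ["relu"],
    "relu": ["global_average_pooling2d"],
    "global_average_pooling2d": ["dense"],
    "dense": [],
}
output_names = ["dense"]  # stands in for model.output_names

# First loop of the function: `layer_name` is assigned here ...
for layer in layers:
    for layer_name in outbound[layer]:
        pass
# ... and keeps its last value after the loop ends.
print(layer_name)  # dense

def collect_outputs(use_stale_name):
    """Return the layers whose output would be appended to model_outputs."""
    collected = []
    for layer in layers[1:]:
        key = layer_name if use_stale_name else layer
        if key in output_names:
            collected.append(layer)
    return collected

print(collect_outputs(use_stale_name=True))   # all four layers after the input
print(collect_outputs(use_stale_name=False))  # ['dense']
```

If this is what happens, the new model would have several outputs with the `Conv2D` feature map first, which would match the `[8192,1]` logits in the error. Printing `nm.outputs` should confirm or rule this out, and changing the check to `if layer.name in model.output_names:` would be the corresponding fix.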
