
Keras NaN accuracy and loss after the first training step


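I have a classification task on time-series data. From the very first epoch, my training loss is either 0 or NaN and the accuracy is always NaN, even with a very small learning rate.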

My model:

import tensorflow as tf
from tensorflow import keras

# N_MEASURES and N_LEADS are defined elsewhere in the original script; the
# batch shape printed in the output, (4, 2500, 12), suggests 2500 and 12.


def FCN():
    """
    Keras fully convolutional model to predict lead inversion.

    Inspired by solution found here : https://github.com/Bsingstad/FYS-STK4155-oblig3
    """
    inputlayer = keras.layers.Input(shape=(N_MEASURES, N_LEADS))

    conv1 = keras.layers.Conv1D(filters=128, kernel_size=8,
                                input_shape=(N_MEASURES, N_LEADS),
                                padding='same')(inputlayer)
    # conv1 = keras.layers.BatchNormalization()(conv1)
    conv1 = keras.layers.Activation(activation='relu')(conv1)

    conv2 = keras.layers.Conv1D(filters=256, kernel_size=5, padding='same')(conv1)
    # conv2 = keras.layers.BatchNormalization()(conv2)
    conv2 = keras.layers.Activation('relu')(conv2)

    conv3 = keras.layers.Conv1D(128, kernel_size=3, padding='same')(conv2)
    # conv3 = keras.layers.BatchNormalization()(conv3)
    conv3 = keras.layers.Activation('relu')(conv3)

    # Average over the time axis: (batch, steps, 128) -> (batch, 128)
    gap_layer = keras.layers.GlobalAveragePooling1D()(conv3)

    # Single sigmoid unit, squeezed from (batch, 1) to (batch,)
    outputlayer = tf.squeeze(
        keras.layers.Dense(1, activation='sigmoid')(gap_layer), axis=-1)

    model = keras.Model(inputs=inputlayer, outputs=outputlayer)

    model.compile(
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
        optimizer=tf.keras.optimizers.Adam(
            learning_rate=0.0000000000000000000001, clipnorm=1),
        metrics=[
            tf.keras.metrics.BinaryAccuracy(name='accuracy', dtype=None, threshold=0.5),
        ])

    return model
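
To make the tensor shapes in FCN explicit, here is a minimal shape walk-through in plain Python. It is only a sketch, not Keras code: the helper names (`conv1d_same`, `global_average_pooling1d`, `dense`, `squeeze_last`, `fcn_output_shape`) are made up for illustration, and it assumes stride-1 convolutions with `padding='same'` and an input batch of shape (4, 2500, 12), as printed in the output. It also shows what the output shape would become without the pooling layer, which may be the shape problem reported when that layer is removed.

```python
def conv1d_same(shape, filters):
    """Stride-1 Conv1D with padding='same': the sequence length is preserved."""
    batch, steps, _channels = shape
    return (batch, steps, filters)


def global_average_pooling1d(shape):
    """Averaging over the time axis drops it: (batch, steps, ch) -> (batch, ch)."""
    batch, _steps, channels = shape
    return (batch, channels)


def dense(shape, units):
    """Dense acts on the last axis only, whatever the rank of its input."""
    return shape[:-1] + (units,)


def squeeze_last(shape):
    """Mirror of tf.squeeze(..., axis=-1): remove a trailing axis of size 1."""
    assert shape[-1] == 1
    return shape[:-1]


def fcn_output_shape(input_shape, use_pooling=True):
    """Trace the shape through the three conv layers, pooling, Dense(1), squeeze."""
    shape = input_shape
    for filters in (128, 256, 128):
        shape = conv1d_same(shape, filters)
    if use_pooling:
        shape = global_average_pooling1d(shape)
    return squeeze_last(dense(shape, 1))


print(fcn_output_shape((4, 2500, 12)))                     # (4,)
print(fcn_output_shape((4, 2500, 12), use_pooling=False))  # (4, 2500)
```

With pooling, the output has one value per sample and lines up with labels of shape (4,). Without it, there is one value per time step, which no longer matches the labels.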

Training loop:

train_data_gen = ECGDataGenerator(train_input[train][0:4],train_output[train][0:4],batch_size=4,shuffle=True)
val_data_gen = train_data_gen

model = FCN()
for i,(x,y) in enumerate(train_data_gen):
    if i > 0:
        break
    y_pred = model.predict(x)
    print(x.shape)
    print(y)
    print(y_pred)
    print(y_pred.shape)
    loss = model.loss(y,y_pred)
    print(loss)

model.fit(x=train_data_gen,
          epochs=2,
          steps_per_epoch=2,
          # steps_per_epoch=train_data_gen.n_batches,
          validation_data=val_data_gen,
          verbose=1,
          validation_freq=1,
          # callbacks=[reduce_lr, early_stop]
          )

for i,(x,y) in enumerate(train_data_gen):
    if i > 10:
        break
    y_pred = model.predict(x)
    print(x.shape)
    print(y)
    print(y_pred)
    print(y_pred.shape)
    loss = model.loss(y,y_pred)
    print(loss)

The output is:

(4,2500,12)
[0. 0. 0. 1.]
[0.50108045 0.5034382  0.4999477  0.5007813 ]
(4,)
tf.Tensor(0.6949963,shape=(),dtype=float32)
Epoch 1/2
2/2 [==============================] - 3s 794ms/step - loss: nan - accuracy: nan - val_loss: nan - val_accuracy: nan
Epoch 2/2
2/2 [==============================] - 0s 283ms/step - loss: 0.0000e+00 - accuracy: nan - val_loss: nan - val_accuracy: nan
(4,12)
[1. 0. 0. 1.]
[nan nan nan nan]
(4,)
tf.Tensor(nan,dtype=float32)

As you can see, after a single training step the training loss and accuracy are 0 or NaN, yet the loss computed manually before training is not NaN.
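
As a sanity check, the pre-training loss value can be reproduced by hand from the printed labels and predictions. The sketch below uses plain Python and the textbook binary cross-entropy formula; the function name `binary_crossentropy` is just an illustrative helper. Keras additionally clips predictions by a small epsilon, which is left out here and should not matter for probabilities this close to 0.5. The second call shows that once the predictions are NaN, the loss is necessarily NaN as well.

```python
import math


def binary_crossentropy(y_true, y_pred):
    """Mean binary cross-entropy over a batch of probabilities (no clipping)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Labels and predictions copied from the output printed before training.
y_true = [0.0, 0.0, 0.0, 1.0]
y_pred = [0.50108045, 0.5034382, 0.4999477, 0.5007813]
loss_before = binary_crossentropy(y_true, y_pred)
print(loss_before)  # close to the 0.6949963 reported by model.loss

# After training the model outputs NaN, and NaN propagates through the formula.
nan = float("nan")
loss_after = binary_crossentropy([1.0, 0.0, 0.0, 1.0], [nan] * 4)
print(math.isnan(loss_after))  # True
```

So the loss function itself behaves as expected on finite predictions; the NaN values first appear in the model outputs after the weight update.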

The batch size here is 4.

Things I have tried:

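  • Adding batch normalization does not help.
  • Removing GlobalAveragePooling1D fixes the NaN problem but causes a shape problem.
  • Lowering or raising the learning rate does not help either.
  • The inputs and outputs contain no NaN values.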
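
For reference, the "no NaN in the inputs and outputs" check from the last item could look roughly like the sketch below. It is written in plain Python so it runs without extra dependencies. The helper name `count_non_finite` and the toy batches are made up for illustration. On NumPy arrays the usual equivalent would be `np.isnan(x).any()` or `np.isfinite(x).all()`. Note that this version also counts infinite values, which the list above does not mention.

```python
import math


def count_non_finite(values):
    """Recursively count NaN and +/-Inf entries in nested lists or tuples of floats."""
    if isinstance(values, (list, tuple)):
        return sum(count_non_finite(v) for v in values)
    return 0 if math.isfinite(values) else 1


# Toy stand-ins for one (x, y) batch; the real x has shape (4, 2500, 12).
clean_x = [[[0.1, -0.2], [0.3, 0.0]], [[1.5, 2.5], [-3.0, 4.0]]]
clean_y = [0.0, 1.0]
dirty_x = [[[0.1, float("nan")], [0.3, 0.0]], [[float("inf"), 2.5], [-3.0, 4.0]]]

print(count_non_finite(clean_x), count_non_finite(clean_y))  # 0 0
print(count_non_finite(dirty_x))                             # 2
```

Running such a check on every batch the generator yields, rather than on the raw arrays only, would also cover anything the generator itself does to the data.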
