Accuracy improves over epochs but returns to the initial accuracy at evaluation

How to fix a model whose accuracy improves over the epochs but falls back to the initial accuracy at evaluation

I am working on a project to reproduce the results of a study using a neural network on an EEG dataset. Throughout the project I have repeatedly run into the same problem: the model shows some accuracy improvement across the epochs, but at evaluation the accuracy always returns to the initial value, specifically 1/NUM_CLASSES, where NUM_CLASSES is the number of classification categories. I am honestly stuck at this point; I suspected the model was overfitting and tried adjusting my data preprocessing to compensate, but with no luck.

The code is below:

# Filters out warnings
import warnings
warnings.filterwarnings("ignore")

# Imports; as of 3/10, all are necessary
import numpy as np
import tensorflow as tf
from keras import layers
from keras import backend as K
from keras.models import Model
from keras.optimizers import Adam, SGD
from keras.callbacks import Callback
from keras.layers import Conv3D, Input, Dense, Activation, BatchNormalization, Flatten, Add, Softmax
from sklearn.model_selection import StratifiedKFold

from DonghyunMBCNN import MultiBranchCNN

# Global Variables

# The directory of the processed data; it must have been converted and cropped, see dataProcessing.py and crop.py
DATA_DIR = "../datasets/BCICIV_2a_cropped/"
# Which trial subject will be trained
SUBJECT = 1

# The number of classification categories; for motor imagery, there are 4
NUM_CLASSES = 4
# The number of timesteps in each input array
TIMESTEPS = 240
# The X-Dimension of the dataset
XDIM = 7
# The Y-Dimension of the dataset
YDIM = 6
# The delta-loss requirement for lowering the learning rate
LOSS_THRESHOLD = 0.01
# Initial learning rate for the ADAM optimizer
INIT_LR = 0.01
# Defines which NLL (Negative Log Likelihood) loss function to use: either "NLL1", "NLL2", or "SCCE"
LOSS_FUNCTION = 'NLL2'
# Defines which optimizer is in use, either "ADAM" or "SGD"
OPTIMIZER = 'SGD'
# Whether training output should be given
VERBOSE = 1
# Determines whether K-Fold Cross Validation is used
USE_KFOLD = False
# Number of k-fold validation splits, must be at least 2
KFOLD_NUM = 2
# Specifies which model structure will be used: '1' corresponds to the Create_Model function and '2' corresponds to Donghyun's model.
USE_STRUCTURE = '2'

# Number of epochs to train for
EPOCHS = 10

# Receptive field sizes (3-tuples, as Conv3D requires)
SRF_SIZE = (2, 2, 1)
MRF_SIZE = (2, 2, 3)
LRF_SIZE = (2, 2, 5)

# Strides for each receptive field (also 3-tuples)
SRF_STRIDES = (2, 2, 1)
MRF_STRIDES = (2, 2, 2)
LRF_STRIDES = (2, 2, 4)

# This is meant to handle the reduction of the learning rate; the current version is not accurate,
# as I have been unable to access the loss information from each epoch.
# The expectation is that if the delta loss is < threshold, learning rate *= 0.1. The threshold has not been set yet.
class LearningRateReducerCb(Callback):
    def __init__(self):
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        # Record every logged metric so consecutive epoch losses can be compared
        for k, v in (logs or {}).items():
            self.history.setdefault(k, []).append(v)

        fin_index = len(self.history['loss']) - 1
        if fin_index >= 1:
            # Reduce the rate when the loss improved by less than the threshold
            if self.history['loss'][fin_index - 1] - self.history['loss'][fin_index] < LOSS_THRESHOLD:
                old_lr = self.model.optimizer.lr.read_value()
                new_lr = old_lr * 0.1
                print("\nEpoch: {}. Reducing Learning Rate from {} to {}".format(epoch, old_lr, new_lr))
                self.model.optimizer.lr.assign(new_lr)
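
# A sketch of an alternative (not part of the original script): Keras's built-in
# ReduceLROnPlateau callback implements the same rule as the class above,
# multiplying the learning rate by `factor` whenever the monitored loss fails
# to improve by at least `min_delta` for `patience` epochs.
from keras.callbacks import ReduceLROnPlateau
reduce_lr_cb = ReduceLROnPlateau(monitor='loss', factor=0.1, patience=1,
                                 min_delta=LOSS_THRESHOLD, verbose=1)
# Usage: MRF_model.fit(X, Y, epochs=EPOCHS, callbacks=[reduce_lr_cb])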

# The first Negative Log Likelihood function
def Loss_FN1(y_true, y_pred, sample_weight=None):
    return K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)  # This is another loss function that I tried; it was less effective

# Second NLL function, which generally seems to work better. It treats the first
# half of y_pred as Gaussian means and the second half as log standard deviations.
def Loss_FN2(y_true, y_pred, sample_weight=None):
    n_dims = int(int(y_pred.shape[1]) / 2)
    mu = y_pred[:, 0:n_dims]
    logsigma = y_pred[:, n_dims:]
    mse = -0.5 * K.sum(K.square((y_true - mu) / K.exp(logsigma)), axis=1)
    sigma_trace = -K.sum(logsigma, axis=1)
    log2pi = -0.5 * n_dims * np.log(2 * np.pi)
    log_likelihood = mse + sigma_trace + log2pi
    return K.mean(-log_likelihood)


# Loads the given data into two arrays, x and y, while also ensuring that all values are formatted as float32s
def load_data(data_dir, num):
    x = np.load(data_dir + "A0" + str(num) + "TD_cropped.npy").astype(np.float32)
    y = np.load(data_dir + "A0" + str(num) + "TK_cropped.npy").astype(np.float32)
    return x, y

# Builds one receptive-field branch: two Conv3D blocks, two dense blocks, and a softmax head
def create_receptive_field(size, strides, model, name):
    modelRF = Conv3D(kernel_size=size, strides=strides, filters=32, padding='same', name=name + '1')(model)
    modelRF1 = BatchNormalization()(modelRF)
    modelRF2 = Activation('elu')(modelRF1)

    modelRF3 = Conv3D(kernel_size=size, filters=64, name=name + '2')(modelRF2)
    modelRF4 = BatchNormalization()(modelRF3)
    modelRF5 = Activation('elu')(modelRF4)

    modelRF6 = Flatten()(modelRF5)

    modelRF7 = Dense(32)(modelRF6)
    modelRF8 = BatchNormalization()(modelRF7)
    modelRF9 = Activation('relu')(modelRF8)

    modelRF10 = Dense(32)(modelRF9)
    modelRF11 = BatchNormalization()(modelRF10)
    modelRF12 = Activation('relu')(modelRF11)
    return Dense(NUM_CLASSES, activation='softmax')(modelRF12)

def Create_Model():
    # Model Creation

    model1 = Input(shape=(1, XDIM, YDIM, TIMESTEPS))

    # 1st Convolution Layer
    model1a = Conv3D(kernel_size=(3, 3, 5), strides=(2, 2, 4), filters=16, name="Conv1")(model1)
    model1b = BatchNormalization()(model1a)
    model1c = Activation('elu')(model1b)

    # Small Receptive Field (SRF)
    modelSRF = create_receptive_field(SRF_SIZE, SRF_STRIDES, model1c, 'SRF')

    # Medium Receptive Field (MRF)
    modelMRF = create_receptive_field(MRF_SIZE, MRF_STRIDES, model1c, 'MRF')

    # Large Receptive Field (LRF)
    modelLRF = create_receptive_field(LRF_SIZE, LRF_STRIDES, model1c, 'LRF')

    # Add the layers - this sums the output of each branch
    final = Add()([modelSRF, modelMRF, modelLRF])
    out = Softmax()(final)

    model = Model(inputs=model1, outputs=out)

    return model

if LOSS_FUNCTION == 'NLL1':
    loss_function = Loss_FN1
elif LOSS_FUNCTION == 'NLL2':
    loss_function = Loss_FN2
elif LOSS_FUNCTION == 'SCCE':
    loss_function = 'sparse_categorical_crossentropy'

# The chosen optimizer is built with an initial learning rate of 0.01
if OPTIMIZER == 'ADAM':
    opt = Adam(learning_rate=INIT_LR)
elif OPTIMIZER == 'SGD':
    opt = SGD(learning_rate=INIT_LR)

X, Y = load_data(DATA_DIR, SUBJECT)

if USE_KFOLD:
    seed = 4
    kfold = StratifiedKFold(n_splits=KFOLD_NUM, shuffle=True, random_state=seed)
    cvscores = []

    for train, test in kfold.split(X, Y):
        if USE_STRUCTURE == '1':
            MRF_model = Create_Model()
        elif USE_STRUCTURE == '2':
            MRF_model = MultiBranchCNN(TIMESTEPS, NUM_CLASSES)
        # Compiling the model with the chosen loss function and optimizer
        MRF_model.compile(loss=loss_function, optimizer=opt, metrics=['accuracy'])

        # Training for 30 epochs
        MRF_model.fit(X[train], Y[train], epochs=30, verbose=VERBOSE)

        # Evaluating the effectiveness of the model on the held-out fold
        scores = MRF_model.evaluate(X[test], Y[test], verbose=VERBOSE)
        print("%s: %.2f%%" % (MRF_model.metrics_names[1], scores[1] * 100))
        cvscores.append(scores[1] * 100)

    print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))

else:
    if USE_STRUCTURE == '1':
        MRF_model = Create_Model()
    elif USE_STRUCTURE == '2':
        MRF_model = MultiBranchCNN(TIMESTEPS, NUM_CLASSES)

    MRF_model.compile(loss=loss_function, optimizer=opt, metrics=['accuracy'])

    MRF_model.fit(X, Y, epochs=EPOCHS, verbose=VERBOSE)

    _, acc = MRF_model.evaluate(X, Y, verbose=VERBOSE)

    print("Accuracy: %.2f" % (acc * 100))

The data comes from the BCICIV 2a dataset, which contains 25 channels. The 3 EOG channels are ignored, leaving 22 channels. These 22 channels are laid out in a zero-padded 7x6 array to provide a more spatially relevant representation. We use a sliding-window approach to compensate for the small dataset, and then also run channel averaging over each trial to process the data further. The training results are below.
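As an aside, here is a minimal sketch of that kind of spatial re-layout. The CHANNEL_POSITIONS mapping is hypothetical (not from the original post); the real electrode-to-cell assignment would come from the BCICIV 2a montage used in the study being reproduced:

import numpy as np

# Hypothetical layout: maps each of the 22 retained channels to a (row, col)
# cell of the 7x6 grid; the actual positions depend on the electrode montage.
CHANNEL_POSITIONS = {0: (0, 2), 1: (1, 0), 2: (1, 1)}  # ... one entry per channel

def to_spatial_grid(trial, xdim=7, ydim=6):
    """Rearrange a (channels, timesteps) trial into a zero-padded
    (1, xdim, ydim, timesteps) array; unmapped cells stay zero."""
    grid = np.zeros((1, xdim, ydim, trial.shape[1]), dtype=np.float32)
    for ch, (r, c) in CHANNEL_POSITIONS.items():
        grid[0, r, c, :] = trial[ch]
    return grid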

Epoch 1/10
666/666 [==============================] - 13s 17ms/step - loss: 4.0290 - accuracy: 0.3236
Epoch 2/10
666/666 [==============================] - 12s 18ms/step - loss: 3.9622 - accuracy: 0.3434
Epoch 3/10
666/666 [==============================] - 14s 21ms/step - loss: 3.9747 - accuracy: 0.3481
Epoch 4/10
666/666 [==============================] - 14s 21ms/step - loss: 3.9373 - accuracy: 0.3720
Epoch 5/10
666/666 [==============================] - 14s 21ms/step - loss: 3.9412 - accuracy: 0.3710
Epoch 6/10
666/666 [==============================] - 14s 21ms/step - loss: 3.9191 - accuracy: 0.3829
Epoch 7/10
666/666 [==============================] - 14s 21ms/step - loss: 3.9234 - accuracy: 0.3936
Epoch 8/10
666/666 [==============================] - 14s 21ms/step - loss: 3.8973 - accuracy: 0.3983
Epoch 9/10
666/666 [==============================] - 14s 21ms/step - loss: 3.8780 - accuracy: 0.4022
Epoch 10/10
666/666 [==============================] - 14s 21ms/step - loss: 3.8647 - accuracy: 0.3900
666/666 [==============================] - 5s 8ms/step - loss: 4.1935 - accuracy: 0.2500  
Accuracy: 25.00

Low accuracy aside, the fact that the accuracy drops to 25.00 after training is worrying. I half suspect I am missing something simple, but I have been unable to resolve the problem.

Any suggestions or questions are welcome, thanks very much!

Solution

I can think of two potential causes for the discrepancy you observe, though I don't have time to test them right now:

  1. The optimizers you are using, SGD and Adam, both train on subsets of the rows (mini-batches) rather than the whole dataset at once. This can cause the inconsistency you observe.
  2. BatchNorm works differently at training time and at evaluation time.

Both cases point in the same direction: the accuracy and loss reported during training are aggregated per-batch estimates, which in this case are overly optimistic.

For test 1, you can try setting batch_size in fit to len(X). Note that you may run out of memory, and it will certainly be slow (possibly very slow).
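A minimal sketch of test 1, reusing the names from the script above: with a single full-size batch, the metrics printed during training and those returned by evaluate() are computed over exactly the same rows, so the gap should close if per-batch aggregation is the cause.

# Test 1: train on the entire dataset as one batch
MRF_model.fit(X, Y, epochs=EPOCHS, batch_size=len(X), verbose=VERBOSE)
_, acc = MRF_model.evaluate(X, Y, batch_size=len(X), verbose=VERBOSE)
print("Accuracy: %.2f" % (acc * 100))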

For test 2, you can try removing the BatchNorm steps.
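Short of deleting the BatchNormalization layers, a quick way to probe test 2 (a sketch, assuming a Keras functional model like the one above and integer class labels in Y) is to compare predictions with the network forced into training mode against ordinary inference mode:

# In training mode, BatchNorm normalizes with the current batch's statistics;
# in inference mode, it uses the moving averages accumulated during fit().
preds_train_mode = MRF_model(X, training=True).numpy()  # batch statistics
preds_infer_mode = MRF_model.predict(X)                 # moving averages
acc_train_mode = np.mean(np.argmax(preds_train_mode, axis=1) == Y)
acc_infer_mode = np.mean(np.argmax(preds_infer_mode, axis=1) == Y)
print("Train-mode accuracy: %.4f, inference-mode accuracy: %.4f"
      % (acc_train_mode, acc_infer_mode))
# A large gap between the two implicates BatchNorm's train/eval difference.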

If you follow up on either of these ideas, let me know how it goes!
