How can attention weights be related directly to the inputs in a CNN-LSTM- or CNN-GRU-based network for regression?
In a CNN-LSTM/GRU-based network, is it still possible to relate the attention layer's weights directly to the inputs, in order to visualize which parts of the input the model weights more heavily?
Here I use a CNN-GRU to predict the next value of 2 (correlated) output time series from 1 input time series (similar to the tutorial here). The window size (n_steps) is 80, and to use the CNN we reshape the input like this:
# choose a number of time steps
n_steps = 80
# split into samples
X,y = split_sequence(raw_seq,n_steps)
# reshape from [samples,timesteps] into [samples,subsequences,timesteps,features]
n_features = 1
n_seq = 40
n_steps_in_each_seq = 2
X = X.reshape((X.shape[0],n_seq,n_steps_in_each_seq,n_features))
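For reference, split_sequence comes from the linked tutorial. A minimal sketch of what it does (the windowing behaviour matches that tutorial; the toy raw_seq here is made up just to show the shapes):

```python
import numpy as np

def split_sequence(sequence, n_steps):
    # Slide a window of length n_steps over the series:
    # each window is one input sample, the value right after it is the target.
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        y.append(sequence[i + n_steps])
    return np.array(X), np.array(y)

# toy series just to illustrate the resulting shapes
raw_seq = np.arange(100, dtype=float)
X, y = split_sequence(raw_seq, 80)      # X: (20, 80), y: (20,)
X = X.reshape((X.shape[0], 40, 2, 1))   # (samples, n_seq, steps_per_seq, features)
```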
Here is my model definition:
ipt = Input(shape=(None, n_steps_in_each_seq, n_features))
x = TimeDistributed(Conv1D(filters=64, kernel_size=2, activation='relu'))(ipt)
x = TimeDistributed(MaxPooling1D(pool_size=2))(x)
x = TimeDistributed(Flatten())(x)
x = Bidirectional(GRU(600, activation='relu', return_sequences=True))(x)
att_layer, att_weights = SeqWeightedAttention(return_attention=True)(x)
out = Dense(2)(att_layer)
model = keras.models.Model(ipt, out)
model.compile(optimizer='adam', loss='mse', metrics=['acc'])
The model summary is as follows:
Model: "model_1"
_________________________________________________________________
Layer (type)                                    Output Shape                  Param #
=================================================================
input_1 (InputLayer)                            (None, None, 40, 1)           0
time_distributed_1 (TimeDistributed)            (None, 39, 64)                192
time_distributed_2 (TimeDistributed)            (None, 19, 64)                0
time_distributed_3 (TimeDistributed)            (None, 1216)                  0
bidirectional_1 (Bidirectional)                 (None, 1200)                  6,541,200
seq_weighted_attention_1 (SeqWeightedAttention) [(None, 1200), (None, None)]  1,201
dense_1 (Dense)                                 (None, 2)                     2,402
=================================================================
Total params: 6,544,995
Trainable params: 6,544,995
Non-trainable params: 0
I want to relate the attention weights to the input window of length 80. However, the attention dimensions don't match: when I try to get the attention layer's outputs, (1) I don't get the layer's output and its weights separately, and (2) neither has dimensions matching the input. I fetch the weights as (explained here):
outs = get_layer_outputs(model,'seq',X,1)
outs_1 = outs[0][0] # additional index since using batch_shape
outs_2 = outs[1][0]
I'm not sure what I'm doing wrong here. Could someone point it out?
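For context, here is how I would imagine mapping the weights back to the raw window if they did come out with length n_seq = 40 (one weight per subsequence that the TimeDistributed CNN feeds to the GRU). This is only a sketch of the bookkeeping I have in mind, not the model's actual attribution: it just spreads each subsequence weight uniformly over the 2 raw inputs that subsequence covers.

```python
import numpy as np

n_steps, n_seq, n_steps_in_each_seq = 80, 40, 2

# Hypothetical attention weights: one per GRU timestep (i.e. per subsequence),
# normalized to sum to 1 as a softmax-style attention would.
att_weights = np.random.rand(n_seq)
att_weights /= att_weights.sum()

# Each GRU timestep summarizes one 2-step subsequence, so spread each
# weight uniformly over the raw inputs that subsequence covers.
per_input = np.repeat(att_weights / n_steps_in_each_seq, n_steps_in_each_seq)
# per_input now has one (coarse) weight per raw input step, length 80
```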