
Visualizing self-attention weights for a sequence addition problem with an LSTM?

I am using the Self Attention layer from here to solve a simple problem: add up all the numbers in a sequence that come before a delimiter. Through training, I want the network to learn which numbers to add, and with the self-attention layer I want to visualize where the model is focusing. The code to reproduce the result is given below.
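(For context, the kind of attention pooling I have in mind looks roughly like the following simplified sketch. This is my own illustration of a softmax over timesteps applied to the LSTM outputs, not necessarily the exact implementation in the linked repo.)

import tensorflow as tf

# Simplified self-attention pooling over LSTM outputs (illustrative sketch only;
# the layer from the linked repo may differ, e.g. in its scoring function).
class SimpleAttentionPooling(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.score = tf.keras.layers.Dense(1)  # one unnormalized score per timestep

    def call(self, hidden_states):
        # hidden_states: (batch, time_steps, units) from an LSTM with return_sequences=True
        scores = self.score(hidden_states)                   # (batch, time_steps, 1)
        weights = tf.nn.softmax(scores, axis=1)              # attention weights over timesteps
        context = tf.reduce_sum(weights * hidden_states, 1)  # weighted sum -> (batch, units)
        return context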

import os
import sys

import matplotlib.pyplot as plt
import numpy
import numpy as np
from keract import get_activations
from tensorflow.keras import Sequential
from tensorflow.keras.callbacks import Callback
from tensorflow.keras.layers import Dense, Dropout, LSTM

from attention import Attention  # https://github.com/philipperemy/keras-attention-mechanism


def add_numbers_before_delimiter(n: int, seq_length: int, delimiter: float = 0.0, index_1: int = None) -> (np.array, np.array):
    """
    Task: Add all the numbers that come before the delimiter.
    x = [1, 2, 3, 0, 4, 5, 6, 7, 8, 9]. Result is y = 1 + 2 + 3 = 6.
    @param n: number of samples in (x, y).
    @param seq_length: length of the sequence of x.
    @param delimiter: value of the delimiter. Default is 0.0
    @param index_1: index of the number that comes after the first 0.
    @return: returns two numpy.array x and y of shape (n, seq_length, 1) and (n, 1).
    """
    x = np.random.uniform(0, 1, (n, seq_length))
    y = np.zeros(shape=(n, 1))
    for i in range(len(x)):
        if index_1 is None:
            a = np.random.choice(range(1, len(x[i])), size=1, replace=False)
        else:
            a = index_1
        y[i] = np.sum(x[i, 0:a])
        x[i, a] = delimiter

    x = np.expand_dims(x, axis=-1)
    return x, y


def main():
    numpy.random.seed(7)

    # data. definition of the problem.
    seq_length = 20
    x_train, y_train = add_numbers_before_delimiter(20_000, seq_length)
    x_val, y_val = add_numbers_before_delimiter(4_000, seq_length)

    # just arbitrary values. it's for visual purposes. easier to see than random values.
    test_index_1 = 4
    x_test, _ = add_numbers_before_delimiter(10, seq_length, 0, test_index_1)
    # x_test_mask is just a mask that, if applied to x_test, would still contain the information to solve the problem.
    # we expect the attention map to look like this mask.
    x_test_mask = np.zeros_like(x_test[..., 0])
    x_test_mask[:, test_index_1:test_index_1 + 1] = 1

    model = Sequential([
        LSTM(100, input_shape=(seq_length, 1), return_sequences=True),
        Attention(name='attention_weight'),
        Dropout(0.2),
        Dense(1, activation='linear')
    ])

    model.compile(loss='mse', optimizer='adam')
    print(model.summary())

    output_dir = 'task_add_two_numbers'
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    max_epoch = int(sys.argv[1]) if len(sys.argv) > 1 else 200

    class VisualiseAttentionMap(Callback):

        def on_epoch_end(self, epoch, logs=None):
            attention_map = get_activations(model, x_test, layer_names='attention_weight')['attention_weight']

            # top is attention map.
            # bottom is ground truth.
            plt.imshow(np.concatenate([attention_map, x_test_mask]), cmap='hot')

            iteration_no = str(epoch).zfill(3)
            plt.axis('off')
            plt.title(f'Iteration {iteration_no} / {max_epoch}')
            plt.savefig(f'{output_dir}/epoch_{iteration_no}.png')
            plt.close()
            plt.clf()

    model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=max_epoch,
              batch_size=64, callbacks=[VisualiseAttentionMap()])


if __name__ == '__main__':
    main()

However, I get the following attention weights as a result.

Please click link 1 to see the weights during training.

I expect the attention to be concentrated on all the values before the delimiter. In the image, the white at the bottom represents the ground truth, while the top half represents the weights for the 10 samples.
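Concretely, the pattern I would expect the attention map to resemble is a mask with weight spread over every position before the delimiter. A rough sketch of that expected mask, reusing the x_test and test_index_1 defined above (the uniform normalization is just for illustration):

# Rough sketch of the attention pattern I expect (not produced by the model):
# uniform weight on the positions before the delimiter at index test_index_1 = 4,
# zero weight everywhere else.
expected_attention = np.zeros_like(x_test[..., 0])         # shape (10, seq_length)
expected_attention[:, :test_index_1] = 1.0 / test_index_1  # rows sum to 1, like a softmax would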
