How to resolve "expected ndim=3, found ndim=4" when using K.function() from the Keras backend to get an intermediate layer of a model
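I am trying to extract the last layer of a classification model that was trained on some data. The first layer is an Embedding layer, followed by a BiLSTM layer, then an output Dense layer. My code is shown below. I keep getting a 4D output (1, 38, 300, 300) instead of a 3D one (1, 38, 300). Here 1 is the sample size, 38 is the maximum sentence length, and 300 is the word2vec length.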
from tensorflow.keras import backend as K
from tensorflow.keras.models import load_model
import numpy as np
import gensim

word2vec = 'GoogleNews-vectors-negative300.txt'
# Assumed: the original snippet does not show how loaded_model was created
loaded_model = gensim.models.KeyedVectors.load_word2vec_format(word2vec, binary=False)

# (samples, max sentence length, word2vec length)
x_matrix = np.zeros((1, 38, 300))
sentence_label = 'the weather today was extremely unpredictable,0'
parts = sentence_label.split(',')
label = int(parts[1])
sentence = parts[0]
words = sentence.split(' ')
words = words[:x_matrix.shape[1]]
for j, word in enumerate(words):
    if word in loaded_model:
        # x_matrix[0,j,:] = word2vec[word]
        x_matrix[0, j, :] = loaded_model.word_vec(word)

model = load_model('TrainedModel.h5')
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[2].output])
layer_output = get_3rd_layer_output(x_matrix)[0]
print("Layer Output Shape 1 : ", layer_output.shape)
I have cross-checked my code many times, but I cannot figure out why the dimensions are wrong.
Here is the traceback:
Traceback (most recent call last):
File "/usr/pkg/lib/python3.8/site-packages/IPython/core/interactiveshell.py",line 3427,in run_code
exec(code_obj,self.user_global_ns,self.user_ns)
File "<ipython-input-2-bb840b495480>",line 1,in <module>
runfile('/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py',wdir='/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation')
File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py",line 197,in runfile
pydev_imports.execfile(filename,global_vars,local_vars) # execute the script
File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py",line 18,in execfile
exec(compile(contents+"\n",file,'exec'),glob,loc)
File "/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py",line 451,in <module>
layer_output = get_3rd_layer_output(x_matrix)[0]
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/backend.py",line 4073,in func
outs = model(model_inputs)
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py",line 1012,in __call__
outputs = call_fn(inputs,*args,**kwargs)
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py",line 424,in call
return self._run_internal_graph(
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py",line 560,in _run_internal_graph
outputs = node.layer(*args,**kwargs)
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/layers/wrappers.py",line 539,in __call__
return super(Bidirectional,self).__call__(inputs,**kwargs)
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py",line 998,in __call__
input_spec.assert_input_compatibility(self.input_spec,inputs,self.name)
File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py",line 219,in assert_input_compatibility
raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer bidirectional_9 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (1, 38, 300, 300)
The error is triggered on this line:
layer_output = get_3rd_layer_output(x_matrix)[0]
The shape of x_matrix before calling get_3rd_layer_output is:
The shape of X matrix : (60,300)
The architecture of TrainedModel:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

model = Sequential()
# 300 is the embedding (word2vec) dimension
model.add(Embedding(vocab_size, 300, input_length=38, weights=[embedding_matrix], trainable=True))
model.add(Bidirectional(LSTM(100, dropout=0.2)))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='Adagrad', metrics=['accuracy'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', baseline=0.3, patience=100, verbose=1)
mc = ModelCheckpoint('TrainedModel.h5', monitor='val_loss', verbose=1, save_best_only=True)
hist = model.fit(train_sequences, train_y, epochs=200, verbose=False, batch_size=100, validation_data=(val_sequences, val_y), callbacks=[es, mc])
The model.summary() output for TrainedModel is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_9 (Embedding)      (None, 38, 300)           7370400
_________________________________________________________________
bidirectional_9 (Bidirection (None, 200)               320800
_________________________________________________________________
dense_9 (Dense)              (None, 3)                 603
=================================================================
Total params: 7,691,803
Trainable params: 7,691,803
Non-trainable params: 0
_________________________________________________________________
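As a cross-check of the summary above, the parameter counts can be reproduced by hand. The sketch below assumes an embedding dimension of 300 and the standard LSTM parameter formula (four gates, each with an input kernel, a recurrent kernel, and a bias); the vocabulary size is inferred from the embedding parameter count, not stated in the question.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with an input kernel, a recurrent kernel, and a bias vector
    return 4 * ((input_dim + units + 1) * units)

emb_size, units, n_classes = 300, 100, 3

embedding_params = 7_370_400                      # taken from the summary
vocab_size = embedding_params // emb_size         # inferred, not given in the question
bilstm_params = 2 * lstm_params(emb_size, units)  # forward + backward LSTM
dense_params = (2 * units + 1) * n_classes        # 200 inputs + bias, 3 outputs
total_params = embedding_params + bilstm_params + dense_params

print(vocab_size, bilstm_params, dense_params, total_params)
```

The BiLSTM and Dense counts agree with the summary only if each timestep fed to the BiLSTM is a 300-dimensional vector, so the Embedding output should be (None, 38, 300).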
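Solution
The correct way to get the output of any intermediate layer is to create a sub-model that expects the same input as the trained model. In your case, the error is raised because you are passing a 3D matrix of word embeddings to the trained model, whereas you must pass the same kind of data that was used for training (2D data of integer-encoded words).
Here is a dummy example that shows how to correctly extract any intermediate output from the model.
Create dummy data: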
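To see where the extra dimension comes from, an Embedding layer can be thought of as a row lookup into the embedding matrix: every entry of the input is treated as an index and replaced by a 300-dimensional vector. The NumPy sketch below only imitates that behavior (the `embedding_lookup` helper is hypothetical and is not the Keras implementation), with sizes chosen for illustration.

```python
import numpy as np

vocab_size, emb_size, max_len = 111, 300, 38
rng = np.random.default_rng(0)
embedding_matrix = rng.uniform(-1, 1, (vocab_size, emb_size))

def embedding_lookup(inputs):
    """Imitate an Embedding layer: treat every input entry as a row index."""
    ids = np.asarray(inputs).astype("int64")
    return embedding_matrix[ids]

# Correct input: 2D integer-encoded words, shape (samples, max_len)
token_ids = rng.integers(0, vocab_size, (1, max_len))
ok = embedding_lookup(token_ids)
print(ok.shape)   # (1, 38, 300): 3D, which is what the BiLSTM expects

# Wrong input: 3D word vectors, shape (samples, max_len, emb_size)
word_vectors = np.zeros((1, max_len, emb_size))
bad = embedding_lookup(word_vectors)
print(bad.shape)  # (1, 38, 300, 300): 4D, which the BiLSTM rejects
```

Each of the 300 floats per word is cast to an integer and looked up separately, which is why one more axis of size 300 appears.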
import numpy as np

vocab_size = 111
emb_size = 300
input_length = 38
n_sample = 50
n_classes = 3
embedding_matrix = np.random.uniform(-1, 1, (vocab_size, emb_size))
X = np.random.randint(0, vocab_size, (n_sample, input_length))
Y = np.random.randint(0, n_classes, (n_sample,))
Create and train the model:
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras import backend as K
model = Sequential()
model.add(Embedding(vocab_size,emb_size,input_length=input_length,weights=[embedding_matrix],trainable=True))
model.add(Bidirectional(LSTM(100,dropout=0.2)))
model.add(Dense(n_classes,activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy',optimizer='Adagrad',metrics=['accuracy'])
model.fit(X,Y,epochs=3) ### TRAINED WITH X
Get the layer output:
layer_id = 2
get_layer_output = K.function([model.layers[0].input],[model.layers[layer_id].output])
layer_output = get_layer_output(X)[0] ### EXTRACT FROM X
# equivalent to:
# sub_model = Model(model.input,model.layers[layer_id].output)
# layer_output = sub_model.predict(X) ### EXTRACT FROM X
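To apply this to the single sentence from the question, the sentence first has to be turned into the same kind of 2D integer input the model was trained on. The sketch below is only an illustration: `encode_sentence` and the tiny `word_index` dictionary are hypothetical, and in practice the word-to-id mapping, padding side, and handling of unknown words must match whatever preprocessing produced train_sequences.

```python
import numpy as np

def encode_sentence(sentence, word_index, max_len=38, pad_id=0):
    """Map words to integer ids and left-pad with pad_id up to max_len."""
    # Words missing from word_index are skipped here (an assumption)
    ids = [word_index[w] for w in sentence.lower().split() if w in word_index]
    ids = ids[:max_len]
    padded = [pad_id] * (max_len - len(ids)) + ids
    return np.array([padded], dtype="int32")  # shape (1, max_len)

# Hypothetical vocabulary; use the mapping from training instead
word_index = {"the": 1, "weather": 2, "today": 3, "was": 4,
              "extremely": 5, "unpredictable": 6}

x_ids = encode_sentence("the weather today was extremely unpredictable", word_index)
print(x_ids.shape)  # (1, 38)
```

An array like `x_ids` is 2D and integer-encoded, so it can be passed to the sub-model or to the K.function in place of the 3D x_matrix.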