Why isn't my teacher-forcing inference function working?
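I am working on a translation model. For it, I am trying to implement an encoder-decoder model with an attention mechanism, and I want to train it with teacher forcing.
I have done that, and here is my model: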
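# Import paths assume TensorFlow 2.x Keras; the source does not show its imports
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense, Activation,
                                     TimeDistributed, dot, concatenate)
from tensorflow.keras.models import Model
# Encoder: frozen pretrained embeddings, then an LSTM returning all outputs plus its final states
encoder_input = Input(shape=(X_enc.shape[1],))
encoder_emb = Embedding(vocab_size_en, 300, weights=[embedding_matrix], trainable=False, mask_zero=True)
encoder = encoder_emb(encoder_input)
encoder_lstm = LSTM(1024, return_state=True, return_sequences=True)
encoding_lstm, state_h, state_c = encoder_lstm(encoder)
encoder_states = [state_h, state_c]
# Decoder: initialized with the encoder's final states
decoder_input = Input(shape=(X_dec.shape[1],), name='decoder_input')
# output_dim is missing in the source; 300 is assumed to match the encoder embedding
decoder_emb = Embedding(vocab_size_fr, 300, mask_zero=True)
decoder = decoder_emb(decoder_input)
# return_state=True is needed because three values are unpacked on the next line
decoder_lstm = LSTM(1024, return_state=True, return_sequences=True, name='decoder_lstm')
decoding_lstm, _, _ = decoder_lstm(decoder, initial_state=[state_h, state_c])
# Attention: dot-product scores, softmax over the encoder time steps
attention = dot([decoding_lstm, encoding_lstm], axes=[2, 2])
attention_layer = Activation('softmax')
attention = attention_layer(attention)
# Context: attention-weighted sum of encoder outputs (second operand and axes assumed; the source line is incomplete)
context = dot([attention, encoding_lstm], axes=[2, 1])
decoder_combined_context = concatenate([context, decoding_lstm])
# Output: per-time-step projection to the French vocabulary
decoder_dense1 = TimeDistributed(Dense(512, activation="tanh"))
dense1 = decoder_dense1(decoder_combined_context)
decoder_dense2 = TimeDistributed(Dense(vocab_size_fr, activation="softmax"))
output = decoder_dense2(dense1)
model = Model(inputs=[encoder_input, decoder_input], outputs=[output])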
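For readers unfamiliar with the term: with teacher forcing, the decoder is fed the ground-truth target sequence shifted right by one step and is trained to predict the unshifted sequence. A minimal, framework-free sketch of that shift follows. The token ids and the `START`/`END` values are hypothetical and are not taken from the model above.

```python
# Hypothetical special-token ids, chosen only for illustration.
START, END = 1, 2

def teacher_forcing_pair(target_ids):
    """Return (decoder_input, decoder_target) for one target sentence.

    The decoder input begins with START and drops the final END;
    the decoder target is the same sequence shifted left by one step.
    """
    full = [START] + list(target_ids) + [END]
    return full[:-1], full[1:]

decoder_in, decoder_out = teacher_forcing_pair([17, 42, 8])
print(decoder_in)   # [1, 17, 42, 8]
print(decoder_out)  # [17, 42, 8, 2]
```

At each position the decoder sees the true previous token and is scored on the true next one, which is why inference needs a separate loop that feeds back the model's own predictions.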
However, to make predictions I have to implement an inference function, which means separating my encoder from my decoder. Here is my attempt:
# Encoder for inference: reuses the trained layers
encoder = encoder_emb(encoder_input)
encoding_lstm, state_h, state_c = encoder_lstm(encoder)
# Decoder inputs for inference: the encoder results and the previous LSTM states
encoding_results = Input(shape=(1024,))
decoder_state_input_h = Input(shape=(1024,))
decoder_state_input_c = Input(shape=(1024,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_embedded = decoder_emb(decoder_input)
decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedded, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
# axes are assumed from the training graph; the source line is incomplete
attention = dot([decoder_outputs, encoding_results], axes=[2, 2])
attention = Activation('softmax')(attention)
# axes assumed as above; the source omits them
context = dot([attention, decoder_outputs], axes=[2, 1])
# decoder_combined_context is not rebuilt here, so this reuses the tensor from the training graph
decoder_outputs = decoder_dense1(decoder_combined_context)
decoder_outputs = decoder_dense2(decoder_outputs)
encoder_model = Model(encoder_input, [encoding_lstm, state_h, state_c])
decoder_model = Model([decoder_input, encoding_results] + decoder_states_inputs, [decoder_outputs] + decoder_states)
Something seems to be wrong with it, because it gives me:
Dimensions must be equal, but are 4096 and 1024 for '{{node mul/mul}} = Mul[T=DT_FLOAT](Sigmoid_1, init_c)' with input shapes: [?,75,4096], [?,1024].
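As a reading aid for the numbers in this message: 4096 equals 4 * 1024, and a Keras LSTM layer with 1024 units stores the weights of its four gates concatenated along the last axis, so values of width 4 * units can appear inside the layer before they are split per gate. The sketch below only reproduces that arithmetic. `lstm_weight_shapes` is a hypothetical helper, not a Keras function, and the input width of 300 is an assumption based on the embedding size in the model.

```python
def lstm_weight_shapes(input_dim, units):
    """Shapes of the kernel, recurrent kernel and bias of an LSTM layer.

    The four gates (input, forget, cell candidate, output) are stored
    side by side, hence the factor of 4 on the last axis.
    """
    return {
        "kernel": (input_dim, 4 * units),
        "recurrent_kernel": (units, 4 * units),
        "bias": (4 * units,),
    }

shapes = lstm_weight_shapes(input_dim=300, units=1024)
print(shapes["kernel"])            # (300, 4096)
print(shapes["recurrent_kernel"])  # (1024, 4096)
```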
Can anyone give me some ideas to help me understand this problem?
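For reference, once the encoder and decoder are separated, inference usually runs the encoder once and then calls the decoder one token at a time, feeding each predicted token and the returned states back in. The loop below is a framework-free sketch of that procedure under stated assumptions: `encode` and `decode_step` are hypothetical stand-ins for calls such as `encoder_model.predict` and `decoder_model.predict`, and the toy implementations exist only so the loop can run.

```python
def greedy_decode(encode, decode_step, source_ids, start_id, end_id, max_len=20):
    """Greedy decoding: run the encoder once, then decode token by token."""
    encoder_outputs, state = encode(source_ids)
    token, result = start_id, []
    for _ in range(max_len):
        # decode_step returns one score per vocabulary id and the updated state.
        scores, state = decode_step(token, encoder_outputs, state)
        token = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if token == end_id:
            break
        result.append(token)
    return result


# Toy stand-ins (hypothetical): the "model" echoes the source ids, then emits end_id.
def toy_encode(source_ids):
    # Returns (encoder outputs, a position counter used as the decoder state).
    return list(source_ids), 0

def toy_decode_step(token, encoder_outputs, position):
    vocab_size, end_id = 10, 2
    next_id = encoder_outputs[position] if position < len(encoder_outputs) else end_id
    scores = [0.0] * vocab_size
    scores[next_id] = 1.0  # one-hot "prediction"
    return scores, position + 1

print(greedy_decode(toy_encode, toy_decode_step, [5, 7, 3], start_id=1, end_id=2))
# [5, 7, 3]
```

In this pattern the decoder sees a single time step per call, and the states it returns are passed back in on the next call.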