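How can I generate ELMo embeddings for tokenized strings without getting "Function call stack: pruned"?
I am trying to generate ELMo embeddings for batches of tokenized strings, but I keep getting the following error: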
Traceback (most recent call last):
  File "/home/lorcan/.local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-0d50a997dad6>", line 17, in <module>
    embeddings = elmo(tokens=tokens2, sequence_len=lens2)['elmo']
  File "/home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1605, in __call__
    return self._call_impl(args, kwargs)
  File "/home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1645, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 598, in call
    ctx=ctx)
  File "/home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [4,5,1] vs. [4,9,1024]
[[node mul (defined at /home/lorcan/anaconda3/envs/ncr_elmo/lib/python3.6/site-packages/tensorflow_hub/module_v2.py:106) ]] [Op:__inference_pruned_4853]
Function call stack:
pruned
What is going wrong here? Is the embedding tensor too large? I am using Python 3.6.13, tensorflow==2.2.0, tensorflow-estimator==2.2.0 and tensorflow-hub==0.12.0.
import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.load('https://tfhub.dev/google/elmo/3').signatures['tokens']
# Batch of 4 tokenized strings, each row padded with b'' to a width of 6
tokens = tf.convert_to_tensor(
    [[b'fetal', b'derived', b'definitive', b'erythrocyte', b'', b''],
     [b'splenic', b'red', b'pulp', b'macrophage', b'', b''],
     [b'juxtaglomerular', b'complex', b'cell', b'', b'', b''],
     [b'epithelial', b'cell', b'of', b'large', b'intestine', b'']], tf.string)
# Number of real (non-padding) tokens in each row
lens = tf.convert_to_tensor([4, 4, 3, 5], tf.int32)
embeddings = elmo(tokens=tokens, sequence_len=lens)['elmo']
Solution
It worked for me once the trailing empty-string padding in tokens was trimmed so that at least one row no longer ends with b'', i.e.
tokens = tf.convert_to_tensor(
    [[b'fetal', b'derived', b'definitive', b'erythrocyte', b''],
     [b'splenic', b'red', b'pulp', b'macrophage', b''],
     [b'juxtaglomerular', b'complex', b'cell', b'', b''],
     [b'epithelial', b'cell', b'of', b'large', b'intestine']], tf.string)
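As for why this helps: the shapes in the error ([4,5,1] vs. [4,9,1024]) are consistent with one operand of the failing mul being sized by the largest value in sequence_len (5) and the other by the width of tokens (9). That is my reading of the traceback, not something I have confirmed in the module's documentation. Under that assumption, the padded width has to equal max(sequence_len). Below is a minimal sketch of a helper that pads a batch to exactly that width. The name pad_tokens is hypothetical, and the example batch is just three of the token lists from the question.

```python
def pad_tokens(sentences, pad=b''):
    """Pad every token list with `pad` up to the longest list in the batch.

    Returns (padded, lengths). The width of `padded` equals max(lengths),
    so no row-independent extra padding columns are left over.
    Hypothetical helper, not part of tensorflow or tensorflow_hub.
    """
    lengths = [len(s) for s in sentences]
    width = max(lengths)
    padded = [list(s) + [pad] * (width - len(s)) for s in sentences]
    return padded, lengths


sentences = [
    [b'fetal', b'derived', b'definitive', b'erythrocyte'],
    [b'splenic', b'red', b'pulp', b'macrophage'],
    [b'juxtaglomerular', b'complex', b'cell'],
]
padded, lengths = pad_tokens(sentences)

# Sanity check before calling the model: batch width must match max length
assert all(len(row) == max(lengths) for row in padded)

# With TensorFlow available, the result would be passed on as in the question:
# tokens = tf.convert_to_tensor(padded, tf.string)
# lens = tf.convert_to_tensor(lengths, tf.int32)
# embeddings = elmo(tokens=tokens, sequence_len=lens)['elmo']
```

Computing the lengths from the unpadded lists, rather than typing them by hand, also keeps sequence_len from drifting out of sync with the tokens.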