How to fix "RecursionError: maximum recursion depth exceeded" / kernel death when concatenating tensors for an NLP neural network
Update
I managed to solve the problem by creating a list that contained both the features (reviews) and the labels (overall ratings), then using map/apply (if using a pandas DataFrame) to convert them into tensors. From there, I used TensorFlow's from_tensor_slices method to get the features/labels ready for training.
Original question
I am currently working on an NLP project to help learn Python/TensorFlow. My program takes reviews, encodes them, converts them into tensors and tensor datasets, and feeds them into a neural network. The problem I am running into is "RecursionError: maximum recursion depth exceeded while calling a Python object", which is caused by concatenating the tensors into a single tensor dataset.
The recursion error appears whenever I try to access elements from the dataset, either by iterating over the object or by training the network.
What I have tried:
If I reduce the total number of reviews processed from the original ~9000 to 1500, it works fine.
If I use

```
import sys
sys.setrecursionlimit(10000)
```

then the Jupyter kernel dies instead of giving me the recursion error.
Relevant code (I think):

```
# encode the text
encoded_reviews = []
for j in trimmed_review:
    encoded_reviews.append(encoder.encode(j))

# creating tensorflow datasets for training
def labeler(review, rating):
    return review, rating

# pairing the labels (good/bad game) with the encoded reviews
encoded_review_rating_list = []
for i, j in enumerate(encoded_reviews):
    encoded_review_dataset = tf.data.Dataset.from_tensors(tf.cast(j, dtype='int64'))
    encoded_review_rating_list.append(
        encoded_review_dataset.map(lambda x: labeler(x, ratings[i])))

# Combine the list of review:score datasets into a single tensor dataset.
encoded_review_ratings = encoded_review_rating_list[0]
for single_dataset in encoded_review_rating_list[1:]:
    encoded_review_ratings = encoded_review_ratings.concatenate(single_dataset)

# Shuffle the dataset to avoid any biases.
buffer_size = len(encoded_reviews)
all_labeled_data = encoded_review_ratings.shuffle(
    buffer_size, reshuffle_each_iteration=False)

# Split the encoded reviews into training and test datasets; take_size is the
# amount of data that goes into the training set.
training_ratio = 0.6
take_size = round(len(encoded_reviews) * training_ratio)
batch_size = 30

# Organizing our training and validation data; the padded shapes are set to
# the longest review (as specified by the None keyword).
train_data = all_labeled_data.take(take_size)
train_data = train_data.padded_batch(batch_size, padded_shapes=((None,), (1,)))
test_data = all_labeled_data.skip(take_size)
test_data = test_data.padded_batch(batch_size, padded_shapes=((None,), (1,)))

next_feature, next_label = next(iter(test_data))
print(next_feature, next_label)
```
```
---------------------------------------------------------------------------
RecursionError Traceback (most recent call last)
<ipython-input-8-e941c005ed79> in <module>
----> 1 next_feature,next_label = next(iter(test_data))
2
3 print (next_feature,next_label)
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in __iter__(self)
416 if (context.executing_eagerly()
417 or ops.get_default_graph()._building_function): # pylint: disable=protected-access
--> 418 return iterator_ops.OwnedIterator(self)
419 else:
420 raise RuntimeError("__iter__() is only supported inside of tf.function "
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in __init__(self,dataset,components,element_spec)
592 context.context().device_spec.device_type != "cpu"):
593 with ops.device("/cpu:0"):
--> 594 self._create_iterator(dataset)
595 else:
596 self._create_iterator(dataset)
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in _create_iterator(self,dataset)
598 def _create_iterator(self,dataset):
599 # pylint: disable=protected-access
--> 600 dataset = dataset._apply_options()
601
602 # Store dataset reference to ensure that dataset is alive when this iterator
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in _apply_options(self)
356
357 dataset = self
--> 358 options = self.options()
359 if options.experimental_threading is not None:
360 t_options = options.experimental_threading
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in options(self)
347 options = Options()
348 for input_dataset in self._inputs():
--> 349 input_options = input_dataset.options()
350 if input_options is not None:
351 options = options.merge(input_options)
... last 1 frames repeated,from the frame below ...
~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in options(self)
347 options = Options()
348 for input_dataset in self._inputs():
--> 349 input_options = input_dataset.options()
350 if input_options is not None:
351 options = options.merge(input_options)
RecursionError: maximum recursion depth exceeded while calling a Python object
```
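The traceback shows where the recursion comes from: `Dataset.options()` walks `self._inputs()` recursively, and every `concatenate` call adds one more dataset node to that chain. With ~9000 concatenated datasets the chain is deeper than Python's default recursion limit of 1000 frames. A minimal sketch of the chain-building pattern, using hypothetical toy data and a chain kept far too short to trigger the error:

```python
import tensorflow as tf

# Each concatenate wraps the previous dataset in a new node, building a
# linked chain. Walking the chain (e.g. in options()) is recursive, so a
# chain thousands of nodes deep exceeds Python's recursion limit.
ds = tf.data.Dataset.from_tensors(tf.constant([0], dtype=tf.int64))
for i in range(1, 4):
    ds = ds.concatenate(
        tf.data.Dataset.from_tensors(tf.constant([i], dtype=tf.int64)))

# Only 4 nodes here, so iteration succeeds.
print([int(x[0]) for x in ds])
```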
Workaround

Providing the solution in the answer section for the benefit of the community. Thanks @Accommodator for the update.

I managed to solve the problem by creating a list containing both the features (reviews) and the labels (overall ratings), then using map/apply (if using a pandas DataFrame) to convert them into tensors. From there, I used TensorFlow's from_tensor_slices method to get the features/labels ready for training.
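A minimal sketch of that approach: the toy reviews and ratings below are hypothetical, and `pad_sequences` is assumed here as one way to make the ragged review lists rectangular before handing them to `from_tensor_slices` in a single call, avoiding the per-review `concatenate` chain entirely:

```python
import tensorflow as tf

# Hypothetical toy data: variable-length encoded reviews and their labels.
encoded_reviews = [[3, 1, 4], [1, 5], [9, 2, 6, 5]]
ratings = [1, 0, 1]

# Pad the reviews to a common length so they form one rectangular tensor.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    encoded_reviews, padding="post")

# One dataset built in a single call -- no per-review concatenate chain,
# so iterating it never recurses through thousands of dataset nodes.
dataset = tf.data.Dataset.from_tensor_slices((padded, ratings))

for feature, label in dataset:
    print(feature.numpy(), int(label))
```

Because `from_tensor_slices` slices one tensor along its first axis, the resulting dataset is a single node regardless of how many reviews it holds.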