How can I optimize the following code? It raises a memory error on Kaggle

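I am working on a natural language processing (NLP) project for stock sentiment analysis.

I have a list named "headlines_train" with 3975 elements.

headlines_train[1] looks like this (one very long string):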

'scorecard the best lake scene leader  german sleaze inquiry cheerio  boyo the main recommendations has cubie killed fees  has cubie killed fees  has cubie killed fees  hopkins  furious  at foster s lack of hannibal appetite has cubie killed fees  a tale of two tails i say what i like and i like what i say elbows  eyes and nipples task force to assess risk of asteroid collision how i found myself at last on the critical list the timing of their lives dear doctor irish court halts ira man s extradition to northern ireland burundi peace initiative fades after rebels reject mandela as mediator pe points the way forward to the ecb campaigners keep up pressure on nazi war crimes suspect jane ratcliffe yet more things you wouldn t know without the movies millennium bug fails to bite'

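So my "headlines_train" list contains 3975 long strings like this one.

I am using the following code to lemmatize every entry in the list, but when I run it the notebook runs out of memory. How can I optimize it?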

from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
wordnet = WordNetLemmatizer()
headline_train_new=[]
for i in range(len(headlines_train)):
    review = [wordnet.lemmatize(word) for word in headlines_train[i] if not word in set(stopwords.words('english'))]
    review = ' '.join(headline_train_new)
    headline_train_new.append(review)

Memory error:

Your notebook tried to allocate more memory than is available. It has restarted.

Please help!

Thanks in advance.
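
One possible reading of the code above, offered as a sketch rather than a confirmed diagnosis: the line `review = ' '.join(headline_train_new)` joins the growing result list instead of the words just lemmatized, so each pass appends a string roughly twice as long as the previous one, which would exhaust memory long before 3975 iterations. Two further points look unintended: `for word in headlines_train[i]` walks over characters, not words, and `set(stopwords.words('english'))` is rebuilt on every check. The version below addresses all three. The names `lemmatize_headlines` and `build_nltk_cleaner` are hypothetical helpers introduced for illustration; the NLTK wiring assumes the `stopwords` and `wordnet` corpora are already downloaded.

```python
from typing import Callable, Iterable, List, Set, Tuple


def lemmatize_headlines(
    headlines: Iterable[str],
    lemmatize: Callable[[str], str],
    stop_words: Set[str],
) -> List[str]:
    """Return one cleaned string per input headline."""
    cleaned = []
    for headline in headlines:
        # split() yields whole words; iterating over the string itself
        # would yield single characters.
        words = [lemmatize(w) for w in headline.split() if w not in stop_words]
        # Join the words of THIS headline, not the growing result list.
        cleaned.append(" ".join(words))
    return cleaned


def build_nltk_cleaner() -> Tuple[Callable[[str], str], Set[str]]:
    """Hypothetical wiring to NLTK (assumes the corpora are installed)."""
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    # Build the stop-word set once instead of once per word.
    stop_words = set(stopwords.words("english"))
    return WordNetLemmatizer().lemmatize, stop_words


# Intended use with the list from the question:
#   lemmatize, stop_words = build_nltk_cleaner()
#   headline_train_new = lemmatize_headlines(headlines_train, lemmatize, stop_words)
```

Passing the lemmatizer and the stop-word set in as arguments keeps the loop easy to check on a few short strings before running it on the full list.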
