How to fix WordCloud not removing custom stopwords
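I'm trying to add extra stopwords to be removed from my word cloud. All of a sudden, my extra stopwords no longer seem to be applied. This used to work.
I've boiled the problem down to what is shown here, and to the first word cloud in the loop. You can see at the top that the word "product" is still present even though I added it to the stopword list. The other two stopwords are removed correctly. I have collocations set to False.
I've tried versions 1.5.0 and 1.6.0.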
from wordcloud import WordCloud, STOPWORDS
import pandas as pd
import collections
import matplotlib.pyplot as plt

for i in range(20):
    print(i)
    wordcloud = WordCloud(stopwords=["product", "and", "the"],
                          background_color='white',
                          collocations=False).generate(clusterStrings[i])
    # Display the generated image:
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.show()
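Solution
You are creating a new instance on every iteration, and you are replacing the default stopwords rather than adding to them. Try creating the WordCloud instance once, outside the loop, and adding your extra stopwords to the built-in ones.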
from wordcloud import WordCloud, STOPWORDS
import pandas as pd
import collections
import matplotlib.pyplot as plt
...
# create the instance only once and add stopwords
stopwords = set(STOPWORDS)
# update() adds each word; add() would reject a list as unhashable
stopwords.update(["product", "and", "the"])
wordcloud = WordCloud(stopwords=stopwords, background_color='white', collocations=False)

for i in range(20):
    print(i)
    wordcloud.generate(clusterStrings[i])
    # Display the generated image:
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.show()
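The key step in this answer is extending the default set instead of replacing it. Here is a minimal sketch of that step using plain Python sets. `DEFAULT_STOPWORDS` is a small stand-in for wordcloud's `STOPWORDS`, and `merge_stopwords` is a hypothetical helper; neither comes from the original post. It also shows why `set.add` is the wrong call for several words at once: it takes a single hashable item, so a list raises `TypeError`.

```python
# Stand-in for wordcloud.STOPWORDS (assumption: the real set is much larger).
DEFAULT_STOPWORDS = {"and", "the", "a", "of"}


def merge_stopwords(defaults, extra):
    """Return a new set holding the defaults plus the extra words.

    Hypothetical helper for illustration; `defaults` is left unmodified.
    """
    return set(defaults) | set(extra)


stopwords = merge_stopwords(DEFAULT_STOPWORDS, ["product", "and", "the"])

# set.add() accepts one hashable item, so passing a list is rejected.
try:
    set(DEFAULT_STOPWORDS).add(["product", "and", "the"])
    add_list_failed = False
except TypeError:
    add_list_failed = True
```

The merged set can then be passed as `stopwords=` when constructing `WordCloud`.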
Try this function:
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import matplotlib.pyplot as plt

def plot_wordcloud(text, mask=None, max_words=200, max_font_size=100,
                   figure_size=(24.0, 16.0), title=None, title_size=40,
                   image_color=False):
    # Start from the built-in stopwords and add the custom ones
    stopwords = set(STOPWORDS)
    more_stopwords = {'one', 'br', 'Po', 'th', 'sayi', 'fo', 'Unknown'}
    stopwords = stopwords.union(more_stopwords)

    wordcloud = WordCloud(background_color='black', stopwords=stopwords,
                          max_words=max_words, max_font_size=max_font_size,
                          random_state=42, width=800, height=400, mask=mask)
    wordcloud.generate(str(text))

    plt.figure(figsize=figure_size)
    if image_color:
        # Recolor the words using the colors of the mask image
        image_colors = ImageColorGenerator(mask)
        plt.imshow(wordcloud.recolor(color_func=image_colors), interpolation="bilinear")
        plt.title(title, fontdict={'size': title_size, 'verticalalignment': 'bottom'})
    else:
        plt.imshow(wordcloud)
        plt.title(title, fontdict={'size': title_size, 'color': 'black',
                                   'verticalalignment': 'bottom'})
    plt.axis('off')
    plt.tight_layout()

plot_wordcloud(train_df["Col_name"], title="Word Cloud of ...")
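If a word such as "product" still shows up after being added, it may help to check which tokens actually survive the filter before rendering anything. The sketch below is a rough approximation built on `re` and `collections.Counter`, not WordCloud's own tokenizer. `surviving_counts` and the sample text are hypothetical. It illustrates that a variant like "products" is a different token from "product" as far as a plain set lookup is concerned.

```python
import re
from collections import Counter


def surviving_counts(text, stopwords):
    """Count tokens that are not in `stopwords`, compared case-insensitively.

    Hypothetical helper; the regex is a simple stand-in for a real tokenizer.
    """
    blocked = {word.lower() for word in stopwords}
    tokens = re.findall(r"[A-Za-z']+", text)
    return Counter(t.lower() for t in tokens if t.lower() not in blocked)


# Made-up sample text containing both the singular and the plural form.
sample = "The product ships today and the products sell well"
counts = surviving_counts(sample, ["product", "and", "the"])
print(counts.most_common())
```

WordCloud also has a `normalize_plurals` option, on by default, that folds a trailing-"s" form into its singular. If the text contains "products", that folding could make it appear as "product" in the image, so adding both forms to the stopwords is a cheap thing to try. This is a possible cause to rule out, not something the post confirms.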