
How do I replace tokens when they occur together?


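I want to run a sentiment analysis on the topic of COVID-19 with Python. The problem is that entries such as "tested positive" are negative statements, yet they receive a positive polarity. My current code is as follows: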

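import nltk
from textblob import TextBlob
from nltk.stem import WordNetLemmatizer

# Note: word_tokenize and WordNetLemmatizer need NLTK data to be installed
# (for example the "punkt" and "wordnet" resources, fetched with nltk.download).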

# Setting the test string
test_string = "He was tested positive on Covid-19"

tokens = nltk.word_tokenize(test_string)

# Lemmatizer
wordnet_lemmatizer = WordNetLemmatizer()

tokens_lem_list = []
for word in tokens:
    lem_tokens = wordnet_lemmatizer.lemmatize(word,pos="v")
    tokens_lem_list.append(lem_tokens)

# List to string
tokens_lem_str = ' '.join(tokens_lem_list)

# Print the polarity of the string
print(TextBlob(tokens_lem_str).sentiment.polarity)

This produces the following output:

0.22727272727272727

Process finished with exit code 0

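So whenever the tokens "test" and "positive" occur together, I would like to remove them and put the word "ill" in their place. Should I use a loop for this? Or would that just consume a lot of computing power on a large text?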

Thanks a lot for your help!

Solution

I solved the problem as follows:

# Loop over the sentences and rewrite those that mention a positive or negative test result.
# sentences_list_lem is assumed to be a list of lemmatized token lists,
# e.g. [["He", "be", "test", "positive", "on", "Covid-19"], ...]
matches_positive = ["test", "positive"]
matches_negative = ["test", "negative"]

replaced_testing_term_sentence = []
for sentence_lem in sentences_list_lem:
    # "test" and "positive" in the same sentence: replace "positive" with "not healthy"
    if all(x in sentence_lem for x in matches_positive):
        sentence_lem = [word.replace("positive", "not healthy") for word in sentence_lem]
        sentence_lem.remove("test")
    # "test" and "negative" in the same sentence: replace "negative" with "not ill"
    elif all(x in sentence_lem for x in matches_negative):
        sentence_lem = [word.replace("negative", "not ill") for word in sentence_lem]
        sentence_lem.remove("test")
    # Sentences that match neither pattern are kept unchanged
    replaced_testing_term_sentence.append(sentence_lem)

This does the job. The replacement terms were chosen deliberately. If anyone sees potential for optimization, I would appreciate it.
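One possible tidy-up, offered only as a sketch: the two branches above differ only in the term they replace, so they can share a single lookup table. The helper name `replace_test_terms`, the `REPLACEMENTS` table, and the sample sentences are hypothetical and not part of the original post. The sketch also assumes exact token matches, whereas the code above uses `str.replace`, which would also rewrite substrings inside longer tokens.

```python
# Hypothetical helper; names and sample data are illustrative assumptions.
REPLACEMENTS = {"positive": "not healthy", "negative": "not ill"}


def replace_test_terms(sentences):
    """Return new token lists with test-result terms rewritten.

    A sentence is rewritten only if it contains the token "test" together
    with "positive" or "negative"; other sentences are copied unchanged.
    """
    result = []
    for tokens in sentences:
        token_set = set(tokens)  # constant-time membership checks
        # First result term present in the sentence ("positive" takes priority)
        term = next((t for t in REPLACEMENTS if t in token_set), None)
        if "test" not in token_set or term is None:
            result.append(list(tokens))
            continue
        rewritten = []
        test_removed = False
        for token in tokens:
            if token == "test" and not test_removed:
                test_removed = True  # drop only the first "test", like list.remove
            elif token == term:
                rewritten.append(REPLACEMENTS[term])
            else:
                rewritten.append(token)
        result.append(rewritten)
    return result


# Assumed sample input: already tokenized and lemmatized sentences
sentences_list_lem = [
    ["He", "be", "test", "positive", "on", "Covid-19"],
    ["She", "be", "test", "negative"],
    ["The", "weather", "be", "nice"],
]
replaced = replace_test_terms(sentences_list_lem)
print(replaced)
```

On the question of cost: a loop like this visits each token a small, fixed number of times, so the work grows roughly linearly with the size of the text. Tokenization, lemmatization, and polarity scoring are likely to take more time than the replacement step itself, though that is worth measuring on the actual data.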
