Unable to convert emojis in social media comments to text sentiment

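I collected data from Facebook and Twitter comments on product advertisements and am trying to run sentiment analysis on them. Part of the text cleaning involves converting emojis into text sentiment, so that as much of the sentiment in each comment as possible is captured. I have tried calling emoji.demojize(text) on each row, along with various other approaches from Stack Overflow, but none of them converts the emojis in the comments into words describing the actual sentiment. The code below does not work, and I cannot tell what my mistake is. The code is as follows: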

import io
import json
import re

def handleEmojis(text, keep_emoticons=False):
    # Load the emoji-to-sentiment table once and cache it in a module-level global
    global emoji_sentiment_matching
    if 'emoji_sentiment_matching' not in globals():
        with io.open('emoji.json', 'r', encoding="UTF-8") as outfile:
            emoji_sentiment_matching = json.load(outfile)
    HASHTAG_PATTERN = re.compile(r'#\w*')
    EMOJIS_PATTERN_PLAIN_TEXT = re.compile(r"(?:X|:|;|=)(?:-)?(?:\)|\(|O|D|P|S){1,}", re.IGNORECASE)
    EMOJIS_PATTERN_SYMBOLS = re.compile(u'[\U00002600-\U000027BF]|[\U0001f300-\U0001f64F]|[\U0001f680-\U0001f6FF]')

    if keep_emoticons:
        # Replace emoji with sentiment
        for emoji in emoji_sentiment_matching:
            if emoji["emoji"] in text:

                ## Adding space if text follows right away / is right before the emoticon
                idx = text.find(emoji["emoji"])
                end = idx + len(emoji["emoji"])  # index of the first character after the emoji
                (space1, space2) = ("", "")
                if (idx-1) >= 0 and text[idx-1] != " ":
                    space1 = " "
                if end < len(text) and text[end] != " ":
                    space2 = " "

                ## replace emoticon with its sentiment
                text = text.replace(emoji["emoji"], "{}emoji%%{}{}".format(space1, emoji["subgroup"], space2))

        ## TO IMPLEMENT: Sentiment of other emoticons like :),:-),:-/

    else:
        # Strip emoji symbols and plain-text emoticons instead of translating them
        for r in re.findall(EMOJIS_PATTERN_SYMBOLS, text):
            text = text.replace(r, "")
        for r in re.findall(EMOJIS_PATTERN_PLAIN_TEXT, text):
            text = text.replace(r, "")
    return text.strip()


FB_df['demojified'] = FB_df['Text']
for i in range(len(FB_df)):
    text = FB_df.loc[i, "demojified"]
    handleEmojis(text, keep_emoticons=False)

print(FB_df)

This is the resulting output (see the "demojified" column): [image: dataframe outputs]

I also tried the following code:

import re
from emot.emo_unicode import UNICODE_EMO, EMOTICONS
from emoji import demojize

def convert_emojis(text):
    for emot in UNICODE_EMO:
        text = re.sub(r'('+emot+')', "_".join(UNICODE_EMO[emot].replace(",","").replace(":","").split()), text)
    return text

# Converting emoticons to words
def convert_emoticons(text):
    for emot in EMOTICONS:
        text = re.sub(u'('+emot+')', "_".join(EMOTICONS[emot].replace(",","").split()), text)
    return text

FB_df['demojified'] = FB_df['Text']

for row in FB_df['demojified']:
    for text in row:
        text = text
        convert_emojis(text)

FB_df.loc[:, 'demojified']

Still no joy. I have been at this for a week. Any guidance would be greatly appreciated.

I have also tried:

import re
import emoji
from emot.emo_unicode import UNICODE_EMO, EMOTICONS
# ... convert_emojis and convert_emoticons as defined above ...
FB_df['demojified'] = FB_df['Text']
for row in FB_df['demojified']:
    for text in row:
        text = str(text)
        text = emoji.demojize(text)

Still no joy :-(

Solution

Found the problem. I forgot to update the demojified column with the output from the for loop:

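import emoji

FB_df['Demojified'] = FB_df['Comments']
for i in range(len(FB_df.Demojified)):
    text = FB_df.loc[i, "Demojified"]
    text = emoji.demojize(text)        # replace each emoji with its :name: code
    text = text.replace(":", " ")      # drop the colon delimiters
    text = ' '.join(text.split())      # collapse repeated whitespace
    FB_df.loc[i, "Demojified"] = text  # write the result back to the column

FB_df = FB_df[['Title', 'Comments', 'Demojified']]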
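
The root cause can also be seen without pandas. The sketch below is only an illustration under stated assumptions: describe_symbols is a hypothetical helper that uses the standard library's unicodedata module as a rough stand-in for emoji.demojize, and the two sample comments are made up. It shows that calling a conversion function and discarding its return value leaves the data untouched, while storing the result updates it.

```python
import unicodedata

def describe_symbols(text):
    """Replace each symbol-category character (most emoji) with its Unicode name.

    Hypothetical stand-in for emoji.demojize; it does not handle
    multi-code-point emoji sequences.
    """
    parts = []
    for ch in text:
        # Category "So" (Symbol, other) covers most single-code-point emoji.
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            parts.append(" " + name.lower() + " " if name else " ")
        else:
            parts.append(ch)
    # Collapse the extra whitespace introduced around the names.
    return " ".join("".join(parts).split())

# Made-up sample comments ending in U+1F600 and U+1F620.
comments = ["Love it \U0001F600", "Not for me \U0001F620"]

# Strings are immutable: calling the function without keeping
# the result changes nothing in `comments`.
for c in comments:
    describe_symbols(c)

# Storing the returned value is what actually produces cleaned data.
cleaned = [describe_symbols(c) for c in comments]
print(cleaned)
```

With a DataFrame, the equivalent of storing the result would likely be a single assignment such as FB_df['Demojified'] = FB_df['Comments'].apply(some_function), which avoids the index-based loop altogether.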
