Searching for words in a CSV based on a dictionary

I have a dictionary CSV, for example:

fruit   vegetable   meat
banana  broccoli    beef
apple   carrot      chicken
orange  corn        pork
mango   NaN         NaN
coconut NaN         NaN

and another CSV, for example:

sentences
Today I ate some beef.
Corn is tasty.
I drank some coconut water.

I am trying to match strings in the sentences CSV against the strings in the dictionary in order to classify them:

sentences                   food  
Today I ate some beef.      meat 
Corn is tasty.              vegetable
I drank some coconut water. fruit

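What do I need to do to produce that output? Should I remove the NaN values to make this work, or can they be ignored?

Solution

There are several ways to do what you want. Here is one using nested for loops. I'm fairly sure you could also do it recursively, or even with a list comprehension.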

# Words per class, taken from the dictionary CSV above; NaN cells are simply left out.
classes = {
    "fruit": ["banana", "apple", "orange", "mango", "coconut"],
    "vegetable": ["broccoli", "carrot", "corn"],
    "meat": ["beef", "chicken", "pork"],
}

# Here is your list of sentences.
sentences = ["Today I ate some beef.", "Corn is tasty.", "I drank some coconut water."]
output = []
for sentence in sentences:  # for each sentence
    for label, foods in classes.items():  # check every class
        for food in foods:  # check every food in that class
            if food in sentence:  # case-sensitive substring test
                output.append((sentence, label))  # pair the sentence with the class name
               

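This only matches when the string appears exactly as written (case matters), so "Corn is tasty." will not match "corn" unless you lowercase the sentence first. Substring matching can also assign the wrong class, for example when the dictionary contains both "straw" and "strawberry".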
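
Both caveats (case sensitivity and the "straw"/"strawberry" problem) can be reduced by matching whole words with a regular expression instead of a bare substring test. What follows is a minimal sketch of that idea and not part of the original answer; the `patterns` dict and the `classify` helper are names made up for illustration, and the word lists are copied from the example dictionary.

```python
import re

# Category lists taken from the example dictionary in the question.
classes = {
    "fruit": ["banana", "apple", "orange", "mango", "coconut"],
    "vegetable": ["broccoli", "carrot", "corn"],
    "meat": ["beef", "chicken", "pork"],
}

# One compiled pattern per category. \b anchors the match at word boundaries,
# and re.IGNORECASE makes "Corn" match "corn".
patterns = {
    label: re.compile(
        r"\b(?:" + "|".join(map(re.escape, foods)) + r")\b", re.IGNORECASE
    )
    for label, foods in classes.items()
}

def classify(sentence):
    """Return the labels of every category with a whole-word match in sentence."""
    return [label for label, pattern in patterns.items() if pattern.search(sentence)]

sentences = ["Today I ate some beef.", "Corn is tasty.", "I drank some coconut water."]
output = [(sentence, classify(sentence)) for sentence in sentences]
```

Returning a list of labels keeps sentences that mention several categories, or none, from being silently dropped. The trade-off of whole-word matching is that inflected forms such as "apples" no longer match "apple", so plurals would need to be added to the dictionary or handled separately.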

Also take a look at this link on how to "read" your CSV into lists.
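
As a rough illustration of that reading step, here is a sketch that uses Python's standard `csv` module. It assumes the dictionary file is comma-separated with a header row and that missing cells are either empty or the literal text `NaN`; the `load_classes` helper and the inline sample data are hypothetical stand-ins for your real file.

```python
import csv
import io

# Stand-in for the dictionary file (assumed comma-separated).
# With a real file, pass open(path, newline="") instead of this StringIO.
dictionary_csv = io.StringIO(
    "fruit,vegetable,meat\n"
    "banana,broccoli,beef\n"
    "apple,carrot,chicken\n"
    "orange,corn,pork\n"
    "mango,NaN,NaN\n"
    "coconut,,\n"
)

def load_classes(handle):
    """Build {column name: [words]} from a CSV, skipping empty and NaN cells."""
    classes = {}
    for row in csv.DictReader(handle):
        for label, word in row.items():
            word = (word or "").strip()
            if word and word.lower() != "nan":
                classes.setdefault(label, []).append(word)
    return classes

classes = load_classes(dictionary_csv)
```

Because the NaN and empty cells are dropped while loading, they never reach the matching loop, which answers the question of whether they need to be removed beforehand.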
