How do I write a function that accepts a parameter with multiple values, e.g. rows from a CSV?
So I have two functions. The first takes a string argument and converts it into spaCy tokens.
def preprocess(texts):
    case = truecase.get_true_case(texts)
    doc = nlp(case)
    return doc

def summarize_texts(texts):
    doc = preprocess(texts)  # another function that took text and processed it as a spacy doc
    actions = {}
    entities = {}
    for token in doc:
        if token.pos_ == "VERB":
            actions[token.lemma_] = actions.get(token.text, 0) + 1
    for token in doc.ents:
        entities[token.label_] = [token.text]
    return {
        'actions': actions, 'entities': entities
    }
summarize_texts("Play it again, Sam")

output: {'actions': {'play': 1}, 'entities': {'PERSON': ['Sam']}}
The problem I'm running into is that my function only works with a single string argument; if I give it an argument containing a list of sentences, the function fails:

["Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"]
I'm not sure how to make it work the way I want. For example, if I call

summarize_texts(["Play it again, Sam", "Play something by Billie Holiday"])

the output should be:

{'actions': {'play': 2}, 'entities': {'PERSON': ['Sam', 'Billie']}}
But if I run

docs = [
    "Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"
]
summarize_texts(docs)

the output is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-46-200347d5cac5> in <module>()
4 "Play it again,Sam"
5 ]
----> 6 summarize_texts(docs)
5 frames
/usr/local/lib/python3.6/dist-packages/nltk/tokenize/casual.py in _replace_html_entities(text,keep,remove_illegal,encoding)
257 return "" if remove_illegal else match.group(0)
258
--> 259 return ENT_RE.sub(_convert_entity,_str_to_unicode(text,encoding))
260
261
TypeError: expected string or bytes-like object
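The traceback shows the error originating inside nltk's tokenizer (which truecase uses internally): it runs a regex substitution over its input, and `re.sub` only accepts a single string, not a list. A minimal reproduction of the same error, independent of nltk:

```python
import re

# re.sub expects a string (or bytes); passing a list of strings raises
# the same TypeError seen in the traceback above.
try:
    re.sub(r"\s+", " ", ["Play it again, Sam"])
except TypeError as e:
    print(e)  # prints something like "expected string or bytes-like object"
```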
Solution
You can check the type of the input. Here I check whether it is a str or a list; if it is a str, I turn it into a list containing just that one sentence. The output will then be a list of results. [Optionally, if there was only one input, you can return just that single result:

return result[0] if len(result) == 1 else result
]
def preprocess(texts):
    case = truecase.get_true_case(texts)
    doc = nlp(case)
    return doc

def summarize_texts(texts):
    if isinstance(texts, str):  # a single sentence: wrap it in a one-element list
        texts = [texts]
    result = []
    for text in texts:
        doc = preprocess(text)  # another function that took text and processed it as a spacy doc
        actions = {}
        entities = {}
        for token in doc:
            if token.pos_ == "VERB":
                # look the count up by lemma (the key we store under), not by the raw text
                actions[token.lemma_] = actions.get(token.lemma_, 0) + 1
        for ent in doc.ents:
            # append, so that several entities with the same label are all kept
            entities.setdefault(ent.label_, []).append(ent.text)
        result.append({
            'actions': actions, 'entities': entities
        })
    return result

print(summarize_texts("Play it again, Sam"))
print(summarize_texts(["Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"]))
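This returns one summary per sentence. If you instead want a single combined summary over all sentences, like the merged output shown in the question, you can fold the per-sentence results together afterwards. A minimal sketch (pure Python, no spaCy needed; `merge_summaries` is a hypothetical helper name):

```python
from collections import Counter, defaultdict

def merge_summaries(summaries):
    """Merge per-sentence summaries (as returned by summarize_texts,
    i.e. {'actions': {...counts...}, 'entities': {label: [texts]}})
    into one combined summary."""
    actions = Counter()
    entities = defaultdict(list)
    for s in summaries:
        actions.update(s['actions'])          # add up verb counts
        for label, texts in s['entities'].items():
            entities[label].extend(texts)     # concatenate entity lists per label
    return {'actions': dict(actions), 'entities': dict(entities)}

summaries = [
    {'actions': {'play': 1}, 'entities': {'PERSON': ['Sam']}},
    {'actions': {'play': 1}, 'entities': {'PERSON': ['Billie Holiday']}},
    {'actions': {'set': 1}, 'entities': {'TIME': ['five minutes']}},
]
print(merge_summaries(summaries))
# {'actions': {'play': 2, 'set': 1}, 'entities': {'PERSON': ['Sam', 'Billie Holiday'], 'TIME': ['five minutes']}}
```

So `merge_summaries(summarize_texts(docs))` would give the aggregated shape the question asks for.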