How do I write a function that accepts a parameter with multiple values, e.g. rows from a CSV?
So I have two functions. The first takes a string argument and converts it into spaCy tokens.
def preprocess(texts):
    case = truecase.get_true_case(texts)
    doc = nlp(case)
    return doc

def summarize_texts(texts):
    doc = preprocess(texts)  # another function that took text and processed it as a spacy doc
    actions = {}
    entities = {}
    for token in doc:
        if token.pos_ == "VERB":
            actions[token.lemma_] = actions.get(token.text, 0) + 1
    for token in doc.ents:
        entities[token.label_] = [token.text]
    return {
        'actions': actions, 'entities': entities
    }
summarize_texts("Play it again, Sam")

output: {'actions': {'play': 1}, 'entities': {'PERSON': ['Sam']}}
The problem I'm running into is that my function only works with a single string argument; if I give it an argument containing a list of sentences, the function fails:

["Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"]
I'm not sure how to make it work the way I want. For example, if I call

summarize_texts(["Play it again, Sam", "Play something by Billie Holiday"])

the output should be:

{'actions': {'play': 2}, 'entities': {'PERSON': ['Sam', 'Billie']}}
But if I run

docs = [
    "Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"
]
summarize_texts(docs)

the output is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-46-200347d5cac5> in <module>()
4 "Play it again,Sam"
5 ]
----> 6 summarize_texts(docs)
5 frames
/usr/local/lib/python3.6/dist-packages/nltk/tokenize/casual.py in _replace_html_entities(text,keep,remove_illegal,encoding)
257 return "" if remove_illegal else match.group(0)
258
--> 259 return ENT_RE.sub(_convert_entity,_str_to_unicode(text,encoding))
260
261
TypeError: expected string or bytes-like object
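The traceback shows the error originating inside nltk's tokenizer (which truecase uses internally): it runs a regex substitution over its input, and `re.sub` only accepts a single string, not a list. A minimal reproduction of the same error, independent of nltk:

```python
import re

# re.sub expects a string (or bytes); passing a list of strings raises
# the same TypeError seen in the traceback above.
try:
    re.sub(r"\s+", " ", ["Play it again, Sam"])
except TypeError as e:
    print(e)  # prints something like "expected string or bytes-like object"
```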
Solution
You can check the type of the input. Here I check whether it is a str or a list; if it is a str, I turn it into a list containing just that one sentence. The output will then be a list of results. [Optionally, if there was only one input, you can return just that single result:

return result[0] if len(result) == 1 else result
]
def preprocess(texts):
    case = truecase.get_true_case(texts)
    doc = nlp(case)
    return doc

def summarize_texts(texts):
    if isinstance(texts, str):  # a single sentence: wrap it in a one-element list
        texts = [texts]
    result = []
    for text in texts:
        doc = preprocess(text)  # another function that took text and processed it as a spacy doc
        actions = {}
        entities = {}
        for token in doc:
            if token.pos_ == "VERB":
                # look the count up by lemma (the key we store under), not by the raw text
                actions[token.lemma_] = actions.get(token.lemma_, 0) + 1
        for ent in doc.ents:
            # append, so that several entities with the same label are all kept
            entities.setdefault(ent.label_, []).append(ent.text)
        result.append({
            'actions': actions, 'entities': entities
        })
    return result

print(summarize_texts("Play it again, Sam"))
print(summarize_texts(["Play something by Billie Holiday", "Set a timer for five minutes", "Play it again, Sam"]))
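This returns one summary per sentence. If you instead want a single combined summary over all sentences, like the merged output shown in the question, you can fold the per-sentence results together afterwards. A minimal sketch (pure Python, no spaCy needed; `merge_summaries` is a hypothetical helper name):

```python
from collections import Counter, defaultdict

def merge_summaries(summaries):
    """Merge per-sentence summaries (as returned by summarize_texts,
    i.e. {'actions': {...counts...}, 'entities': {label: [texts]}})
    into one combined summary."""
    actions = Counter()
    entities = defaultdict(list)
    for s in summaries:
        actions.update(s['actions'])          # add up verb counts
        for label, texts in s['entities'].items():
            entities[label].extend(texts)     # concatenate entity lists per label
    return {'actions': dict(actions), 'entities': dict(entities)}

summaries = [
    {'actions': {'play': 1}, 'entities': {'PERSON': ['Sam']}},
    {'actions': {'play': 1}, 'entities': {'PERSON': ['Billie Holiday']}},
    {'actions': {'set': 1}, 'entities': {'TIME': ['five minutes']}},
]
print(merge_summaries(summaries))
# {'actions': {'play': 2, 'set': 1}, 'entities': {'PERSON': ['Sam', 'Billie Holiday'], 'TIME': ['five minutes']}}
```

So `merge_summaries(summarize_texts(docs))` would give the aggregated shape the question asks for.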