Unable to read all abstract text from a PubMed XML file

I downloaded a PubMed XML file and want to print the abstract of every article in it. This is my code:

import xml.etree.ElementTree as ET
tree = ET.parse('test1.xml')
root = tree.getroot()
for abs_1 in root.findall("PubmedArticle/MedlineCitation/Article/Abstract"):
    abs_2 = abs_1.find('AbstractText').text
    print(abs_2)

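However, I only get the objective part of the abstract, the one tagged <AbstractText Label="aim" NlmCategory="OBJECTIVE">. I do not get the other two parts, which are also inside <Abstract>.

For example, the XML contains something like this: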

<Abstract>
<AbstractText Label="aim" NlmCategory="OBJECTIVE">The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.</AbstractText>
<AbstractText Label="METHODS AND RESULT" NlmCategory="RESULTS">Embase, PubMed, and Google Scholar were searched until march 2020 for relevant papers. Sixteen English papers were enrolled to answer questions about procedures that are allowed to perform during the COVID-19 outbreak, patients who are in priority to receive dental care services, the conditions and necessities for patient admission, waiting room and operatory room, and personal protective equipment (PPE) that is necessary for dental clinicians and the office staff.</AbstractText>
<AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">Dental treatment should be limited to patients with urgent or emergency situation. By screening questionnaires for COVID-19, patients are divided into three groups of (a) apparently healthy, (b) suspected for COVID-19, and (c) confirmed for COVID-19. Separate waiting and operating rooms should be assigned to each group of patients to minimize the risk of disease transmission. All groups should be treated with the same protective measures with regard to PPE for the dental clinicians and staff.</AbstractText>
<CopyrightInformation>© 2020 Special Care Dentistry Association and Wiley Periodicals, Inc.</CopyrightInformation>
</Abstract>

With my code I only get:

The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.

I would really appreciate some help with printing all the AbstractText elements in an abstract.

Solution

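If you can .findall() the <Abstract> elements, isn't it logical that you can .findall() the <AbstractText> elements in the same way? In your code, .find('AbstractText') returns only the first matching child of each <Abstract>, which is why the other two sections never show up.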

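import xml.etree.ElementTree as ET

tree = ET.parse('test1.xml')
root = tree.getroot()

# Extend the path down to AbstractText so findall() returns every section,
# not just the first one under each Abstract.
for abstract_text in root.findall("PubmedArticle/MedlineCitation/Article/Abstract/AbstractText"):
    print(abstract_text.text)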
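
One thing to watch for: .text only returns the text that comes before the first child element, so a section containing inline markup such as <i> or <sub> would come out truncated. The sketch below is one possible way to handle that, and to keep the sections grouped per article with their Label. The sample XML, the PMID value, and the function name abstracts_by_article are made up for illustration and are not from the question; "".join(node.itertext()) is used here instead of .text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-in for a PubMed export (not the asker's file).
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <Abstract>
          <AbstractText Label="AIM">First <i>part</i> of the abstract.</AbstractText>
          <AbstractText Label="CONCLUSION">Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def abstracts_by_article(root):
    """Return a list of (pmid, [section, ...]) tuples, one per PubmedArticle."""
    results = []
    for article in root.findall("PubmedArticle"):
        sections = []
        for node in article.findall("MedlineCitation/Article/Abstract/AbstractText"):
            # itertext() walks the element and its children, so text inside
            # inline tags such as <i> is kept; .text alone would stop at "First ".
            text = "".join(node.itertext()).strip()
            label = node.get("Label")  # None when the abstract is unstructured
            sections.append(f"{label}: {text}" if label else text)
        results.append((article.findtext("MedlineCitation/PMID"), sections))
    return results


root = ET.fromstring(SAMPLE)
for pmid, sections in abstracts_by_article(root):
    print(pmid)
    for section in sections:
        print(section)
```

For a real file, replace ET.fromstring(SAMPLE) with ET.parse('test1.xml').getroot(). Articles without an <Abstract> simply yield an empty section list.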
