Python: How do I parse this nested XML file?
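I am trying to parse this nested XML file into a DataFrame. I tried the xmltodict library, but I could only extract individual elements: the XML is nested and contains multiple elements and measurements, so my attempt to loop over it did not work. Here is a sample of the XML: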
<?xml version='1.0' encoding='UTF-8'?>
<response xmlns="http://www...">
<sensor-time timezone="America/New_York">2020-08-10T12:19:26-04:00</sensor-time>
<status>
<code>OK</code>
</status>
<content>
<elements>
<element>
<element-id>0</element-id>
<element-name>Line 0</element-name>
<sensor-type>SINGLE_SENSOR</sensor-type>
<data-type>LINE</data-type>
<from>2020-08-10T10:00:00-04:00</from>
<to>2020-08-10T12:00:00-04:00</to>
<resolution>FIVE_MINUTES</resolution>
<measurements>
<measurement>
<from>2020-08-10T10:00:00-04:00</from>
<to>2020-08-10T10:05:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<measurement>
<from>2020-08-10T10:05:00-04:00</from>
<to>2020-08-10T10:10:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<measurement>
<from>2020-08-10T10:10:00-04:00</from>
<to>2020-08-10T10:15:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<!-- remaining measurements omitted -->
</measurements>
</element>
<element>
<element-id>1</element-id>
<element-name>GP Test CL.01</element-name>
<sensor-type>SINGLE_SENSOR</sensor-type>
<data-type>LINE</data-type>
<from>2020-08-10T10:00:00-04:00</from>
<to>2020-08-10T12:00:00-04:00</to>
<resolution>FIVE_MINUTES</resolution>
<measurements>
<measurement>
<from>2020-08-10T10:00:00-04:00</from>
<to>2020-08-10T10:05:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<measurement>
<from>2020-08-10T10:05:00-04:00</from>
<to>2020-08-10T10:10:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<measurement>
<from>2020-08-10T10:10:00-04:00</from>
<to>2020-08-10T10:15:00-04:00</to>
<values>
<value label="fw">0</value>
<value label="bw">0</value>
</values>
</measurement>
<!-- remaining measurements omitted -->
</measurements>
</element>
</elements>
</content>
<sensor-info>
<serial-number>D1:82:34:5Z:3Q:3D</serial-number>
<ip-address>000.000.00.0</ip-address>
<name>Demo</name>
<group>Test Devices</group>
<device-type>PC2</device-type>
</sensor-info>
</response>
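The error I get, raised at the start of the `for m in element` loop, is "TypeError: string indices must be integers".
Any ideas on how to get this working?
Solution
To get you at least close to where I think you are going, I would do something along these lines (obviously, you can modify it as needed):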
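import pandas as pd
from lxml import etree

# Assumption: a complete, well-formed copy of the response is saved as response.xml (hypothetical file name)
doc = etree.parse('response.xml').getroot()
# The document declares a default namespace, so bind it to a prefix for XPath
ns = {'r': doc.nsmap[None]}

def first(node, path):
    # Return the first XPath match, or None if nothing matches
    found = node.xpath(path, namespaces=ns)
    return found[0] if found else None

cols = ['ip', 'sensor name', 'device', 'group', 'elem id', 'elem name', 'to', 'from', 'fw', 'bw']
# Sensor-level fields are the same for every row, so look them up once
info = [first(doc, f'//r:sensor-info/r:{tag}/text()') for tag in ('ip-address', 'name', 'device-type', 'group')]
rows = []
for element in doc.xpath('//r:elements/r:element', namespaces=ns):
    # Element-level fields, plus the fw/bw values of the element's first measurement
    paths = ['./r:element-id', './r:element-name', './r:to', './r:from',
             './/r:value[@label="fw"]', './/r:value[@label="bw"]']
    rows.append(info + [first(element, p + '/text()') for p in paths])
df = pd.DataFrame(rows, columns=cols)
print(df)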
Output (based on the sample XML in the question):
             ip  sensor name  device         group  elem id      elem name                         to                       from  fw  bw
0  000.000.00.0         Demo     PC2  Test Devices        0         Line 0  2020-08-10T12:00:00-04:00  2020-08-10T10:00:00-04:00   0   0
1  000.000.00.0         Demo     PC2  Test Devices        1  GP Test CL.01  2020-08-10T12:00:00-04:00  2020-08-10T10:00:00-04:00   0   0
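The output above has one row per element. If what you are after is one row per measurement instead, a minimal sketch could look like the following. It is not part of the original answer. It uses only the standard library's xml.etree.ElementTree, and the `{*}` namespace wildcard it relies on needs Python 3.8 or later. The function name `flatten`, the trimmed-down `SAMPLE` string, and the namespace URI `http://example.com/ns` are placeholders made up for illustration (the real URI is elided in the question).

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the response; the namespace URI is a made-up placeholder.
SAMPLE = """<response xmlns="http://example.com/ns">
  <content><elements>
    <element>
      <element-id>0</element-id>
      <element-name>Line 0</element-name>
      <measurements>
        <measurement>
          <from>2020-08-10T10:00:00-04:00</from>
          <to>2020-08-10T10:05:00-04:00</to>
          <values><value label="fw">0</value><value label="bw">0</value></values>
        </measurement>
        <measurement>
          <from>2020-08-10T10:05:00-04:00</from>
          <to>2020-08-10T10:10:00-04:00</to>
          <values><value label="fw">0</value><value label="bw">0</value></values>
        </measurement>
      </measurements>
    </element>
    <element>
      <element-id>1</element-id>
      <element-name>GP Test CL.01</element-name>
      <measurements>
        <measurement>
          <from>2020-08-10T10:00:00-04:00</from>
          <to>2020-08-10T10:05:00-04:00</to>
          <values><value label="fw">0</value><value label="bw">0</value></values>
        </measurement>
      </measurements>
    </element>
  </elements></content>
  <sensor-info><ip-address>000.000.00.0</ip-address><name>Demo</name></sensor-info>
</response>"""


def flatten(xml_text):
    """Return one dict per <measurement>, repeating the sensor and element fields."""
    root = ET.fromstring(xml_text)
    # "{*}" matches a tag in any namespace, so the real URI does not need to be known.
    sensor = {
        "ip": root.findtext(".//{*}sensor-info/{*}ip-address"),
        "sensor name": root.findtext(".//{*}sensor-info/{*}name"),
    }
    rows = []
    for element in root.iterfind(".//{*}elements/{*}element"):
        elem = {
            "elem id": element.findtext("{*}element-id"),
            "elem name": element.findtext("{*}element-name"),
        }
        for m in element.iterfind("{*}measurements/{*}measurement"):
            row = {**sensor, **elem,
                   "from": m.findtext("{*}from"),
                   "to": m.findtext("{*}to")}
            # Each <value> becomes a column named after its label attribute.
            for value in m.iterfind("{*}values/{*}value"):
                row[value.get("label")] = value.text
            rows.append(row)
    return rows


rows = flatten(SAMPLE)
print(len(rows))
```

Passing the result to pd.DataFrame(rows) should then give a frame with one row per measurement, with the dict keys as column names. All values stay strings here; converting the timestamps and counts is left to pandas.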