
From OrderedDict

How to solve: From OrderedDict

I have an OrderedDict like this:

# The brackets were unbalanced as posted. The third 'id' entry, the 'folders'
# list, and the last 'folderTree' key are assumed to follow the sibling entries.
OrderedDict([
    ('searchedFolder', {'id': '1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', 'name': 'Test', 'mimeType': 'application/vnd.google-apps.folder'}),
    ('folderTree', OrderedDict([
        ('id', [['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK'],
                ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', '1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P'],
                ['1jyIXgH7hCOcqdb0ouNsR9EYwsrrjgPC3']]),
        ('names', ['Test', 'Test1', 'Test2']),
        ('folders', ['1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P', '1jyIXgH7hCOcqdb0ouNsR9EYwsrrjgPC3']),
    ])),
    ('fileList', [
        {'files': [{'id': '1I0vsHBo8GyWb1Jr30hQflTTZ3eIXpm8x', 'name': 'test1.xlsx'}],
         'folderTree': ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK']},
        {'files': [{'id': '1TEBzg_EH9iG9A3i6oN18ZSElUE1EhwxY', 'name': 'test2.xlsx'}],
         'folderTree': ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', '1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P']},
        {'files': [{'id': '1jJwFxbKRYRYn4vRzNf62LYL27EfAHSvq', 'name': 'test3.xlsx'},
                   {'id': '10ReTrPWGr_inWjj_eahFtBmIYtjthw2s', 'name': 'test4.xlsx'}],
         'folderTree': ['1jyIXgH7hCOcqdb0ouNsR9EYwsrrjgPC3']},
    ]),
    ('totalNumberOfFolders', 3),
    ('totalNumberOfFiles', 4),
])

I want to create a DataFrame containing the file names and IDs, like this:

              id                                      name
0     1I0vsHBo8GyWb1Jr30hQflTTZ3eIXpm8x             test1.xlsx
1     1TEBzg_EH9iG9A3i6oN18ZSElUE1EhwxY             test2.xlsx
2     1jJwFxbKRYRYn4vRzNf62LYL27EfAHSvq             test3.xlsx   
3     10ReTrPWGr_inWjj_eahFtBmIYtjthw2s             test4.xlsx

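The file names are arbitrary and only for testing; I also have other file types besides Excel (.png, .jpg, .doc, etc.).

First I tried to create a DataFrame from the whole structure and then extract these values, using either

df = pd.DataFrame(Ordereddict)

or

df = pd.DataFrame.from_dict(Ordereddict)

but I get this error: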

ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

Solution
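
The error most likely comes from the shape of the top level: some values are dicts, one is a list, and others are scalars, so pandas has no single way to line them up as columns. A minimal sketch reproducing this, assuming a hypothetical miniature `mixed` dict that only mimics the structure in the question:

```python
import pandas as pd

# Hypothetical miniature of the structure in the question: the top-level
# values mix a dict, a list, and a scalar.
mixed = {
    "searchedFolder": {"id": "folder-1", "name": "Test"},
    "fileList": [{"files": []}, {"files": []}],
    "totalNumberOfFiles": 0,
}

try:
    pd.DataFrame(mixed)
    error = None
except ValueError as exc:
    # pandas cannot tell whether the dict keys or the list positions
    # should define the row index, so it refuses to guess.
    error = exc

print(type(error).__name__, error)
```

This is why the approaches that follow select `ord_dict['fileList']` first instead of passing the whole OrderedDict to the constructor.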

One way:

df = pd.json_normalize(ord_dict['fileList'], record_path=['files'])

Or:

df = pd.DataFrame(ord_dict['fileList'])['files'].explode().apply(pd.Series)

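Output (the index shown is from the explode variant, which keeps the parent row label, so 2 repeats; json_normalize numbers the rows 0 to 3):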

                                  id        name
0  1I0vsHBo8GyWb1Jr30hQflTTZ3eIXpm8x  test1.xlsx
1  1TEBzg_EH9iG9A3i6oN18ZSElUE1EhwxY  test2.xlsx
2  1jJwFxbKRYRYn4vRzNf62LYL27EfAHSvq  test3.xlsx
2  10ReTrPWGr_inWjj_eahFtBmIYtjthw2s  test4.xlsx
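
To compare the two approaches side by side, here is a small sketch. The `file_list` data and its shortened IDs are made up for illustration, and the `reset_index(drop=True)` call is an addition that is not part of the original answer; it only renumbers the rows of the explode variant.

```python
import pandas as pd

# Hypothetical stand-in for ord_dict['fileList'] with shortened IDs.
file_list = [
    {"files": [{"id": "id-1", "name": "test1.xlsx"}], "folderTree": ["root"]},
    {"files": [{"id": "id-2", "name": "test2.png"}], "folderTree": ["root", "sub1"]},
    {"files": [{"id": "id-3", "name": "test3.doc"},
               {"id": "id-4", "name": "test4.jpg"}], "folderTree": ["sub2"]},
]

# Approach 1: one row per record under 'files', with a fresh 0..n-1 index.
df_norm = pd.json_normalize(file_list, record_path=["files"])

# Approach 2: explode() repeats the parent row label for each file, so
# reset_index(drop=True) is added here to renumber the rows.
df_expl = (
    pd.DataFrame(file_list)["files"]
    .explode()
    .apply(pd.Series)
    .reset_index(drop=True)
)

print(df_norm)
print(df_expl)
```

Both frames should hold the same `id` and `name` values. `json_normalize` is the shorter call, while the explode chain may be handy when the intermediate DataFrame is needed for something else as well.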

Complete code:

from collections import OrderedDict

import pandas as pd

# Sample data from the question. The literal had unbalanced brackets as posted;
# the third 'id' entry, the 'folders' list, and the last 'folderTree' key are
# reconstructed on the assumption that they follow the sibling entries.
ord_dict = OrderedDict([
    ('searchedFolder', {'id': '1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', 'name': 'Test', 'mimeType': 'application/vnd.google-apps.folder'}),
    ('folderTree', OrderedDict([
        ('id', [['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK'],
                ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', '1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P'],
                ['1jyIXgH7hCOcqdb0ouNsR9EYWsRrjgPC3']]),
        ('names', ['Test', 'Test1', 'Test2']),
        ('folders', ['1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P', '1jyIXgH7hCOcqdb0ouNsR9EYWsRrjgPC3']),
    ])),
    ('fileList', [
        {'files': [{'id': '1I0vsHBo8GyWb1Jr30hQflTTZ3eIXpm8x', 'name': 'test1.xlsx'}],
         'folderTree': ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK']},
        {'files': [{'id': '1TEBzg_EH9iG9A3i6oN18ZSElUE1EhwxY', 'name': 'test2.xlsx'}],
         'folderTree': ['1uTjm6QEx7No09bgTX984lxmwMSfv2sYK', '1bfMsEMU7zyILW6sLsTkZhjLLrogcWK8P']},
        {'files': [{'id': '1jJwFxbKRYRYn4vRzNf62LYL27EfAHSvq', 'name': 'test3.xlsx'},
                   {'id': '10ReTrPWGr_inWjj_eahFtBmIYtjthw2s', 'name': 'test4.xlsx'}],
         'folderTree': ['1jyIXgH7hCOcqdb0ouNsR9EYWsRrjgPC3']},
    ]),
    ('totalNumberOfFolders', 3),
    ('totalNumberOfFiles', 4),
])

# Flatten every 'files' list under 'fileList' into one row per file.
df = pd.json_normalize(ord_dict['fileList'], record_path=['files'])
print(df)
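
If some entries might lack a 'files' key, or the file records carry extra fields, a plain list comprehension is another option. This is only a sketch under those assumptions: the helper name `files_to_frame` and the `sample` data are hypothetical and do not come from the original answer.

```python
import pandas as pd


def files_to_frame(file_list):
    """Collect every record found under 'files' into one DataFrame."""
    # .get('files', []) is a defensive assumption: entries without a
    # 'files' key are skipped instead of raising KeyError.
    records = [f for entry in file_list for f in entry.get("files", [])]
    # columns=... keeps only 'id' and 'name' and drops any extra keys.
    return pd.DataFrame(records, columns=["id", "name"])


# Hypothetical sample: one entry has an extra key, one has no 'files' at all.
sample = [
    {"files": [{"id": "id-1", "name": "a.png", "size": 10}]},
    {"folderTree": ["root"]},
    {"files": [{"id": "id-2", "name": "b.doc"}, {"id": "id-3", "name": "c.jpg"}]},
]

df_files = files_to_frame(sample)
print(df_files)
```

The result has a fresh integer index and exactly the two requested columns, whatever the file extension.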
