Loading a Stata file: Categorical categories must be unique

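I am trying to load a .dta file, extracted from a zip archive, into pandas. However, I immediately get an error. I also have Stata at my disposal, but the error message tells me nothing more, such as which column is at fault, so I don't know what to do about it.

How can I load the file into pandas?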

>>> df = pd.read_stata('cepr_org_2014.dta')

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/io/stata.py", line 69, in read_stata
    order_categoricals)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/io/stata.py", line 1315, in data
    cat_data.categories = categories
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/core/categorical.py", line 442, in _set_categories
    categories = self._validate_categories(categories)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/core/categorical.py", line 437, in _validate_categories
    raise ValueError('Categorical categories must be unique')
ValueError: Categorical categories must be unique

Solution:

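Load it with pandas.read_stata('cepr_org_2014.dta', convert_categoricals=False, convert_missing=True) and see what the data looks like. With convert_categoricals=False, pandas does not apply the Stata value labels, so the labelled columns keep their raw numeric codes and no Categorical is built.
Alternatively, debug the call from the question with ipdb, which will show you that there are duplicate categories in your data.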
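To find out which labels are duplicated without stepping through a debugger, one option is to read the value labels stored in the file and look for label texts that are assigned to more than one numeric code. The sketch below is only an illustration: `find_duplicate_labels` and `report_duplicate_labels` are hypothetical helper names, and it assumes the error comes from repeated label text within a single value-label set. Note that results are keyed by the name of the value-label set, which often, but not always, matches the column name.

```python
from collections import Counter


def find_duplicate_labels(value_labels):
    """Return {label_set: {text: [codes]}} for label texts used by several codes.

    `value_labels` is expected to have the shape returned by
    StataReader.value_labels(): {label_set_name: {code: label_text}}.
    """
    duplicates = {}
    for set_name, mapping in value_labels.items():
        counts = Counter(mapping.values())
        repeated = {
            text: sorted(code for code, t in mapping.items() if t == text)
            for text, n in counts.items()
            if n > 1
        }
        if repeated:
            duplicates[set_name] = repeated
    return duplicates


def report_duplicate_labels(path):
    """Open a .dta file and report value-label sets with repeated label text."""
    # Imported here so that find_duplicate_labels stays usable on a plain dict.
    import pandas as pd

    with pd.read_stata(path, iterator=True) as reader:
        # Read the data without converting to Categorical, so the
        # "must be unique" check is never triggered.
        reader.read(convert_categoricals=False)
        labels = reader.value_labels()
    return find_duplicate_labels(labels)
```

Calling `report_duplicate_labels('cepr_org_2014.dta')` should then list the offending label sets, if the assumption above holds. From there the duplicates can be fixed in Stata, or the file can be loaded with convert_categoricals=False and the labels mapped by hand.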
