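How to fix Pandas AttributeError: 'DataFrame' object has no attribute 'Timestamp'
I want my script to compute a sum for each month, but I keep getting an AttributeError that I don't understand. The Timestamp column does exist in my combined_csv.
I'm sure the final line (the monthlySum assignment) is causing the problem, because I tested the rest of my code beforehand.
AttributeError: 'DataFrame' object has no attribute 'Timestamp'
I'd appreciate any help I can get. Thanks.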
import os
import glob
import pandas as pd
# set working directory
os.chdir("Path to CSVs")
# find all csv files in the folder
# use glob pattern matching -> extension = 'csv'
# save result in list -> all_filenames
extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
# print(all_filenames)
# combine all files in the list
combined_csv = pd.concat([pd.read_csv(f,sep=';') for f in all_filenames])
# Format CSV
# Transform Timestamp column into datetime
combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
# Read out first entry of every day of every month
combined_csv = round(combined_csv.resample('D',on='Timestamp')['HtmDht_Energy'].agg(['first']))
# To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
combined_csv["dailyYield"] = combined_csv["first"] - combined_csv["first"].shift()
# combined_csv.reset_index()
# combined_csv.index.set_names(["year","month"],inplace=True)
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv.Timestamp.dt.year,combined_csv.Timestamp.dt.month]).sum()
combined_csv.columns 的输出
Index(['Timestamp','teHst0101','teHst0102','teHst0103','teHst0104','teHst0105','teHst0106','teHst0107','teHst0201','teHst0202','teHst0203','teHst0204','teHst0301','teHst0302','teHst0303','teHst0304','teAmb','teSolFloHexHst','teSolRetHexHst','teSolCol0501','teSolCol1001','teSolCol1501','vfSol','prSolRetSuc','rdGlobalColAngle','gSolPump01_roActual','gSolPump02_roActual','gHstPump03_roActual','gHstPump04_roActual','gDhtPump06_roActual','gMB01_isOpened','gMB02_isOpened','gCV01_posActual','gCV02_posActual','HtmDht_Energy','HtmDht_Flow','HtmDht_Power','HtmDht_Volume','HtmDht_teFlow','HtmDht_teReturn','HtmHst_Energy','HtmHst_Flow','HtmHst_Power','HtmHst_Volume','HtmHst_teFlow','HtmHst_teReturn','teSolColDes','teHstFloDes'],dtype='object')
Traceback when I select the column with combined_csv['Timestamp'] instead:
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv['Timestamp'].dt.year,combined_csv['Timestamp'].dt.month]).sum()
Traceback (most recent call last):
File "D:\Users\wink\PycharmProjects\csvToExcel\main.py",line 28,in <module>
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv['Timestamp'].dt.year,combined_csv['Timestamp'].dt.month]).sum()
File "D:\Users\wink\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 3024,in __getitem__
indexer = self.columns.get_loc(key)
File "D:\Users\wink\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\indexes\base.py",line 3082,in get_loc
raise KeyError(key) from err
KeyError: 'Timestamp'
Traceback when using Mustafa's solution
Traceback (most recent call last):
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 3862,in reindexer
value = value.reindex(self.index)._values
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\util\_decorators.py",line 312,in wrapper
return func(*args,**kwargs)
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 4176,in reindex
return super().reindex(**kwargs)
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\generic.py",line 4811,in reindex
return self._reindex_axes(
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 4022,in _reindex_axes
frame = frame._reindex_index(
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 4038,in _reindex_index
new_index,indexer = self.index.reindex(
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\indexes\multi.py",line 2492,in reindex
target = MultiIndex.from_tuples(target)
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\indexes\multi.py",line 175,in new_meth
return meth(self_or_cls,*args,**kwargs)
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\indexes\multi.py",line 531,in from_tuples
arrays = list(lib.tuples_to_object_array(tuples).T)
File "pandas\_libs\lib.pyx",line 2527,in pandas._libs.lib.tuples_to_object_array
ValueError: Buffer dtype mismatch,expected 'Python object' but got 'long long'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\winklerm\PycharmProjects\csvToExcel\main.py",in <module>
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv.Timestamp.dt.year,combined_csv.Timestamp.dt.month]).sum()
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 3163,in __setitem__
self._set_item(key,value)
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 3242,in _set_item
value = self._sanitize_column(key,line 3888,in _sanitize_column
value = reindexer(value).T
File "C:\Users\winklerm\PycharmProjects\csvToExcel\venv\lib\site-packages\pandas\core\frame.py",line 3870,in reindexer
raise TypeError(
TypeError: incompatible index of inserted column with frame index
Solution
This line makes the Timestamp column the index of combined_csv:
combined_csv = round(combined_csv.resample('D',on='Timestamp')['HtmDht_Energy'].agg(['first']))
After that, Timestamp is no longer a column, so accessing .Timestamp (or ['Timestamp']) raises the error.
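To see the effect in isolation, here is a minimal sketch using a small made-up frame (the sample values and the names df and daily are assumptions for illustration, not the asker's data). It shows that after resample(..., on='Timestamp') the column list no longer contains Timestamp, while the index carries that name:

```python
import pandas as pd

# Hypothetical sample data standing in for the combined CSV files
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2021-01-01 00:00", "2021-01-01 12:00", "2021-01-02 00:00"]
    ),
    "HtmDht_Energy": [10.0, 12.0, 15.0],
})

# Same kind of call as in the question: first reading of every day
daily = df.resample("D", on="Timestamp")["HtmDht_Energy"].agg(["first"])

# Timestamp has moved into the index; only the aggregated column remains
print(daily.columns.tolist())
print(daily.index.name)

# Attribute access only looks at columns, so this raises AttributeError
try:
    daily.Timestamp
except AttributeError as err:
    print("AttributeError:", err)
```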
The remedy is reset_index, so you can try this:
combined_csv = round(combined_csv.resample('D',on='Timestamp')['HtmDht_Energy'].agg(['first'])).reset_index()
This moves Timestamp out of the index and back into a regular column, so you can access it again.
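Note that reset_index alone may still lead to the second traceback in the question (TypeError: incompatible index of inserted column with frame index), because groupby(...).sum() returns one row per month, which cannot be aligned with the daily rows. One possible way around this, which is not part of the original answer, is groupby(...).transform("sum"), which returns one value per original row. The sketch below assumes the goal is a per-row monthly total of dailyYield; the sample readings and the names df, daily, and month_keys are made up for illustration:

```python
import pandas as pd

# Hypothetical cumulative energy counter, one reading per day across a month boundary
df = pd.DataFrame({
    "Timestamp": pd.date_range("2021-01-30", periods=4, freq="D"),
    "HtmDht_Energy": [100.0, 103.0, 107.0, 112.0],
})

daily = (
    df.resample("D", on="Timestamp")["HtmDht_Energy"]
    .agg(["first"])
    .round()
    .reset_index()  # Timestamp becomes a regular column again
)
daily["dailyYield"] = daily["first"].diff()

# Group by calendar year and month; renaming avoids two keys both called "Timestamp"
month_keys = [
    daily["Timestamp"].dt.year.rename("year"),
    daily["Timestamp"].dt.month.rename("month"),
]

# transform("sum") keeps the shape of the daily frame, so the assignment aligns
daily["monthlySum"] = daily.groupby(month_keys)["dailyYield"].transform("sum")
print(daily)
```

If one row per month is wanted instead, storing the result of groupby(...).sum() in a separate variable, rather than assigning it back as a column, avoids the alignment problem as well.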
Side note:
combined_csv["dailyYield"] = combined_csv["first"] - combined_csv["first"].shift()
is equivalent to
combined_csv["dailyYield"] = combined_csv["first"].diff()
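A quick check of that equivalence on a small made-up Series (the values and the names s, manual, and builtin are assumptions for illustration):

```python
import pandas as pd

# Hypothetical daily first readings of a cumulative counter
s = pd.Series([100.0, 103.0, 107.0, 112.0], name="first")

manual = s - s.shift()  # subtract the previous row explicitly
builtin = s.diff()      # same first difference via the built-in method

# Series.equals treats NaN values in the same position as equal
print(manual.equals(builtin))
```

In both cases the first entry is NaN, since there is no previous row to subtract.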