Pandas: any ideas on how to load data into a Pandas DataFrame?
Hi everyone, I'm new to pandas. I have a set of CSV files named like this:
john_age.csv
john_gender.csv
john_weight.csv
mike_age.csv
mike_gender.csv
mike_weight.csv
smith_age.csv
smith_gender.csv
smith_weight.csv
...
...
Each CSV file contains a single string or number, like this:
john_age.csv 54
john_gender.csv male
john_weight.csv 65.4
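Basically, I want the whole DataFrame to look like this:
       age  gender  weight
john    54    male    65.4
mike    23    male    86.5
smith   52  female    54
How can I do this?
I think the key idea is to incorporate each CSV file name into the DataFrame, but so far I have only managed to read multiple CSV files with glob.glob and append them to a list, and appending is not the solution: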
import glob
import pandas as pd

csv_path = r'\mypath'
filenames = glob.glob(csv_path + r'\*.csv')
dfs = []
for file in filenames:
    dfs.append(pd.read_csv(file))
Thanks a lot!
Solution
This creates a DataFrame from the files.
import glob
import os
import pandas as pd

csv_path = 'csvs'
# Use the *_age.csv files to discover the person names
filenames = glob.glob(os.path.join(csv_path, '*_age.csv'))
people = []
attrs = ['age', 'gender', 'weight']
for file in filenames:
    # 'csvs/john_age.csv' -> 'john'
    name = os.path.basename(file).split('_')[0]
    person = {'name': name}
    for attr in attrs:
        with open(os.path.join(csv_path, f'{name}_{attr}.csv'), 'r') as data_file:
            # Each file holds a single value; strip the trailing newline
            person[attr] = data_file.readline().strip()
    people.append(person)
df = pd.DataFrame(people).set_index('name')
print(df)
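One thing to watch with the approach above is that every value ends up as a string. Here is a minimal sketch, assuming all files are named <name>_<attribute>.csv and sit in one directory, that takes both the name and the attribute from the file name and converts columns that are fully numeric. The function name load_people and the temporary sample files are made up for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_people(csv_dir):
    """Build one row per person from files named <name>_<attribute>.csv."""
    records = {}
    for path in sorted(Path(csv_dir).glob("*_*.csv")):
        # "john_age" -> ("john", "age"); rpartition splits at the last underscore
        name, _, attr = path.stem.rpartition("_")
        records.setdefault(name, {})[attr] = path.read_text().strip()
    df = pd.DataFrame.from_dict(records, orient="index")
    # Every value was read as text; convert the columns that are fully numeric
    for col in df.columns:
        try:
            df[col] = pd.to_numeric(df[col])
        except (ValueError, TypeError):
            pass  # leave non-numeric columns such as gender as strings
    return df


# Demo with the sample data from the question, written to a temporary directory
sample = {
    "john": {"age": "54", "gender": "male", "weight": "65.4"},
    "mike": {"age": "23", "gender": "male", "weight": "86.5"},
    "smith": {"age": "52", "gender": "female", "weight": "54"},
}
with tempfile.TemporaryDirectory() as tmp:
    for name, attrs in sample.items():
        for attr, value in attrs.items():
            Path(tmp, f"{name}_{attr}.csv").write_text(value + "\n")
    people_df = load_people(tmp)

print(people_df)
```

Because the attribute comes from the file name, a person with a missing file simply gets a missing value in that column instead of raising an error.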
,
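Here is what I was talking about:
import glob
import os
import pandas as pd

csv_path = 'csvs'
parts = ('age', 'gender', 'weight')
with open('combined.csv', 'w') as combine:
    # Header row so that read_csv picks up the column names
    print(','.join(('name',) + parts), file=combine)
    for fn in glob.glob(os.path.join(csv_path, '*_age.csv')):
        name = os.path.basename(fn).split('_')[0]
        fields = [name]
        for part in parts:
            with open(os.path.join(csv_path, f'{name}_{part}.csv')) as f:
                fields.append(f.read().strip())
        print(','.join(fields), file=combine)
df = pd.read_csv('combined.csv', index_col='name')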
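If you would rather not leave an intermediate file on disk, the same idea should work in memory. This is a rough sketch, assuming the three attribute suffixes from the question: it joins the values into CSV text and hands it to pd.read_csv through io.StringIO. The helper name combine_in_memory and the sample files are hypothetical.

```python
import glob
import io
import os
import tempfile

import pandas as pd

PARTS = ("age", "gender", "weight")  # assumed attribute suffixes


def combine_in_memory(csv_path):
    """Join the per-attribute files into CSV text and parse it once."""
    lines = [",".join(("name",) + PARTS)]  # header row
    for fn in sorted(glob.glob(os.path.join(csv_path, "*_age.csv"))):
        name = os.path.basename(fn).split("_")[0]
        fields = [name]
        for part in PARTS:
            with open(os.path.join(csv_path, f"{name}_{part}.csv")) as f:
                fields.append(f.read().strip())
        lines.append(",".join(fields))
    # read_csv infers numeric dtypes for age and weight while parsing
    return pd.read_csv(io.StringIO("\n".join(lines)), index_col="name")


# Demo: write two people from the question to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for fname, value in {
        "john_age.csv": "54", "john_gender.csv": "male", "john_weight.csv": "65.4",
        "mike_age.csv": "23", "mike_gender.csv": "male", "mike_weight.csv": "86.5",
    }.items():
        with open(os.path.join(tmp, fname), "w") as f:
            f.write(value + "\n")
    combined = combine_in_memory(tmp)

print(combined)
```

Letting read_csv parse the combined text means the numeric columns come back as numbers without a separate conversion step.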
,
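You can also use pd.concat(). Reading every file with header=None and concatenating gives one value per row in sorted file order, which can then be reshaped into one row per person (this assumes each person has all three files):
from glob import glob
import pandas as pd

files = sorted(glob("path/to/files/*.csv"))
# header=None: each file holds one value and has no header row
data = pd.concat((pd.read_csv(file, header=None) for file in files), ignore_index=True)
# Sorted names give age, gender, weight for each person in turn
table = pd.DataFrame(data[0].to_numpy().reshape(-1, 3), columns=["age", "gender", "weight"])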
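pd.concat() can also label each piece through its keys argument, so the result does not depend on the order of the files. A minimal sketch, assuming files named <name>_<attribute>.csv: each file is read as a one-cell Series, keyed by (name, attribute), and the attribute level is then moved into columns with unstack(). The function name load_with_concat and the sample files are made up for the example, and the values stay as strings here.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_with_concat(csv_dir):
    """Read each file as a one-cell Series, label it, and pivot with unstack."""
    paths = sorted(Path(csv_dir).glob("*_*.csv"))
    # One (name, attribute) key per file, e.g. ("john", "age")
    keys = [tuple(p.stem.rsplit("_", 1)) for p in paths]
    # dtype=str keeps every cell as text so the pieces concatenate uniformly
    cells = [pd.read_csv(p, header=None, dtype=str)[0] for p in paths]
    stacked = pd.concat(cells, keys=keys)
    # Drop the inner row number, then move the attribute level into columns
    return stacked.droplevel(-1).unstack()


# Demo with sample data from the question, written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for fname, value in {
        "john_age.csv": "54", "john_gender.csv": "male", "john_weight.csv": "65.4",
        "smith_age.csv": "52", "smith_gender.csv": "female", "smith_weight.csv": "54",
    }.items():
        Path(tmp, fname).write_text(value + "\n")
    table = load_with_concat(tmp)

print(table)
```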