Pandas: any ideas on how to load data into a pandas DataFrame?

Hi everyone, I'm new to pandas.

I have multiple CSV files like this:

john_age.csv
john_gender.csv
john_weight.csv
mike_age.csv
mike_gender.csv
mike_weight.csv
smith_age.csv
smith_gender.csv
smith_weight.csv
...
...

Each CSV file contains a single string or number, like this:

john_age.csv       54
john_gender.csv    male
john_weight.csv    65.4

Basically, I want the final DataFrame to look like this:

        age    gender    weight      
john     54     male      65.4
mike     23     male      86.5
smith    52     female    54

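How can I achieve this?

I think the key idea is to work each CSV file name into the DataFrame, but so far I have only managed to read the CSV files with glob.glob and collect them with list.append, which does not give me the table I want: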

import glob
import pandas as pd

csv_path = r'\mypath'

filenames = glob.glob(csv_path + r'\*.csv')

dfs = []

for file in filenames:
    dfs.append(pd.read_csv(file))

Thank you very much!

Solutions

This builds a DataFrame from the files, with one row per person.

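import glob
import os
import pandas as pd

csv_path = 'csvs'

# Use the *_age.csv files to discover the person names.
filenames = glob.glob(os.path.join(csv_path, '*_age.csv'))

people = []
attrs = ['age', 'gender', 'weight']

for file in filenames:
    # 'csvs/john_age.csv' -> 'john'
    name = os.path.basename(file).split('_')[0]
    person = {'name': name}
    for attr in attrs:
        with open(os.path.join(csv_path, f'{name}_{attr}.csv'), 'r') as data_file:
            # Each file holds a single value; strip the trailing newline.
            person[attr] = data_file.readline().strip()
    people.append(person)

# Values are read as strings; index the rows by name.
df = pd.DataFrame(people).set_index('name')

print(df)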
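
Reading the files as plain text leaves every value as a string, including age and weight. The following is a minimal sketch of one way to build the same table and convert the numeric columns, assuming every file is named <name>_<attribute>.csv and holds a single value. The helper name build_frame and its attrs parameter are hypothetical, chosen only for illustration.

```python
from pathlib import Path

import pandas as pd


def build_frame(folder, attrs=("age", "gender", "weight")):
    """Collect <name>_<attr>.csv files (one value each) into one row per name."""
    records = {}
    for path in sorted(Path(folder).glob("*_*.csv")):
        # Assumption: the attribute is the part after the last underscore.
        name, attr = path.stem.rsplit("_", 1)
        if attr in attrs:
            records.setdefault(name, {})[attr] = path.read_text().strip()

    # Rows are names, columns are attributes in the requested order.
    df = pd.DataFrame.from_dict(records, orient="index").reindex(columns=list(attrs))

    # Convert a column to numbers only when every value in it parses cleanly.
    for col in df.columns:
        converted = pd.to_numeric(df[col], errors="coerce")
        if converted.notna().all():
            df[col] = converted
    return df
```

Calling build_frame('csvs') on the files from the question would leave gender as text, while age and weight become numeric columns that can be summed or averaged directly.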

This is the approach I was talking about:

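import glob
import os
import pandas as pd

csv_path = 'csvs'
parts = ('age', 'gender', 'weight')

with open('combined.csv', 'w') as combine:
    for fn in glob.glob(os.path.join(csv_path, '*_age.csv')):
        name = os.path.basename(fn).split('_')[0]
        fields = [name]
        for part in parts:
            with open(os.path.join(csv_path, f'{name}_{part}.csv')) as part_file:
                fields.append(part_file.read().strip())
        print(','.join(fields), file=combine)

# combined.csv has no header row, so supply the column names.
df = pd.read_csv('combined.csv', header=None, names=['name', *parts], index_col='name')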

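You can also read all the files in a single pd.concat() call. Note that header and names are read_csv() options: pd.concat() has no header argument, and its names argument labels index levels, not columns. The result has one row per file, so it still has to be reshaped by person and attribute:

from glob import glob
import pandas as pd

files = sorted(glob("path/to/files/*.csv"))

data = pd.concat(
    (pd.read_csv(file, header=None, names=["value"]) for file in files),
    ignore_index=True,
)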
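
A plain concatenation of single-value files loses track of which person and attribute each value belongs to. The following is a minimal sketch of one way to keep that information, assuming the <name>_<attribute>.csv naming from the question: tag each one-row frame with the name and attribute parsed from its file name, concatenate, then pivot into the wide layout. The function name load_wide and the column names value, name and attr are illustrative assumptions, not part of any answer above.

```python
from pathlib import Path

import pandas as pd


def load_wide(folder):
    """Read <name>_<attr>.csv files (one value each) into a name x attr table."""
    frames = []
    for path in sorted(Path(folder).glob("*_*.csv")):
        # Assumption: the attribute is the part after the last underscore.
        name, attr = path.stem.rsplit("_", 1)
        # header=None: the only line is data, not a header row.
        # dtype=str keeps each value exactly as written in the file.
        frame = pd.read_csv(path, header=None, names=["value"], dtype=str)
        frames.append(frame.assign(name=name, attr=attr))

    long = pd.concat(frames, ignore_index=True)
    # One row per person, one column per attribute.
    return long.pivot(index="name", columns="attr", values="value")
```

The values stay as strings here; pd.to_numeric can be applied to the age and weight columns afterwards if numbers are needed.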