Python string concatenation that follows a sequence

How to concatenate strings in Python across a sequence of columns

What is the most Pythonic way to build the following string concatenation?

We have an initial DataFrame in which some of the columns are:

origin
dest_1_country
dest_1_city
dest_2_country
dest_2_city
dest_3_country
dest_3_city
dest_4_country
dest_4_city

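We want to create an additional column that holds the full route for each row of the DataFrame. It can be produced with:

df['full_route'] = df['origin'].fillna("") + df['dest_1_country'].fillna("") + df['dest_1_city'].fillna("") + df['dest_2_country'].fillna("") + df['dest_2_city'].fillna("") + df['dest_3_country'].fillna("") + df['dest_3_city'].fillna("") + df['dest_4_country'].fillna("") + df['dest_4_city'].fillna("")

Clearly this is not the most Pythonic way to get the desired result, given how cumbersome it is. What if I had 100 cities in the df?

What is the best way to achieve this in Python?

Note: the DataFrame also contains other columns that are unrelated to the route and should not be included in the concatenation.

Thanks a lot!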

Solution

If you have this DataFrame:

  origin dest_1_country dest_1_city dest_2_country dest_2_city
0      a              b           c              d           e
1      f              g           h              i           j

Then you can do this:

df["full_route"] = df.sum(axis=1)  # df.fillna("").sum(axis=1) if you have NaNs
print(df)

This concatenates all the columns and prints:

  origin dest_1_country dest_1_city dest_2_country dest_2_city full_route
0      a              b           c              d           e      abcde
1      f              g           h              i           j      fghij
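
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample frame, including the missing city in the second row, is made up for illustration. It uses `.agg("".join, axis=1)` as an alternative spelling of the row-wise concatenation rather than `sum(axis=1)`, since joining strings explicitly should behave the same regardless of how a given pandas version treats `sum` on string columns.

```python
import numpy as np
import pandas as pd

# Illustrative data: same shape as the example above, with one missing city.
df = pd.DataFrame(
    {
        "origin": ["a", "f"],
        "dest_1_country": ["b", "g"],
        "dest_1_city": ["c", np.nan],
        "dest_2_country": ["d", "i"],
        "dest_2_city": ["e", "j"],
    }
)

# Replace NaN with an empty string, then join the values of each row.
df["full_route"] = df.fillna("").agg("".join, axis=1)

print(df["full_route"].tolist())  # ['abcde', 'fgij']
```

Note that this still concatenates every column in the frame, so it only fits when all columns belong to the route.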

EDIT: If you want to concatenate "origin" with every "*city"/"*country" column, you can do this:

df["full_route"] = df["origin"].fillna("") + df.filter(
    regex=r"country$|city$"
).fillna("").sum(axis=1)
print(df)
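
One caveat: `filter` keeps the columns in the order they appear in the DataFrame, so the route only comes out in sequence if the columns are already stored that way. The sketch below is one possible way to make the order explicit, assuming the columns follow the `dest_<n>_country` / `dest_<n>_city` naming from the question. The `price` column, the sample values and the helper `route_key` are hypothetical and only there for illustration.

```python
import re

import pandas as pd

# Illustrative data: destination columns deliberately out of order,
# plus an unrelated column that must not end up in the route.
df = pd.DataFrame(
    {
        "origin": ["a", "f"],
        "price": [10, 20],
        "dest_2_country": ["d", None],
        "dest_2_city": ["e", None],
        "dest_1_country": ["b", "g"],
        "dest_1_city": ["c", "h"],
    }
)

pattern = re.compile(r"^dest_(\d+)_(country|city)$")


def route_key(col):
    """Sort by destination number, with country before city (hypothetical helper)."""
    match = pattern.match(col)
    return int(match.group(1)), 0 if match.group(2) == "country" else 1


# Keep only the destination columns and sort them numerically,
# so that dest_10_* comes after dest_2_* rather than before it.
dest_cols = sorted((c for c in df.columns if pattern.match(c)), key=route_key)
route_cols = ["origin"] + dest_cols

df["full_route"] = df[route_cols].fillna("").astype(str).agg("".join, axis=1)

print(df["full_route"].tolist())  # ['abcde', 'fgh']
```

Because the column list is built from the names, the same code should work unchanged for 4 or 100 destinations, and unrelated columns are ignored.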