How to compare two datetime columns in a DataFrame and store the difference in another column
             datetime1            datetime2
0  2021-05-09 19:52:14  2021-05-09 20:52:14
1  2021-05-09 19:52:14  2021-05-09 21:52:14
I want to compare the two columns and create a new column containing the difference between them.
The ideal output looks like this:
             datetime1            datetime2  Difference in H:m:s
0  2021-05-09 19:52:14  2021-05-09 20:52:14             01:00:00
1  2021-05-09 19:52:14  2021-05-09 21:52:14             02:00:00
Edit:
@Andrej The solution you gave me works well when both datetime1 and datetime2 contain timestamps. It fails on a df like the one below, because some rows have nothing to compare.
df1:
             datetime1            datetime2
0  2021-05-09 19:52:14  2021-05-09 20:52:14
1  2021-05-09 19:52:14  2021-05-09 21:52:14
2                  NaN                  NaN
3  2021-05-09 16:30:14                  NaN
4                  NaN                  NaN
5  2021-05-09 12:30:14  2021-05-09 14:30:14
df2 (ideal output):
             datetime1            datetime2  Difference in H:m:s  Compared with datetime.Now()
0  2021-05-09 19:52:14  2021-05-09 20:52:14             01:00:00                           NaN
1  2021-05-09 19:52:14  2021-05-09 21:52:14             02:00:00                           NaN
2                  NaN                  NaN                  NaN                           NaN
3  2021-05-09 16:30:14                  NaN                  NaN               e.g. (04:00:00)
4                  NaN                  NaN                  NaN                           NaN
5  2021-05-09 12:30:14  2021-05-09 14:30:14             02:00:00                           NaN
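In my real data, some rows have no value in either datetime1 or datetime2, and some rows have a value in datetime1 but not in datetime2. Is there a way to get NaN in the "Difference" column when neither column has a timestamp and, when only datetime1 has a timestamp, to compute its difference from datetime.now() and put that in a separate column?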
Solution
Try:
import pandas as pd

def strfdelta(tdelta, fmt):
    # split the timedelta into days, hours, minutes and seconds
    d = {"days": tdelta.days}
    d["hours"], rem = divmod(tdelta.seconds, 3600)
    d["minutes"], d["seconds"] = divmod(rem, 60)
    return fmt.format(**d)

# if datetime1/datetime2 aren't already datetime, apply `pd.to_datetime()`:
df["datetime1"] = pd.to_datetime(df["datetime1"])
df["datetime2"] = pd.to_datetime(df["datetime2"])

df["Difference in H:m:s"] = df.apply(
    lambda x: strfdelta(
        x["datetime2"] - x["datetime1"],
        "{hours:02d}:{minutes:02d}:{seconds:02d}",
    ),
    axis=1,
)
print(df)
Prints:
             datetime1            datetime2  Difference in H:m:s
0  2021-05-09 19:52:14  2021-05-09 20:52:14             01:00:00
1  2021-05-09 19:52:14  2021-05-09 21:52:14             02:00:00
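One caveat worth checking: `tdelta.seconds` holds only the sub-day part of the difference, and the format string above never prints `days`. A gap of 26 hours would therefore show as 02:00:00. If the data can contain gaps of a day or more, or negative gaps, a formatter built on `total_seconds()` may be safer. The sketch below uses a hypothetical helper, `format_hms`, which is not part of the original answer. It is written against the standard library's `timedelta`, and since `pandas.Timedelta` subclasses it, the same function should accept the values produced by subtracting two Timestamps.

```python
from datetime import datetime, timedelta

def format_hms(tdelta):
    """Format a timedelta as HH:MM:SS, counting whole days as extra hours.

    Hypothetical helper for illustration; negative gaps get a leading "-".
    """
    total = int(tdelta.total_seconds())
    sign = "-" if total < 0 else ""
    hours, rem = divmod(abs(total), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"

# Same timestamps as row 0 of the question's DataFrame
one_hour = datetime(2021, 5, 9, 20, 52, 14) - datetime(2021, 5, 9, 19, 52, 14)
over_a_day = timedelta(days=1, hours=2)

print(format_hms(one_hour))    # 01:00:00
print(format_hms(over_a_day))  # 26:00:00
```

Whether 26:00:00 or a separate day count reads better depends on the report; the `days` key in `strfdelta` is already available if the format string is extended to print it.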
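Edit: handling NaN:
import numpy as np

# reuses strfdelta() defined above
# if datetime1/datetime2 aren't already datetime, apply `pd.to_datetime()`:
df["datetime1"] = pd.to_datetime(df["datetime1"])
df["datetime2"] = pd.to_datetime(df["datetime2"])
fmt = "{hours:02d}:{minutes:02d}:{seconds:02d}"

# difference only where both timestamps are present, NaN otherwise
df["Difference in H:m:s"] = df.apply(
    lambda x: strfdelta(x["datetime2"] - x["datetime1"], fmt)
    if pd.notna(x["datetime1"]) and pd.notna(x["datetime2"])
    else np.nan,
    axis=1,
)
# time elapsed since datetime1 only where datetime2 is missing
df["Compared with datetime.now()"] = df.apply(
    lambda x: strfdelta(pd.Timestamp.now() - x["datetime1"], fmt)
    if pd.notna(x["datetime1"]) and pd.isna(x["datetime2"])
    else np.nan,
    axis=1,
)
print(df)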
Prints:
             datetime1            datetime2  Difference in H:m:s  Compared with datetime.now()
0  2021-05-09 19:52:14  2021-05-09 20:52:14             01:00:00                           NaN
1  2021-05-09 19:52:14  2021-05-09 21:52:14             02:00:00                           NaN
2                  NaT                  NaT                  NaN                           NaN
3  2021-05-09 16:30:14                  NaT                  NaN                      03:00:20
4                  NaT                  NaT                  NaN                           NaN
5  2021-05-09 12:30:14  2021-05-09 14:30:14             02:00:00                           NaN
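As an alternative to the row-wise `df.apply`, the subtraction can be done on whole columns, since pandas already yields NaT wherever either operand is missing. The following is a minimal sketch under a few assumptions: `to_hms` is a hypothetical helper (not from the original answer), the sample frame is a shortened version of the one in the question, and `now` is pinned to a fixed timestamp so the result is reproducible. In real use it would be `pd.Timestamp.now()`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "datetime1": ["2021-05-09 19:52:14", np.nan, "2021-05-09 16:30:14"],
    "datetime2": ["2021-05-09 20:52:14", np.nan, np.nan],
})
df["datetime1"] = pd.to_datetime(df["datetime1"])
df["datetime2"] = pd.to_datetime(df["datetime2"])

# Fixed reference time for a reproducible example (assumption);
# replace with pd.Timestamp.now() on real data.
now = pd.Timestamp("2021-05-09 20:30:14")

def to_hms(deltas):
    """Format a timedelta Series as HH:MM:SS strings, mapping NaT to NaN."""
    out = []
    for td in deltas:
        if pd.isna(td):
            out.append(np.nan)
        else:
            total = int(td.total_seconds())
            out.append(f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}")
    return pd.Series(out, index=deltas.index, dtype=object)

# Column subtraction gives NaT wherever datetime1 or datetime2 is missing.
df["Difference in H:m:s"] = to_hms(df["datetime2"] - df["datetime1"])

# Compare with "now" only where datetime1 exists and datetime2 does not.
only_first = df["datetime1"].notna() & df["datetime2"].isna()
df["Compared with datetime.now()"] = to_hms((now - df["datetime1"]).where(only_first))

print(df)
```

With the pinned `now`, the row that has only datetime1 (16:30:14) gets 04:00:00 in the last column, and every other cell in that column is NaN.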