Summing results within a constant time range for each date in a table

I am using a PostgreSQL database.
I have a table containing the test name, the result, and the report time:

|test_name|result |report_time|
|    A    |error  |29/11/2020 |
|    A    |failure|28/12/2020 |
|    A    |error  |29/12/2020 |
|    B    |passed |30/12/2020 |
|    C    |failure|31/12/2020 |
|    A    |error  |31/12/2020 |

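For each date, I want to count how many tests failed or errored during the 30 days up to and including that date (and limit the output to the 5 days back from the current date), so the end result would be: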

|    date    | sum |  (notes)
| 29/11/2020 |  1  | 1 Failed/errored test in range (29/11 -> 29/10)
| 28/12/2020 |  2  | 2 Failed/errored tests in range (28/12 -> 28/11)
| 29/12/2020 |  3  | 3 Failed/errored tests in range (29/12 -> 29/11)
| 30/12/2020 |  2  | 2 Failed/errored tests in range (30/12 -> 30/11)
| 31/12/2020 |  4  | 4 Failed/errored tests in range (31/12 -> 31/11)
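
To make the target semantics concrete, here is a small Python sketch (not part of the original question) that recomputes the expected sums from the sample rows. It assumes the window is inclusive on both ends, i.e. [d - 30 days, d], and the names `rolling_bad_counts`, `BAD_RESULTS`, and `WINDOW` are made up for illustration.

```python
from datetime import date, timedelta

# Sample rows from the question: (test_name, result, report_date)
rows = [
    ("A", "error",   date(2020, 11, 29)),
    ("A", "failure", date(2020, 12, 28)),
    ("A", "error",   date(2020, 12, 29)),
    ("B", "passed",  date(2020, 12, 30)),
    ("C", "failure", date(2020, 12, 31)),
    ("A", "error",   date(2020, 12, 31)),
]

BAD_RESULTS = {"failure", "error"}
WINDOW = timedelta(days=30)  # assumption: window is inclusive on both ends


def rolling_bad_counts(rows):
    """Map each report date to the number of failed/errored rows in [d - 30 days, d]."""
    bad_dates = [d for _, result, d in rows if result in BAD_RESULTS]
    report_dates = sorted({d for _, _, d in rows})
    return {
        day: sum(1 for d in bad_dates if day - WINDOW <= d <= day)
        for day in report_dates
    }


counts = rolling_bad_counts(rows)
for day, total in counts.items():
    print(day.strftime("%d/%m/%Y"), total)
```

Under that inclusive-window assumption, the sketch yields 1, 2, 3, 2, 4 for the five dates, matching the sums in the table above. Note that 30/12/2020 still gets a value even though its only test passed, because earlier failures fall inside its window.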

I know how to sum the results for a single date (i.e. how many failures/errors occurred on a specific date):

SELECT report_time::date AS "Report Time",
       count(case when result in ('failure','error') then 1 else null end)
from tests
where report_time::date = Now()::date
GROUP BY report_time::date

But I am struggling to compute, for each date, the sum over the preceding 30 days.

Solution

You can generate the dates and then use a window function:

select gs.dte, t.num_failed_error, t.num_failed_error_30
from generate_series(current_date - interval '5 day', current_date, interval '1 day') gs(dte) left join
     (select t.report_time,
             count(*) as num_failed_error,
             -- running total over the 30 days up to and including each report date
             sum(count(*)) over (order by t.report_time range between interval '30 day' preceding and current row) as num_failed_error_30
      from t
      where t.result in ('failure', 'error') and
            t.report_time >= current_date - interval '35 day'
      group by t.report_time
     ) t
     on t.report_time = gs.dte;

Note: this assumes that report_time is a plain date with no time component. If it has a time component, use report_time::date instead.

If you have data for every day, this can be simplified to:

select t.report_time,
       sum(count(*)) over (order by t.report_time range between interval '30 day' preceding and current row) as num_failed_error_30
from t
where t.result in ('failure', 'error') and
      t.report_time >= current_date - interval '35 day'
group by t.report_time
order by t.report_time desc
limit 5;

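Since I am using PostgreSQL 10.12 (RANGE frames with an offset such as interval '30 day' preceding require PostgreSQL 11 or later) and upgrading is not an option right now, I took a different approach: I generate the dates of the last 30 days and, for each of them, compute the cumulative distinct sum over the preceding 30 days: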

SELECT days_range::date, SUM(number_of_tests)
FROM   generate_series(now() - interval '30 day', now()::timestamp, '1 day'::interval) days_range
CROSS  JOIN LATERAL (
        -- one row per report date inside the 30-day window ending at days_range;
        -- counts every distinct test reported on days that had at least one failure/error
        SELECT COUNT(DISTINCT test_name) AS number_of_tests
        FROM   tests
        WHERE  report_time > days_range - interval '30 day'
          AND  report_time <= days_range
        GROUP  BY report_time::date
        HAVING COUNT(CASE WHEN result IN ('failure','error') THEN 1 ELSE NULL END) > 0
    ) AS lateral_query
GROUP BY days_range
ORDER BY days_range DESC

This is definitely not a well-optimized query: it takes about 1 minute to run.
