PostgreSQL: why do queries using NOW() take far longer than the equivalent string literals?
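This is my first question on StackOverflow, so please forgive me if it is not structured correctly.
I have a table t_table with a datetime column d_datetime, and I need to filter the data from the last 5 days. The following queries all work locally, where I have less data:
Query 1.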
SELECT * FROM t_table
WHERE d_datetime
BETWEEN '2020-08-28T00:00:00.024Z' AND '2020-09-02T00:00:00.024Z';
Query 2.
SELECT * FROM t_table
WHERE d_datetime
BETWEEN (NOW() - INTERVAL '5 days') AND NOW();
Query 3.
SELECT * FROM t_table
WHERE d_datetime > NOW() - INTERVAL '5 days';
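However, when I move to the live database, only the first query finishes, in about 10 seconds. I don't know why, but the other two consume so much processing power that I have never seen them complete, even after waiting 5 minutes.
I also tried generating the d_datetime strings shown in the first query automatically:
Query 4.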
SELECT * FROM t_table
WHERE d_datetime
BETWEEN
(TO_CHAR(NOW() - INTERVAL '5 days','YYYY-MM-ddThh:MI:SS.024Z'))
AND
(TO_CHAR(NOW(),'YYYY-MM-ddThh:MI:SS.024Z'))
But this raises the following error:
operator does not exist: timestamp without time zone >= text
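My questions are:
- Is there a particular reason why Query 1 is so fast while the others take so long to run on a large data set?
- Why does Query 4 fail when it actually generates the same string format as Query 1 ('YYYY-MM-ddThh:mm:ss.024Z')?
Below is the EXPLAIN output for the first query: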
EXPLAIN SELECT * FROM t_table
WHERE d_datetime
BETWEEN '2020-08-28T00:00:00.024Z' AND '2020-09-02T00:00:00.024Z';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize HashAggregate (cost=31346.37..31788.13 rows=35341 width=22) (actual time=388.622..388.845 rows=6 loops=1)
Output: count(_hyper_12_67688_chunk.octets),_hyper_12_67688_chunk.application,(date_trunc('day'::text,_hyper_12_67688_chunk.entry_time))
Group Key: (date_trunc('day'::text,_hyper_12_67688_chunk.entry_time)),_hyper_12_67688_chunk.application
Buffers: shared hit=17193
-> Gather (cost=27105.45..31081.31 rows=35341 width=22) (actual time=377.109..398.285 rows=11 loops=1)
Output: _hyper_12_67688_chunk.application,(PARTIAL count(_hyper_12_67688_chunk.octets))
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=17193
-> Partial HashAggregate (cost=26105.45..26547.21 rows=35341 width=22) (actual time=174.272..174.535 rows=6 loops=2)
Output: _hyper_12_67688_chunk.application,PARTIAL count(_hyper_12_67688_chunk.octets)
Group Key: date_trunc('day'::text,_hyper_12_67688_chunk.entry_time),_hyper_12_67688_chunk.application
Buffers: shared hit=17193
Worker 0: actual time=27.942..28.206 rows=5 loops=1
Buffers: shared hit=579
-> Result (cost=1.73..25272.75 rows=111027 width=18) (actual time=0.805..141.094 rows=94662 loops=2)
Output: _hyper_12_67688_chunk.application,date_trunc('day'::text,_hyper_12_67688_chunk.octets
Buffers: shared hit=17193
Worker 0: actual time=1.576..23.928 rows=6667 loops=1
Buffers: shared hit=579
-> Parallel Append (cost=1.73..23884.91 rows=111027 width=18) (actual time=0.800..114.488 rows=94662 loops=2)
Buffers: shared hit=17193
Worker 0: actual time=1.572..20.204 rows=6667 loops=1
Buffers: shared hit=579
-> Parallel Bitmap Heap Scan on _timescaledb_internal._hyper_12_67688_chunk (cost=1.73..11.23 rows=8 width=17) (actual time=1.570..1.618 rows=16 loops=1)
Output: _hyper_12_67688_chunk.octets,_hyper_12_67688_chunk.entry_time
Recheck Cond: ((_hyper_12_67688_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_67688_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
Filter: ((_hyper_12_67688_chunk.application)::text = 'dns'::text)
Rows Removed by Filter: 32
Buffers: shared hit=11
Worker 0: actual time=1.570..1.618 rows=16 loops=1
Buffers: shared hit=11
-> Bitmap Index Scan on _hyper_12_67688_chunk_dpi_applications_entry_time_idx (cost=0.00..1.73 rows=48 width=0) (actual time=1.538..1.538 rows=48 loops=1)
Index Cond: ((_hyper_12_67688_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_67688_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
Buffers: shared hit=2
Worker 0: actual time=1.538..1.538 rows=48 loops=1
Buffers: shared hit=2
-> Parallel Index Scan Backward using _hyper_12_64752_chunk_dpi_applications_entry_time_idx on _timescaledb_internal._hyper_12_64752_chunk (cost=0.14..2.36 rows=1 width=44) (actual time=0.040..0.076 rows=52 loops=1)
Output: _hyper_12_64752_chunk.octets,_hyper_12_64752_chunk.application,_hyper_12_64752_chunk.entry_time
Index Cond: ((_hyper_12_64752_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_64752_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
Filter: ((_hyper_12_64752_chunk.application)::text = 'dns'::text)
Rows Removed by Filter: 52
Buffers: shared hit=
-- cut logs
-> Parallel Seq Scan on _timescaledb_internal._hyper_12_64814_chunk (cost=0.00..2.56 rows=14 width=17) (actual time=0.017..0.038 rows=32 loops=1)
Output: _hyper_12_64814_chunk.octets,_hyper_12_64814_chunk.application,_hyper_12_64814_chunk.entry_time
Filter: ((_hyper_12_64814_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_64814_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone) AND ((_hyper_12_64814_chunk.application)::text = 'dns'::text))
Rows Removed by Filter: 40
Buffers: shared hit=2
-> Parallel Seq Scan on _timescaledb_internal._hyper_12_62262_chunk (cost=0.00..2.54 rows=9 width=19) (actual time=0.027..0.039 rows=15 loops=1)
Output: _hyper_12_62262_chunk.octets,_hyper_12_62262_chunk.application,_hyper_12_62262_chunk.entry_time
Filter: ((_hyper_12_62262_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_62262_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone) AND ((_hyper_12_62262_chunk.application)::text = 'dns'::text))
Rows Removed by Filter: 37
Buffers: shared hit=2
Planning Time: 3367.445 ms
Execution Time: 417.245 ms
(7059 rows)
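The plan continues in the same way (Parallel Index Scan Backward using...) for every chunk of the hypertable.
The other three queries mentioned above never succeed: they do not finish and eventually fill up the memory. So, sorry, I cannot post the EXPLAIN results for those queries.
If my question is not structured correctly, please let me know. Thanks.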
Solution
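You are using a partitioned table that probably has a lot of partitions, since the planning time of the query is about 3 seconds (Planning Time: 3367.445 ms).
You are probably using PostgreSQL v11 or earlier. v12 introduced partition pruning at execution time, while v11 can only exclude partitions at query planning time.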
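To see which releases are actually in play, a check along these lines could be run. This is a minimal sketch; the extension name `timescaledb` is an assumption based on the `_timescaledb_internal` schema that appears in the plan above.

```sql
-- Report the PostgreSQL server version (short and long form).
SHOW server_version;
SELECT version();

-- Assumption: the hypertable chunks come from the timescaledb extension.
-- Returns no rows if that extension is not installed in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'timescaledb';
```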
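In your first query, the WHERE condition contains constants, so it works. In the other queries you use the function now(), whose result is only known at query execution time (it is STABLE, not IMMUTABLE), so partition pruning cannot take place at query planning time. Query planning and execution need not happen at the same time; think of prepared statements.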
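One possible workaround, sketched here under assumptions rather than taken from the answer, is to build the query text with literal timestamps so that the planner sees constants just as it does in Query 1. The function name `recent_rows` and the parameter `p_days` are hypothetical; `t_table` and `d_datetime` are the names used in the question.

```sql
-- Hypothetical helper: recent_rows and p_days are illustrative names.
-- format() with %L embeds the computed timestamps as quoted literals, so the
-- statement handed to EXECUTE is planned with constants in its WHERE clause.
CREATE OR REPLACE FUNCTION recent_rows(p_days integer DEFAULT 5)
RETURNS SETOF t_table
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT * FROM t_table WHERE d_datetime BETWEEN %L AND %L',
        (now() - make_interval(days => p_days))::timestamp,  -- lower bound
        now()::timestamp                                     -- upper bound
    );
END;
$$;

-- Usage: rows from the last 5 days.
SELECT * FROM recent_rows(5);
```

Computing the two timestamps in the client application and sending them as literals should have the same effect. As for Query 4, TO_CHAR returns text, and there is no comparison operator between timestamp without time zone and text, which is what the error message reports; the quoted literals in Query 1 are untyped and are resolved to the column's type instead. Casting the TO_CHAR result back with `::timestamp` would likely remove the type error, but the expression would still not be a plan-time constant.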