Postgres queries cause high disk IO after upgrading to version 11


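After upgrading from RDS 9.6 to RDS 11, Postgres queries started consuming high read IOPS and CPU. The data set is the same as before the upgrade. I don't know what the problem is.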

Could this be because the index is corrupted?

Below is the explain plan:

Explain (analyze true,verbose true,costs true,buffers true,timing true )
select consumertr0_.ref_id as col_0_0_
from consumer_transactions consumertr0_
where (consumertr0_.remaining_amount is not null)
  and (consumertr0_.expiry_time is not null)
  and consumertr0_.expiry_time>'2020-12-15T00:00:00'
  and consumertr0_.expiry_time<Now()
  and consumertr0_.remaining_amount>0
order by consumertr0_.expiry_time asc
limit 20000;

                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..80391.67 rows=20000 width=24) (actual time=191716.213..192077.229 rows=20000 loops=1)
   Output: ref_id,expiry_time
   Buffers: shared hit=9481343 read=1566736
   I/O Timings: read=97.486
   ->  Index Scan using consumer_transactions_expiry_time_remaining_amount on public.consumer_transactions consumertr0_  (cost=0.57..1109723.40 rows=276081 width=24) (actual time=191716.211..192075.241 rows=20000 loops=1)
         Output: ref_id,expiry_time
         Index Cond: ((consumertr0_.expiry_time > '2020-12-15 00:00:00'::timestamp without time zone) AND (consumertr0_.expiry_time < Now()))
         Buffers: shared hit=9481343 read=1566736
         I/O Timings: read=97.486
 Planning Time: 1.525 ms
 Execution Time: 192078.720 ms
(11 rows)

Index definition:

"consumer_transactions_expiry_time_remaining_amount" btree
   (expiry_time,remaining_amount)
WHERE expiry_time IS NOT NULL
  AND remaining_amount IS NOT NULL
  AND remaining_amount > 0::numeric

Analyze details:

        relname        |         last_analyze          |       last_autoanalyze      
 consumer_transactions | 2021-01-24 22:00:03.144379+00 |

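The same number of records used to be processed very quickly with low IOPS requirements, although I don't have the explain plan from the previous version, 9.6.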

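Solution: I created the same index under a different name, and that resolved the problem. I will drop the old index once I figure out why it suddenly became so slow after the upgrade.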

Explain plan with the new index:

explain (analyze true,timing true )  select consumertr0_.ref_id as col_0_0_ from consumer_transactions consumertr0_ where (consumertr0_.remaining_amount is not null) and (consumertr0_.expiry_time is not null) and consumertr0_.expiry_time>'2019-07-01T00:00:00' and consumertr0_.expiry_time<Now() and consumertr0_.remaining_amount>0 order by consumertr0_.expiry_time asc limit 20000;

            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..73592.06 rows=20000 width=24) (actual time=0.048..18.307 rows=20000 loops=1)
   Output: ref_id,expiry_time
   Buffers: shared hit=11140
   ->  Index Scan using consumer_transactions_expiry_time_remaining_amount2 on public.consumer_transactions consumertr0_  (cost=0.57..22273478.26 rows=6053275 width=24) (actual time=0.047..16.119 rows=20000 loops=1)
         Output: ref_id,expiry_time
         Index Cond: ((consumertr0_.expiry_time > '2019-07-01 00:00:00'::timestamp without time zone) AND (consumertr0_.expiry_time < Now()))
         Buffers: shared hit=11140
 Planning Time: 1.160 ms
 Execution Time: 19.600 ms
(9 rows)


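The old explain plan starts directly at an actual time of 191716.211, while the new one starts at 0.047. I don't understand where the time before 191716.211 was actually spent.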

FYI, index bloat details:

 current_database | schemaname |        tblname        |                       idxname                       |  real_size  | extra_size  |    extra_ratio    | fillfactor | bloat_size  |    bloat_ratio    | is_na
------------------+------------+-----------------------+-----------------------------------------------------+-------------+-------------+-------------------+------------+-------------+-------------------+-------
 proddb           | public     | consumer_transactions | consumer_transactions_expiry_time_remaining_amount  |  4748820480 |  3698360320 |  77.8795563145819 |         90 |  3583516672 |  75.4611947765185 | f
 proddb           | public     | consumer_transactions | consumer_transactions_expiry_time_remaining_amount2 |  1755013120 |   704552960 |  40.1451676896866 |         90 |   589709312 |  33.6014190024973 | f

Answer

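The old index is badly bloated: the scan had to touch 11048079 8kB blocks (9481343 found in shared buffers plus 1566736 read from disk) to find the matching rows, whereas the scan on the new index only needs to touch 11140 blocks.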
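To compare the two indexes on a live system, a query along the following lines may help. It is only a sketch: it assumes both indexes still exist on `consumer_transactions`, and it relies on the standard `pg_stat_user_indexes` view and the `pg_relation_size` function rather than on the bloat-estimation script used in the question.

```sql
-- On-disk size and usage counters for every index on the table.
-- A much larger size for an otherwise identical index hints at bloat.
SELECT s.indexrelname                                   AS index_name,
       pg_size_pretty(pg_relation_size(s.indexrelid))   AS on_disk_size,
       s.idx_scan,        -- number of scans started on this index
       s.idx_tup_read,    -- index entries returned by those scans
       s.idx_tup_fetch    -- live table rows fetched via the index
FROM pg_stat_user_indexes AS s
WHERE s.schemaname = 'public'
  AND s.relname    = 'consumer_transactions'
ORDER BY pg_relation_size(s.indexrelid) DESC;
```

In `EXPLAIN (ANALYZE)` output, the first number in `actual time=191716.211..192075.241` is the time until the node returned its first row, so a large startup value on an index scan means a lot of work was done before the first matching row was found.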

I am not sure how the index got into this state.
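Whatever the cause, rebuilding the index removes the bloat, which is effectively what creating the second index did. On PostgreSQL 11 there is no `REINDEX ... CONCURRENTLY` (that option arrived in version 12), so a low-lock rebuild can be sketched as below. The `_new` index name is a hypothetical placeholder, and the definition is copied from the index definition shown in the question.

```sql
-- Each statement must run outside a transaction block,
-- because the CONCURRENTLY variants cannot run inside one.

-- 1. Build a fresh copy of the index without blocking writes.
CREATE INDEX CONCURRENTLY consumer_transactions_expiry_time_remaining_amount_new
    ON public.consumer_transactions (expiry_time, remaining_amount)
    WHERE expiry_time IS NOT NULL
      AND remaining_amount IS NOT NULL
      AND remaining_amount > 0::numeric;

-- 2. Drop the bloated index once the new one is valid and in use.
DROP INDEX CONCURRENTLY public.consumer_transactions_expiry_time_remaining_amount;

-- 3. Optionally give the new index the original name.
ALTER INDEX public.consumer_transactions_expiry_time_remaining_amount_new
    RENAME TO consumer_transactions_expiry_time_remaining_amount;
```

If a concurrent build fails partway, it leaves an invalid index behind that has to be dropped before retrying.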

The second index column does not seem to be of any use.

The perfect index for this query would be:

CREATE INDEX ON public.consumer_transactions (expiry_time) INCLUDE (ref_id)
WHERE remaining_amount > 0;

If you VACUUM the table, you will get a fast index-only scan.
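A rough way to check this, assuming the covering index above has been created: vacuum the table so the visibility map is up to date, then re-run the query from the question under `EXPLAIN`. The plan node to look for is `Index Only Scan`, ideally with a low `Heap Fetches` count; the exact numbers will depend on the data.

```sql
-- Refresh the visibility map and planner statistics.
VACUUM (VERBOSE, ANALYZE) public.consumer_transactions;

-- See when the table was last vacuumed and how many dead rows remain.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'consumer_transactions';

-- Re-check the plan for the original query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT consumertr0_.ref_id AS col_0_0_
FROM consumer_transactions consumertr0_
WHERE consumertr0_.remaining_amount IS NOT NULL
  AND consumertr0_.expiry_time IS NOT NULL
  AND consumertr0_.expiry_time > '2020-12-15T00:00:00'
  AND consumertr0_.expiry_time < now()
  AND consumertr0_.remaining_amount > 0
ORDER BY consumertr0_.expiry_time ASC
LIMIT 20000;
```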