PostgreSQL LIMIT 1 causes a large number of loops

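We have a transactions table with approx. 10m rows, and it keeps growing. Each of our customers specifies many rules that group certain transactions together based on location, associated products, sale customer, and so on. From these rules we produce nightly reports that let them compare the prices their customers are paying for products with their own purchase prices from different price lists. These price lists change daily, and for each transaction date we have to find either the annual price that was set or the price that was effective on the transaction date.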

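These price lists can be changed retroactively and change constantly, and new historical transactions can be added as well, so we have to keep regenerating these reports within each financial year.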

We are having a problem with the two types of price list/price joins we have to do. The first is against the set annual price list.

I've left out the query that brings the transactions in; its results are placed in a table called transaction_data_6787.

EXPLAIN analyze
SELECT *
FROM transaction_data_6787 t
inner JOIN LATERAL
(
    SELECT p."Price"
    FROM "Prices" p 
    INNER JOIN "PriceLists" pl on p."PriceListId" = pl."Id"
    WHERE (pl."CustomerId" = 20)
    AND (pl."Year" = 2020)
    AND (pl."PriceListTypeId" = 2)
    AND p."ProductId" = t.product_id
    limit 1
) AS prices ON true

Nested Loop  (cost=0.70..133877.20 rows=5394 width=165) (actual time=0.521..193.638 rows=5394 loops=1)
  ->  Seq Scan on transaction_data_6787 t  (cost=0.00..159.94 rows=5394 width=145) (actual time=0.005..0.593 rows=5394 loops=1)
  ->  Limit  (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
        ->  Nested Loop  (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
              ->  Index Scan using ix_prices_covering on "Prices" p  (cost=0.42..8.44 rows=1 width=16) (actual time=0.006..0.015 rows=23 loops=5394)
                    Index Cond: (("ProductId" = t.product_id))
              ->  Index Scan using ix_pricelists_covering on "PriceLists" pl  (cost=0.28..8.30 rows=1 width=12) (actual time=0.001..0.001 rows=0 loops=122443)
                    Index Cond: (("Id" = p."PriceListId") AND ("CustomerId" = 20) AND ("PriceListTypeId" = 2))
                    Filter: ("Year" = 2020)
                    Rows Removed by Filter: 0
Planning Time: 0.307 ms
Execution Time: 193.982 ms
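
For the annual case, one rewrite that might be worth trying is to move the "one price per product" step out of the correlated subquery. The sketch below is untested against our data and is only meant to show the shape: it uses DISTINCT ON to keep one arbitrary price per product (the same guarantee LIMIT 1 without ORDER BY gives), and because the derived table no longer references t, the planner is free to start from the small filtered set of price lists. The alias annual is made up for illustration.

```sql
-- Sketch only: uncorrelated derived table instead of LATERAL ... LIMIT 1.
-- Assumes any one matching price per product is acceptable, as with the
-- original LIMIT 1 that has no ORDER BY.
SELECT t.*, annual."Price"
FROM transaction_data_6787 t
INNER JOIN
(
    -- DISTINCT ON keeps the first row of each "ProductId" group
    SELECT DISTINCT ON (p."ProductId")
           p."ProductId",
           p."Price"
    FROM "Prices" p
    INNER JOIN "PriceLists" pl ON p."PriceListId" = pl."Id"
    WHERE pl."CustomerId" = 20
      AND pl."Year" = 2020
      AND pl."PriceListTypeId" = 2
    ORDER BY p."ProductId"
) AS annual ON annual."ProductId" = t.product_id;
```

Like the INNER JOIN LATERAL version, this drops transactions whose product has no matching annual price.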

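If I remove the LIMIT 1, the execution time drops to 3 ms and the 122443 loops on ix_pricelists_covering don't happen. The reason we're doing a lateral join is that the price query is built dynamically: sometimes, instead of joining on the annual price list, we join on the effective price lists, like this: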

EXPLAIN analyze
SELECT *
FROM transaction_data_6787 t
inner JOIN LATERAL
(
    SELECT p."Price"
    FROM "Prices" p 
    INNER JOIN "PriceLists" pl on p."PriceListId" = pl."Id"
    WHERE (pl."CustomerId" = 20)
    AND (pl."PriceListTypeId" = 1)
    AND p."ProductId" = t.product_id
    and pl."ValidFromDate" <= t.transaction_date
    ORDER BY pl."ValidFromDate" desc
    limit 1
) AS prices ON true

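This is killing our performance: some queries take 20 seconds. When we drop the ORDER BY date DESC / LIMIT 1, the query completes in milliseconds, but then we can get duplicate prices.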
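
To make the "fast but duplicated" variant concrete, here is a sketch of how the plain join could be de-duplicated with DISTINCT ON instead of a per-row ORDER BY / LIMIT 1. It is a sketch under assumptions, not something we have measured: it assumes t.transaction_id (which appears in the plan output) uniquely identifies a transaction row, and whether it is actually faster will depend on how many candidate prices each transaction matches.

```sql
-- Sketch only: set-based "latest effective price per transaction".
-- Assumes transaction_id is unique per row of transaction_data_6787.
SELECT DISTINCT ON (t.transaction_id)
       t.*,
       p."Price"
FROM transaction_data_6787 t
INNER JOIN "Prices" p ON p."ProductId" = t.product_id
INNER JOIN "PriceLists" pl ON p."PriceListId" = pl."Id"
WHERE pl."CustomerId" = 20
  AND pl."PriceListTypeId" = 1
  AND pl."ValidFromDate" <= t.transaction_date
-- Within each transaction, the newest ValidFromDate sorts first and is kept.
-- Ties on ValidFromDate are broken arbitrarily.
ORDER BY t.transaction_id, pl."ValidFromDate" DESC;
```

The leading ORDER BY expressions have to match the DISTINCT ON expressions, which is why transaction_id comes first in the sort.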

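If there's a better way to join to the most recent record, we're happy to rewrite. We have thousands of price lists and around 100k prices, and there could be 100s if not 1000s of effective prices per transaction, so we need to be sure we get the price that was most recently in effect for the product on the transaction date.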

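I've found that if I denormalize the price lists/prices into a single table and add an index with ValidFromDate DESC, it seems to eliminate the loops, but I'm hesitant to denormalize and then have to maintain that data. These reports can be run ad hoc as well as in batch jobs, so we'd have to keep that data up to date in real time.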
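
A denormalized layout of this kind could be sketched as a materialized view rather than a hand-maintained table. Everything named price_lookup below is hypothetical, the sketch assumes "Prices"."Id" is unique, and INCLUDE needs PostgreSQL 11 or later. Note that a materialized view is only as fresh as its last refresh, so by itself it does not solve the real-time requirement.

```sql
-- Sketch only: flattened price lookup (names are hypothetical).
CREATE MATERIALIZED VIEW price_lookup AS
SELECT p."Id" AS "PriceId",
       pl."CustomerId",
       pl."PriceListTypeId",
       pl."ValidFromDate",
       p."ProductId",
       p."Price"
FROM "Prices" p
INNER JOIN "PriceLists" pl ON p."PriceListId" = pl."Id";

-- A unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX price_lookup_price_id ON price_lookup ("PriceId");

-- Equality columns first, then the date descending, so the newest
-- effective price for a product is the first matching index entry.
CREATE INDEX price_lookup_latest ON price_lookup
    ("CustomerId", "PriceListTypeId", "ProductId", "ValidFromDate" DESC)
    INCLUDE ("Price");

-- Re-run after price changes; readers are not blocked.
REFRESH MATERIALIZED VIEW CONCURRENTLY price_lookup;

-- The lateral lookup then touches a single relation.
SELECT t.transaction_id, latest."Price"
FROM transaction_data_6787 t
INNER JOIN LATERAL
(
    SELECT pk."Price"
    FROM price_lookup pk
    WHERE pk."CustomerId" = 20
      AND pk."PriceListTypeId" = 1
      AND pk."ProductId" = t.product_id
      AND pk."ValidFromDate" <= t.transaction_date
    ORDER BY pk."ValidFromDate" DESC
    LIMIT 1
) AS latest ON true;
```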

Update with EXPLAIN/ANALYZE:

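Below I've added the plans for the query that needs to get the price most recently in effect on the transaction date. I now see that when ...

I'm still seeing the slower query perform a large number of loops, 200k+ (when the LIMIT 1/...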

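Perhaps the better question is what we can do instead of a lateral join that would let us join to the effective price for a transaction in the most efficient way. I'm hoping to avoid denormalizing and maintaining the data, but if that's the only way, we'll do it. If there is a way to rewrite this without denormalizing, I'd really appreciate any insight.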

Nested Loop  (cost=14.21..76965.60 rows=5394 width=10) (actual time=408.948..408.950 rows=0 loops=1)
  Output: t.transaction_id,pr."Price"
  Buffers: shared hit=688022
  ->  Seq Scan on public.transaction_data_6787 t  (cost=0.00..159.94 rows=5394 width=29) (actual time=0.018..0.682 rows=5394 loops=1)
        Output: t.transaction_id
        Buffers: shared hit=106
  ->  Limit  (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
        Output: pr."Price",pl."ValidFromDate"
        Buffers: shared hit=687916
        ->  Sort  (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
              Output: pr."Price",pl."ValidFromDate"
              Sort Key: pl."ValidFromDate" DESC
              Sort Method: quicksort  Memory: 25kB
              Buffers: shared hit=687916
              ->  Nested Loop  (cost=0.70..14.20 rows=1 width=10) (actual time=0.074..0.074 rows=0 loops=5394)
                    Output: pr."Price",pl."ValidFromDate"
                    Inner Unique: true
                    Buffers: shared hit=687916
                    ->  Index Only Scan using ix_prices_covering on public."Prices" pr  (cost=0.42..4.44 rows=1 width=10) (actual time=0.007..0.019 rows=51 loops=5394)
                          Output: pr."ProductId",pr."ValidFromDate",pr."Id",pr."Price",pr."PriceListId"
                          Index Cond: (pr."ProductId" = t.product_id)
                          Heap Fetches: 0
                          Buffers: shared hit=17291
                    ->  Index Scan using ix_pricelists_covering on public."PriceLists" pl  (cost=0.28..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=273678)
                          Output: pl."Id",pl."Name",pl."CustomerId",pl."ValidFromDate",pl."PriceListTypeId"
                          Index Cond: ((pl."Id" = pr."PriceListId") AND (pl."CustomerId" = 20) AND (pl."PriceListTypeId" = 1))
                          Filter: (pl."ValidFromDate" <= t.transaction_date)
                          Rows Removed by Filter: 0
                          Buffers: shared hit=670625
Planning Time: 1.254 ms
Execution Time: 409.088 ms


Gather  (cost=6395.67..7011.99 rows=68 width=10) (actual time=92.481..92.554 rows=0 loops=1)
  Output: t.transaction_id,pr."Price"
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=1466 read=2
  ->  Hash Join  (cost=5395.67..6005.19 rows=28 width=10) (actual time=75.126..75.129 rows=0 loops=3)
        Output: t.transaction_id,pr."Price"
        Inner Unique: true
        Hash Cond: (pr."PriceListId" = pl."Id")
        Join Filter: (pl."ValidFromDate" <= t.transaction_date)
        Rows Removed by Join Filter: 41090
        Buffers: shared hit=1466 read=2
        Worker 0: actual time=64.707..64.709 rows=0 loops=1
          Buffers: shared hit=462
        Worker 1: actual time=72.545..72.547 rows=0 loops=1
          Buffers: shared hit=550 read=1
        ->  Merge Join  (cost=5374.09..5973.85 rows=3712 width=18) (actual time=26.804..61.492 rows=91226 loops=3)
              Output: t.transaction_id,t.transaction_date,pr."PriceListId"
              Merge Cond: (pr."ProductId" = t.product_id)
              Buffers: shared hit=1325 read=2
              Worker 0: actual time=17.677..51.590 rows=83365 loops=1
                Buffers: shared hit=400
              Worker 1: actual time=24.995..59.395 rows=103814 loops=1
                Buffers: shared hit=488 read=1
              ->  Parallel Index Only Scan using ix_prices_covering on public."Prices" pr  (cost=0.42..7678.38 rows=79544 width=29) (actual time=0.036..12.136 rows=42281 loops=3)
                    Output: pr."ProductId",pr."PriceListId"
                    Heap Fetches: 0
                    Buffers: shared hit=989 read=2
                    Worker 0: actual time=0.037..9.660 rows=36873 loops=1
                      Buffers: shared hit=285
                    Worker 1: actual time=0.058..13.459 rows=47708 loops=1
                      Buffers: shared hit=373 read=1
              ->  Sort  (cost=494.29..507.78 rows=5394 width=29) (actual time=9.037..14.700 rows=94555 loops=3)
                    Output: t.transaction_id,t.product_id,t.transaction_date
                    Sort Key: t.product_id
                    Sort Method: quicksort  Memory: 614kB
                    Worker 0:  Sort Method: quicksort  Memory: 614kB
                    Worker 1:  Sort Method: quicksort  Memory: 614kB
                    Buffers: shared hit=336
                    Worker 0: actual time=6.608..12.034 rows=86577 loops=1
                      Buffers: shared hit=115
                    Worker 1: actual time=8.973..14.598 rows=107126 loops=1
                      Buffers: shared hit=115
                    ->  Seq Scan on public.transaction_data_6787 t  (cost=0.00..159.94 rows=5394 width=29) (actual time=0.020..2.948 rows=5394 loops=3)
                          Output: t.transaction_id,t.transaction_date
                          Buffers: shared hit=318
                          Worker 0: actual time=0.017..2.078 rows=5394 loops=1
                            Buffers: shared hit=106
                          Worker 1: actual time=0.027..2.976 rows=5394 loops=1
                            Buffers: shared hit=106
        ->  Hash  (cost=21.21..21.21 rows=30 width=8) (actual time=0.145..0.145 rows=35 loops=3)
              Output: pl."Id",pl."ValidFromDate"
              Buckets: 1024  Batches: 1  Memory Usage: 10kB
              Buffers: shared hit=53
              Worker 0: actual time=0.137..0.138 rows=35 loops=1
                Buffers: shared hit=18
              Worker 1: actual time=0.149..0.150 rows=35 loops=1
                Buffers: shared hit=18
              ->  Bitmap Heap Scan on public."PriceLists" pl  (cost=4.59..21.21 rows=30 width=8) (actual time=0.067..0.114 rows=35 loops=3)
                    Output: pl."Id",pl."ValidFromDate"
                    Recheck Cond: (pl."CustomerId" = 20)
                    Filter: (pl."PriceListTypeId" = 1)
                    Rows Removed by Filter: 6
                    Heap Blocks: exact=15
                    Buffers: shared hit=53
                    Worker 0: actual time=0.068..0.108 rows=35 loops=1
                      Buffers: shared hit=18
                    Worker 1: actual time=0.066..0.117 rows=35 loops=1
                      Buffers: shared hit=18
                    ->  Bitmap Index Scan on "IX_PriceLists_CustomerId"  (cost=0.00..4.58 rows=41 width=0) (actual time=0.049..0.049 rows=41 loops=3)
                          Index Cond: (pl."CustomerId" = 20)
                          Buffers: shared hit=8
                          Worker 0: actual time=0.053..0.054 rows=41 loops=1
                            Buffers: shared hit=3
                          Worker 1: actual time=0.048..0.048 rows=41 loops=1
                            Buffers: shared hit=3
Planning Time: 2.236 ms
Execution Time: 92.814 ms
