Why and how is a SQL condition missing from the psql EXPLAIN plan?

I am trying to analyze the query plans of the Join Order Benchmark (https://github.com/gregrahn/join-order-benchmark).

For example, I run the following command:

EXPLAIN SELECT *
FROM aka_name AS an,cast_info AS ci,company_name AS cn,keyword AS k,movie_companies AS mc,movie_keyword AS mk,name AS n,title AS t
WHERE an.person_id = n.id
  AND n.id = ci.person_id
  AND ci.movie_id = t.id
  AND t.id = mk.movie_id
  AND mk.keyword_id = k.id
  AND t.id = mc.movie_id
  AND mc.company_id = cn.id
  AND an.person_id = ci.person_id
  AND ci.movie_id = mc.movie_id
  AND ci.movie_id = mk.movie_id
  AND mc.movie_id = mk.movie_id;

As a result, I get the following query plan:

                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1973375.70..22192463.47 rows=22337517790 width=449)
   Hash Cond: (ci.movie_id = t.id)
   ->  Merge Join  (cost=102.03..2617413.84 rows=88800840 width=203)
         Merge Cond: (n.id = an.person_id)
         ->  Merge Join  (cost=0.87..2341713.60 rows=36244344 width=130)
               Merge Cond: (ci.person_id = n.id)
               ->  Index Scan using person_id_cast_info on cast_info ci  (cost=0.44..1714393.60 rows=36244344 width=56)
               ->  Index Scan using name_pkey on name n  (cost=0.43..163847.25 rows=4167379 width=74)
         ->  Materialize  (cost=0.42..69770.80 rows=901343 width=73)
               ->  Index Scan using person_id_aka_name on aka_name an  (cost=0.42..67517.44 rows=901343 width=73)
   ->  Hash  (cost=834975.33..834975.33 rows=24906348 width=246)
         ->  Hash Join  (cost=486218.85..834975.33 rows=24906348 width=246)
               Hash Cond: (mk.movie_id = t.id)
               ->  Hash Join  (cost=4885.82..131552.82 rows=4523930 width=37)
                     Hash Cond: (mk.keyword_id = k.id)
                     ->  Seq Scan on movie_keyword mk  (cost=0.00..69693.30 rows=4523930 width=12)
                     ->  Hash  (cost=2290.70..2290.70 rows=134170 width=25)
                           ->  Seq Scan on keyword k  (cost=0.00..2290.70 rows=134170 width=25)
               ->  Hash  (cost=372278.91..372278.91 rows=2609129 width=209)
                     ->  Hash Join  (cost=141184.56..372278.91 rows=2609129 width=209)
                           Hash Cond: (mc.movie_id = t.id)
                           ->  Hash Join  (cost=11266.43..106748.81 rows=2609129 width=115)
                                 Hash Cond: (mc.company_id = cn.id)
                                 ->  Seq Scan on movie_companies mc  (cost=0.00..44881.29 rows=2609129 width=40)
                                 ->  Hash  (cost=5344.97..5344.97 rows=234997 width=75)
                                       ->  Seq Scan on company_name cn  (cost=0.00..5344.97 rows=234997 width=75)
                           ->  Hash  (cost=61280.28..61280.28 rows=2528228 width=94)
                                 ->  Seq Scan on title t  (cost=0.00..61280.28 rows=2528228 width=94)
 JIT:

As you can see, the condition mc.movie_id = mk.movie_id does not appear anywhere in this plan. How and why is that possible?

Answer

Look at the last three conditions:

AND ci.movie_id = mc.movie_id
AND ci.movie_id = mk.movie_id
AND mc.movie_id = mk.movie_id;

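Using movie_id, you match table ci with mc, and then ci with mk, which implies that mc matches mk. The last condition is therefore redundant, and the planner rightly ignores it. The plan shows the same thing from another angle: the Hash Cond lines for ci, mk, and mc each compare movie_id against t.id, so every other pairwise equality between those columns is already guaranteed and does not need to be checked again.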
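The transitivity argument can be checked mechanically. The following is a minimal sketch, not PostgreSQL's actual implementation: it groups the columns from the WHERE clause into equivalence classes with a small union-find and asks whether the last condition already follows from the others. The helper names (`find`, `union`, `implied`) are hypothetical and exist only for this illustration.

```python
# Sketch: group join columns into equivalence classes with a tiny union-find.
# Helper names are hypothetical; this is not how PostgreSQL is implemented.

def find(parent, x):
    """Return the representative of x's class, registering x if unseen."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps chains short
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the classes that contain a and b."""
    parent[find(parent, a)] = find(parent, b)


def implied(conditions, a, b):
    """True if a = b follows from the given equality conditions."""
    parent = {}
    for left, right in conditions:
        union(parent, left, right)
    return find(parent, a) == find(parent, b)


# Every equality from the WHERE clause except the last one
# (mc.movie_id = mk.movie_id), which is the one under test.
CONDITIONS = [
    ("an.person_id", "n.id"),
    ("n.id", "ci.person_id"),
    ("ci.movie_id", "t.id"),
    ("t.id", "mk.movie_id"),
    ("mk.keyword_id", "k.id"),
    ("t.id", "mc.movie_id"),
    ("mc.company_id", "cn.id"),
    ("an.person_id", "ci.person_id"),
    ("ci.movie_id", "mc.movie_id"),
    ("ci.movie_id", "mk.movie_id"),
]

if __name__ == "__main__":
    # The dropped condition is already implied by the remaining ones.
    print(implied(CONDITIONS, "mc.movie_id", "mk.movie_id"))  # True
    # An unrelated pair is not implied.
    print(implied(CONDITIONS, "mc.movie_id", "k.id"))  # False
```

All four movie_id columns end up in one class together with t.id, so the planner only needs enough join clauses to connect the members of that class and can leave the remaining pairwise comparisons out of the plan.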

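JGH has answered the question.

This is not an answer, but it is the proper way to write joins (INNER is not strictly necessary, but I prefer to be explicit). Writing FROM table1, table2 WHERE ... is a bad habit you should drop right away. That syntax is not very flexible, and it can make even simple queries almost unreadable.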

SELECT *
  FROM aka_name AS an
 INNER
  JOIN name AS n
    ON an.person_id = n.id
 INNER
  JOIN cast_info AS ci
    ON an.person_id = ci.person_id
   AND n.id = ci.person_id
 INNER
  JOIN title AS t
    ON ci.movie_id = t.id
 INNER
  JOIN movie_keyword AS mk
    ON t.id = mk.movie_id
   AND ci.movie_id = mk.movie_id
 INNER
  JOIN movie_companies AS mc
    ON t.id = mc.movie_id
   AND ci.movie_id = mc.movie_id
   AND mc.movie_id = mk.movie_id
 INNER
  JOIN keyword AS k
    ON mk.keyword_id = k.id
 INNER
  JOIN company_name AS cn
    ON mc.company_id = cn.id;
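For inner joins the two spellings describe the same result set, so switching to explicit JOIN syntax is a readability choice and not a semantic one. A small sketch of that claim, using Python's built-in sqlite3 module rather than PostgreSQL, with a made-up three-table schema and toy rows (the `name` and `word` columns and all data are assumptions for illustration only, not the benchmark's schema):

```python
import sqlite3


def run_both():
    """Run the same inner join in comma syntax and in explicit JOIN syntax."""
    conn = sqlite3.connect(":memory:")
    # Toy schema and rows, invented for this example.
    conn.executescript("""
        CREATE TABLE title (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE keyword (id INTEGER PRIMARY KEY, word TEXT);
        CREATE TABLE movie_keyword (movie_id INTEGER, keyword_id INTEGER);
        INSERT INTO title VALUES (1, 'A'), (2, 'B');
        INSERT INTO keyword VALUES (10, 'x'), (11, 'y');
        INSERT INTO movie_keyword VALUES (1, 10), (1, 11), (2, 10), (3, 11);
    """)
    implicit = conn.execute("""
        SELECT t.name, k.word
          FROM title AS t, movie_keyword AS mk, keyword AS k
         WHERE t.id = mk.movie_id
           AND mk.keyword_id = k.id
         ORDER BY t.name, k.word
    """).fetchall()
    explicit = conn.execute("""
        SELECT t.name, k.word
          FROM title AS t
         INNER JOIN movie_keyword AS mk ON t.id = mk.movie_id
         INNER JOIN keyword AS k ON mk.keyword_id = k.id
         ORDER BY t.name, k.word
    """).fetchall()
    conn.close()
    return implicit, explicit


if __name__ == "__main__":
    implicit, explicit = run_both()
    print(implicit == explicit)  # True
    print(explicit)  # [('A', 'x'), ('A', 'y'), ('B', 'x')]
```

The movie_keyword row with movie_id 3 has no matching title, so both forms drop it. The explicit form keeps each join condition next to the table it introduces.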
