在ORACLE 中,我们常常推荐使用exists 来替代in,往往也能取得比较好的优化效果。在ORACLE应用迁往MYSQL的过程中,我们发现部分in 的子查询语句带到MysqL中,其执行效率变得非常低下,这很让人觉得匪夷所思。于是,我分析了一波。
对两个表,分别是一大一小进行关联查询:
MysqL> select count(*) from users; +----------+ | count(*) | +----------+ | 19 | +----------+ 1 row in set MysqL> select count(*) from orders; +----------+ | count(*) | +----------+ | 86310 | +----------+ 1 row in set MysqL>
开启profile,发现无论是子查询是大表还是小表 ,exists的语句总是比in执行慢:
MysqL> show profiles; +----------+------------+--------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+--------------------------------------------------------------------------------------------+ | 1 | 1.08661625 | select count(1) from orders o where o.user_id in(select u.id from users u) | | 2 | 1.56956275 | select count(1) from orders o where exists (select 1 from users u where u.id = o.user_id) | | 3 | 0.81266425 | select count(1) from users u where u.id in(select o.user_id from orders o) | | 4 | 8.4164905 | select count(1) from users u where exists (select 1 from orders o where u.id = o.user_id) | +----------+------------+--------------------------------------------------------------------------------------------+ 4 rows in set
而查看exists语句的profile内容,发现其存在多个executing 和sending data过程,这是整个sql执行的主要耗时过程:
MysqL> show profile for query 2 ; +----------------------------+----------+ | Status | Duration | +----------------------------+----------+ | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.5E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 1E-6 | | Sending data | 1.3E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.7E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.1E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 1E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.1E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.2E-5 | | executing | 2E-6 | | Sending data | 1.5E-5 | | end | 3E-6 | | query end | 3E-6 | | waiting for handler commit | 1E-5 | | closing tables | 9E-6 | | freeing items | 0.000152 | | cleaning up | 1.7E-5 | +----------------------------+----------+ 100 rows in set
而在in 子查询中,sending data这个过程只有一次,这也是整个sql执行主要耗时地方:
MysqL> show profile for query 1; +--------------------------------+----------+ | Status | Duration | +--------------------------------+----------+ | starting | 9.3E-5 | | Executing hook on transaction | 6E-6 | | starting | 8E-6 | | checking permissions | 5E-6 | | checking permissions | 4E-6 | | opening tables | 0.004849 | | init | 1.8E-5 | | System lock | 1.4E-5 | | optimizing | 1.4E-5 | | statistics | 3.1E-5 | | preparing | 2.2E-5 | | executing | 3E-6 | | Sending data | 1.081273 | 《《《《《《《《《《《《《《《《《《《 | end | 1.3E-5 | | query end | 3E-6 | | waiting for handler commit | 1E-5 | | closing tables | 5.2E-5 | | freeing items | 0.000171 | | cleaning up | 2.9E-5 | +--------------------------------+----------+
关于sending data和executing 解析:
Sending data
The thread is reading and processing rows for a SELECT statement, and sending data to the client. Because operations occurring during this state tend to perform large amounts of disk access (reads), it is often the longest-running state over the lifetime of a given query.
说Sending data 是:线程正在为一个select语句读取和处理行,并且发送数据到客户端。因为这期间操作倾向于大量的磁盘访问(读取),所以这常是整个查询周期中运行时间最长的阶段。(这是MysqL 5.5的解释,5.7的解释完全一样,但是5.7多了一个Sending to client状态)
这样就清楚了,Sending data 做了 读取,处理(过滤,排序等。。)和发送 三件事情,接下来再看该状态下的cpu 和 io 信息 以分析语句的瓶颈是 读取还是处理 ,再做相应的优化调整。。
executing
The thread has begun executing a statement.
在exists语句中不断地executing和Sending data 应该是在不断地扫描抓取数据进行匹配,那这应该与MysqL 的算法有关,通过trace查看其实是在多次子查询 join_execution:
{ "subselect_execution": { "select#": 2, "steps": [ { "join_execution": { "select#": 2, "steps": [ ] /* steps */ } /* join_execution */ } ] /* steps */ } /* subselect_execution */ },
想知道这是鸡肋吗,不知有没有相关的解析?
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。