Why does this MySQL query take so long to execute, and what does "filtered" mean?

I have a query that takes a long time to execute. I would also like to know what the "filtered" column in the EXPLAIN plan means.

Below are the query, the table structure, and the EXPLAIN plan. The server version is MySQL 8.0.

SELECT  `responses`.* 
FROM `responses` 
WHERE `responses`.`survey_id` = 196690 AND (responses.time >=  '2017-01-01 08:00:00') AND (responses.time <=  '2020-10-13 13:00:58') 
ORDER BY `responses`.`id` ASC 
LIMIT 500 OFFSET 0;

CREATE TABLE `responses` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `survey_id` int(10) NOT NULL,
  `token` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `time` datetime NOT NULL,
  `ip_address` int(15) unsigned NOT NULL,
  `identity` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  `user_agent` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
  `completed` tinyint(1) DEFAULT NULL,
  `completed_time` datetime DEFAULT NULL,
  `referrer` tinytext CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  `page` tinytext CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  `visible` tinyint(1) DEFAULT '0',
  `mail_sent` tinyint(1) DEFAULT '0',
  `anonuuid` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `Metadata` json DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `survey_id` (`survey_id`),
  KEY `time` (`time`),
  KEY `index_responses_on_survey_id_and_time` (`survey_id`,`time`),
  KEY `survey_id_2` (`survey_id`,`token`),
  KEY `survey_id_3` (`survey_id`,`mail_sent`)
) ENGINE=InnoDB AUTO_INCREMENT=204788658 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC

explain 
SELECT `responses`.* 
FROM `responses` 
WHERE `responses`.`survey_id` = 196690 AND (responses.time >= '2017-01-01 08:00:00') AND (responses.time <= '2020-10-19 13:01:00') 
ORDER BY `responses`.`id` ASC 
LIMIT 500 OFFSET 0\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: responses
   partitions: NULL
         type: index
possible_keys: survey_id,time,index_responses_on_survey_id_and_time,survey_id_2,survey_id_3
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 144663
     filtered: 0.19
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Answer

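Replace TINYTEXT with VARCHAR(...). TEXT columns are essentially the same as VARCHAR, but they carry some subtle extra overhead.
Shrink the VARCHAR sizes to something "reasonable".
When you have both INDEX(survey_id) and INDEX(survey_id, ...), get rid of the former. Not only is it unnecessary, but the Optimizer sometimes picks the former when the latter would be better.
Change ORDER BY id to ORDER BY time (if that won't mess with the desired results too much). The EXPLAIN shows key: PRIMARY with type: index, so the Optimizer probably thinks that walking the table in id order, to satisfy ORDER BY id, is a better way to run the query than using INDEX(survey_id, time). Changing the ORDER BY will prevent this. I suspect that most of the table falls inside that time range, which confuses the Optimizer.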
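The suggestions above could be sketched roughly as follows. This is only an illustration under assumptions: the VARCHAR(255) sizes are placeholders (pick lengths that fit your data), and adding id as a tie-breaker after time is my own addition so that rows with identical timestamps come back in a stable order. Test on a copy first, since an ALTER on a table of this size can take a long time.

```sql
-- Drop the index that is a prefix of the other survey_id indexes,
-- and turn the TINYTEXT columns into VARCHAR.
-- VARCHAR(255) is an assumed size; TINYTEXT holds at most 255 bytes,
-- so existing values fit.
ALTER TABLE responses
  DROP INDEX survey_id,
  MODIFY referrer VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  MODIFY page     VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL;

-- Same filter as the original query, but ordered by time so that
-- INDEX(survey_id, time) can supply both the filtering and the ordering.
SELECT responses.*
FROM responses
WHERE responses.survey_id = 196690
  AND responses.time >= '2017-01-01 08:00:00'
  AND responses.time <= '2020-10-13 13:00:58'
ORDER BY responses.time ASC, responses.id ASC  -- id tie-breaker is an assumption
LIMIT 500 OFFSET 0;
```

Running EXPLAIN on the rewritten SELECT should show whether the Optimizer now picks index_responses_on_survey_id_and_time instead of PRIMARY.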

Here is a bigger change that may speed things up. Replace

  PRIMARY KEY (`id`),KEY `survey_id` (`survey_id`),KEY `time` (`time`),KEY `index_responses_on_survey_id_and_time` (`survey_id`,`time`),KEY `survey_id_2` (`survey_id`,`token`),KEY `survey_id_3` (`survey_id`,`mail_sent`)

with

  PRIMARY KEY (`survey_id`,`time`,`id`), -- 'cluster' primarily on survey_id
  KEY(id), -- to keep AUTO_INCREMENT happy
  KEY `time` (`time`),

In essence, this forces the Optimizer to use the best index, and it avoids bouncing between the index's BTree and the data's BTree 500 times.

(Also change the ORDER BY as described above.)
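One way to apply that key change is a single ALTER TABLE, sketched below. Treat it as a starting point rather than a recipe: dropping index_responses_on_survey_id_and_time is my assumption (the new primary key starts with the same columns), the `time` index and the two other survey_id_* indexes are left untouched, and rebuilding a table with roughly 200 million rows is slow and needs plenty of free disk space.

```sql
-- Re-cluster the table on (survey_id, time, id).
-- All clauses are in one statement so that the AUTO_INCREMENT column `id`
-- is still the first column of some index when the ALTER finishes.
ALTER TABLE responses
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (survey_id, time, id),              -- 'cluster' primarily on survey_id
  ADD KEY id (id),                                    -- keeps AUTO_INCREMENT happy
  DROP INDEX survey_id,                               -- prefix of the new PRIMARY KEY
  DROP INDEX index_responses_on_survey_id_and_time;   -- assumed redundant with the new PRIMARY KEY
```

Note that id is no longer declared unique after this change; if the application relies on that guarantee, ADD UNIQUE KEY id (id) could be used instead of the plain KEY.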

I don't know whether it is worth keeping these two indexes: (survey_id, token) and (survey_id, mail_sent).

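"filtered" is a crude estimate of how many rows will be kept, expressed as a percentage of the rows examined that are expected to pass the WHERE conditions. I rarely find it useful.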
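To make the number concrete, the estimate can be worked out from the rows and filtered values in the EXPLAIN output above. This is just arithmetic on the Optimizer's guesses, not a measurement of what the query actually returns:

```sql
-- rows = 144663 and filtered = 0.19 (a percentage) from the EXPLAIN output
SELECT ROUND(144663 * 0.19 / 100) AS estimated_rows_kept;
-- → 275
```

So the Optimizer expects to examine about 144,663 rows and keep about 275 of them.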

I see that you are using OFFSET. Does that mean you are "paginating"? If so, there are other issues to discuss.