How to optimize a regular-expression join query in PostgreSQL
How can I optimize the query below? It took 8 hours to run:
create table rtime.rtime_calc1_jun13tojun19 as (
    -- the plan shown below comes from running EXPLAIN on this select
    select pa.api as pa_api,
           pa.action_type as pa_action_type,
           max(rt.request_time),
           avg(rt.request_time),
           percentile_cont(0.95) within group (order by rt.request_time asc) as percentile_95
    from public.public_api pa,
         (select reqtime.*
          from public.public_api puba
          right join rtime.rtime_data1_jun13tojun19 reqtime
                  on puba.api = reqtime.proxy
          where puba.api is null) as rt -- to join only regex patterns, and to prevent exact static matches from becoming a part of regex join
    where rt.proxy ~* pa.api_regex
      and rt.method = pa.action_type
    group by pa.api, pa.action_type
)
Here is the EXPLAIN plan:
GroupAggregate  (cost=1131.43..263846.61 rows=1 width=70)
  Group Key: pa.api, pa.action_type
  ->  Nested Loop  (cost=1131.43..263846.59 rows=1 width=54)
        Join Filter: (((reqtime.proxy)::text ~* (pa.api_regex)::text) AND ((reqtime.method)::text = (pa.action_type)::text))
        ->  Index Scan using primary_key_pa on public_api pa  (cost=0.28..565.81 rows=2007 width=90)
        ->  Materialize  (cost=1131.16..263245.66 rows=1 width=49)
              ->  Gather  (cost=1131.16..263245.65 rows=1 width=49)
                    Workers Planned: 2
                    ->  Hash Anti Join  (cost=131.16..262245.55 rows=1 width=49)
                          Hash Cond: ((reqtime.proxy)::text = (puba.api)::text)
                          ->  Parallel Seq Scan on rtime_data1_jun13tojun19 reqtime  (cost=0.00..218885.08 rows=5763908 width=49)
                          ->  Hash  (cost=106.07..106.07 rows=2007 width=42)
                                ->  Seq Scan on public_api puba  (cost=0.00..106.07 rows=2007 width=42)
The public.public_api table has 2007 rows. The rtime.rtime_data1_jun13tojun19 table has 13837305 rows.
Here is the DDL for the public_api table:
CREATE TABLE public.public_api (
    api varchar NOT NULL,
    "type" varchar NULL,
    api_bin varchar NULL,
    api_bin_avg_resp_time varchar NULL,
    api_bin_perc95_resp_time varchar NULL,
    max_response_time float8 NULL,
    avg_response_time float8 NULL,
    percentile_95_response_time float8 NULL,
    max_tps int4 NULL,
    min_tps int4 NULL,
    avg_tps float8 NULL,
    percentile_90_tps float8 NULL,
    percentile_99_tps float8 NULL,
    percentile_95_tps float8 NULL,
    product varchar NULL,
    action_type varchar NOT NULL,
    proxy varchar NULL,
    CONSTRAINT primary_key_pa PRIMARY KEY (api, action_type)
);
Here is the DDL for rtime.rtime_data1_jun13tojun19:
CREATE TABLE rtime.rtime_data1_jun13tojun19 (
env varchar NULL,"method" varchar NULL,request_time float8 NULL
);
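One direction that might be worth trying, offered only as a sketch and not as a measured fix: the nested loop applies the regex filter to every pairing of an unmatched request row with a public_api row, so the regex work can potentially be reduced by evaluating each pattern once per distinct (proxy, method) pair and joining the result back with plain equality. The temporary table names rt_unmatched and rt_regex_map are hypothetical, and the columns proxy and api_regex are taken from the query above (neither appears in the DDL as posted). Whether this helps depends on how many distinct proxy values the data contains.

```sql
-- Step 1: keep only request rows whose proxy has no exact match in public_api.
-- This is the same anti join as the original subquery, written with NOT EXISTS.
create temporary table rt_unmatched as
select reqtime.proxy, reqtime.method, reqtime.request_time
from rtime.rtime_data1_jun13tojun19 reqtime
where not exists (
    select 1
    from public.public_api puba
    where puba.api = reqtime.proxy
);
-- Collect statistics so the planner sees the real row count of the temp table.
analyze rt_unmatched;

-- Step 2: run the regex match once per distinct (proxy, method) pair
-- instead of once per request row.
create temporary table rt_regex_map as
select d.proxy, d.method, pa.api, pa.action_type
from (select distinct proxy, method from rt_unmatched) d
join public.public_api pa
  on d.method = pa.action_type
 and d.proxy ~* pa.api_regex;
analyze rt_regex_map;

-- Step 3: aggregate through the mapping using equality joins only.
select m.api as pa_api,
       m.action_type as pa_action_type,
       max(rt.request_time),
       avg(rt.request_time),
       percentile_cont(0.95) within group (order by rt.request_time asc) as percentile_95
from rt_unmatched rt
join rt_regex_map m
  on m.proxy = rt.proxy
 and m.method = rt.method
group by m.api, m.action_type;
```

Running ANALYZE on the intermediate tables is deliberate: the plan above estimates rows=1 for the anti join, and a badly underestimated row count is one plausible reason the planner settled on a nested loop.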