How to avoid duplicates in a Hive query
I have two tables:
table1
the_date | my_id
02/03/2021,123
02/03/2021,1234
02/03/2021,12345
table2
the_date | my_id | seq | txt
02/03/2021,1234,1,'OK'
02/03/2021,12345,2,'HELLO HI THERE'
02/03/2021,123456,'Ok'
Here is my code:
WITH AB AS (
    SELECT A1.the_date, A1.my_id
    FROM DB1.table1 A1
    JOIN DB1.MSG_REC A2 ON A1.my_id = A2.my_id
), BC AS (
    SELECT AB.the_date,
           COUNT(DISTINCT (CASE WHEN (TXT like '%OK%') THEN AB.my_id ELSE NULL END)) AS CASE1,
           COUNT(DISTINCT (CASE WHEN (TXT like '%HELLO HI THERE%') THEN AB.my_id ELSE NULL END)) AS CASE2
    FROM AB LEFT JOIN DB1.my_id BC ON AB.my_id = BC.my_id
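The problem with the code above is that the value '12345' is counted twice, because it satisfies both CASE expressions.
This duplicates data when the count metrics are captured. Is there a way to evaluate the first case first and then the second, while excluding from the second any my_id records already picked up by the first?
For example, when the script above runs and the first case is evaluated, it picks up the records below and the count is 3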
02/03/2021,'OK'
02/03/2021,'OK'
02/03/2021,'Ok'
The second case should then only go through the following record, so its count is just 1
02/03/2021,'HELLO HI THERE'
If I don't add a condition to work around this, CASE1 will be 4 and CASE2 will be 2. Any tips or suggestions?
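Solution
Assign a single case to each ID before the DISTINCT aggregation, then do the distinct aggregation. This way the same ID can no longer be counted under different cases. See the comments in the code: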
select --do final distinct aggregation
       count(distinct (case when assigned_case = 'CASE1' then my_id else null end)) as CASE1,
       count(distinct (case when assigned_case = 'CASE2' then my_id else null end)) as CASE2
from
(
select my_id,
       --assign single CASE to all rows with the same id based on some logic:
       --CASE1 is checked first, so an id that matches both patterns is assigned to CASE1 only
       case when case1_flag = 1 then 'CASE1'
            when case2_flag = 1 then 'CASE2'
            else NULL
       end as assigned_case
from
(--calculate all CASE flags for each ID
select AB.my_id,
       max(CASE WHEN (TXT like '%OK%') THEN 1 ELSE NULL END) over (partition by AB.my_id) as case1_flag,
       max(CASE WHEN (TXT like '%HELLO HI THERE%') THEN 1 ELSE NULL END) over (partition by AB.my_id) as case2_flag
from ...
) s
) s
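To try the idea without the real tables, here is a minimal self-contained sketch. It is a variant of the query above, not the same query: it collapses the rows to one per ID with GROUP BY instead of a window function. The CTE names msgs and flags and the sample rows are hypothetical, made up for illustration only, and the per-date grouping from the question is left out for brevity.

```sql
-- Hypothetical sample rows standing in for the joined data (not the original tables)
with msgs as (
    select 1 as my_id, 'OK' as txt
    union all select 1, 'HELLO HI THERE'   -- id 1 matches both patterns
    union all select 2, 'OK'
    union all select 3, 'HELLO HI THERE'
    union all select 4, 'something else'   -- matches neither pattern
),
flags as (
    -- one row per id, with a 0/1 flag per pattern
    select my_id,
           max(case when txt like '%OK%' then 1 else 0 end) as case1_flag,
           max(case when txt like '%HELLO HI THERE%' then 1 else 0 end) as case2_flag
    from msgs
    group by my_id
)
select
    -- CASE1 has priority: an id flagged for both patterns is counted only here
    sum(case when case1_flag = 1 then 1 else 0 end) as CASE1,
    -- CASE2 counts only ids that did not already fall into CASE1
    sum(case when case1_flag = 0 and case2_flag = 1 then 1 else 0 end) as CASE2
from flags;
```

With these sample rows, id 1 and id 2 fall into CASE1 and only id 3 falls into CASE2, so the query should return CASE1 = 2 and CASE2 = 1. Keep in mind that LIKE in Hive is case-sensitive, so '%OK%' does not match a value such as 'Ok'. If such rows should count, comparing lower(txt) against a lowercase pattern is one option.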