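How to break a running sum into groups of a maximum size/length
I am trying to break a running (ordered) sum into groups with a maximum value. When I implement the following example logic...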
IF OBJECT_ID(N'tempdb..#t') IS NOT NULL DROP TABLE #t
SELECT TOP (ABS(CHECKSUM(NewId())) % 1000) ROW_NUMBER() OVER (ORDER BY name) AS ID,LEFT(CAST(NEWID() AS NVARCHAR(100)),ABS(CHECKSUM(NewId())) % 30) AS Description
INTO #t
FROM sys.objects
DECLARE @maxGroupSize INT
SET @maxGroupSize = 100
;WITH t AS (
SELECT
*,
LEN(Description) AS DescriptionLength,
SUM(LEN(Description)) OVER (/*PARTITION BY N/A */ ORDER BY ID) AS [RunningLength],
SUM(LEN(Description)) OVER (/*PARTITION BY N/A */ ORDER BY ID)/@maxGroupSize AS GroupID
FROM #t
)
SELECT *,SUM(DescriptionLength) OVER (PARTITION BY GroupID) AS SumOfGroup
FROM t
ORDER BY GroupID,ID
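I get groups that are larger than the maximum group size (length) of 100.
Answer
A recursive common table expression (rcte) is one way to solve this.
Sample data
A small, fixed sample data set.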
create table data
(
id int,description nvarchar(20)
);
insert into data (id,description) values
( 1,'qmlsdkjfqmsldk'),( 2,'mldskjf'),( 3,'qmsdlfkqjsdm'),( 4,'fmqlsdkfq'),( 5,'qdsfqsdfqq'),( 6,'mds'),( 7,'qmsldfkqsjdmfqlkj'),( 8,'qdmsl'),( 9,'mqlskfjqmlkd'),(10,'qsdqfdddffd');
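To see why the integer-division approach from the question overshoots, the sketch below applies it to this sample data. It assumes the data table defined above and a limit of 40; the variable name @group_max_length and the column aliases are chosen here for illustration. Dividing the running total by the limit buckets rows by where the running total lands, not by how much each group has accumulated, so a row that straddles a boundary carries its overhang into the next bucket.

```sql
declare @group_max_length int = 40;

with t as
(
  select d.id,
         len(d.description) as description_length,
         -- running total over all rows, ordered by id
         sum(len(d.description)) over (order by d.id) as running_length
  from data d
)
select t.id,
       t.description_length,
       t.running_length,
       -- integer division: the bucket depends only on the running total
       t.running_length / @group_max_length as group_id,
       sum(t.description_length)
         over (partition by t.running_length / @group_max_length) as group_sum
from t
order by t.id;
```

On the sample rows, ids 4 through 8 have running totals 42, 52, 55, 72 and 77, so they all get group_id 1, and their lengths add up to 9 + 10 + 3 + 17 + 5 = 44, which exceeds the limit of 40.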
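Solution
At each recursion step, a case expression evaluates (r.group_running_length + len(d.description) <= @group_max_length) to decide whether the row extends the previous group or starts a new one.
The group target size is set to 40 to better fit the sample data.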
declare @group_max_length int = 40;
with rcte as
(
select d.id,d.description,len(d.description) as description_length,len(d.description) as running_length,1 as group_id,len(d.description) as group_running_length
from data d
where d.id = 1
union all
select d.id,d.description,len(d.description),r.running_length + len(d.description),case
when r.group_running_length + len(d.description) <= @group_max_length
then r.group_id
else r.group_id + 1
end,case
when r.group_running_length + len(d.description) <= @group_max_length
then r.group_running_length + len(d.description)
else len(d.description)
end
from rcte r
join data d
on d.id = r.id + 1
)
select r.id,r.description,r.description_length,r.running_length,r.group_id,r.group_running_length,gs.group_sum
from rcte r
cross apply ( select max(r2.group_running_length) as group_sum
from rcte r2
where r2.group_id = r.group_id ) gs -- group sum
order by r.id;
Result
Contains both the running group length and the group sum for every row.
id  description        description_length  running_length  group_id  group_running_length  group_sum
--  -----------------  ------------------  --------------  --------  --------------------  ---------
 1  qmlsdkjfqmsldk                     14              14         1                    14         33
 2  mldskjf                             7              21         1                    21         33
 3  qmsdlfkqjsdm                       12              33         1                    33         33
 4  fmqlsdkfq                           9              42         2                     9         39
 5  qdsfqsdfqq                         10              52         2                    19         39
 6  mds                                 3              55         2                    22         39
 7  qmsldfkqsjdmfqlkj                  17              72         2                    39         39
 8  qdmsl                               5              77         3                     5         28
 9  mqlskfjqmlkd                       12              89         3                    17         28
10  qsdqfdddffd                        11             100         3                    28         28
Fiddle to see it in action (includes a random-data version).
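One practical caveat when applying this to larger inputs such as the question's random table: the rcte recurses once per row, and SQL Server stops a recursive CTE after 100 levels by default, so more than about 100 rows would raise an error. The sketch below is a trimmed variant of the query above, not the answer's exact code. It assumes the same data table with contiguous ids starting at 1, and adds the maxrecursion query hint to lift that limit.

```sql
declare @group_max_length int = 40;

with rcte as
(
  -- anchor: the first row opens group 1
  select d.id,
         1 as group_id,
         len(d.description) as group_running_length
  from data d
  where d.id = 1
  union all
  -- recursive step: extend the current group or start a new one
  select d.id,
         case when r.group_running_length + len(d.description) <= @group_max_length
              then r.group_id
              else r.group_id + 1
         end,
         case when r.group_running_length + len(d.description) <= @group_max_length
              then r.group_running_length + len(d.description)
              else len(d.description)
         end
  from rcte r
  join data d
    on d.id = r.id + 1 -- assumes ids are contiguous
)
select r.id, r.group_id, r.group_running_length
from rcte r
order by r.id
option (maxrecursion 0); -- 0 removes the default limit of 100 recursion levels
```

With maxrecursion 0 there is no guard against runaway recursion, so a specific upper bound (up to 32767) may be a safer choice when the row count is known.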