View to get the minimum date with complex conditions


I have a table like this in SQL Server:

+----------+-----------+------------+
| DateFrom | Completed | EmployeeID |
+----------+-----------+------------+
DateFrom: date not null -- unique for each EmployeeID
Completed: bit not null
EmployeeID: bigint not null
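  • Each row belongs to a sub-period defined by its start date, and it may or may not be completed.
  • Each employee can have multiple sub-periods.
  • A period is an ordered sequence of sub-periods that runs until its last sub-period is completed.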
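
For anyone who wants to reproduce the examples, the table could be declared roughly as follows. This is only a sketch: the name TableDates is taken from the query in the question, while the primary key and its constraint name are assumptions based on the note that DateFrom is unique for each EmployeeID. The inserted rows are the ones from the third example in the question.

```sql
-- Sketch of the table described above.
-- PK_TableDates is a hypothetical constraint name; the key itself is an
-- assumption derived from "DateFrom is unique for each EmployeeID".
CREATE TABLE TableDates (
    DateFrom   date   NOT NULL,
    Completed  bit    NOT NULL,
    EmployeeID bigint NOT NULL,
    CONSTRAINT PK_TableDates PRIMARY KEY (EmployeeID, DateFrom)
);

-- Rows from the third example in the question (0 = false, 1 = true)
INSERT INTO TableDates (DateFrom, Completed, EmployeeID)
VALUES ('2021-01-01', 0, 1),
       ('2021-01-05', 1, 1),
       ('2021-01-09', 0, 1),
       ('2021-01-10', 1, 1),
       ('2021-01-07', 0, 2),
       ('2021-01-15', 1, 2);
```

A key on (EmployeeID, DateFrom) also supports the per-employee ordering by date that window-function solutions rely on.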

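I want to create a view that returns the start date of the last period for each EmployeeID, as follows:

  1. If no row has Completed = true, return the minimum DateFrom. [The employee has a single period that is not yet completed]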
+----------+-----------+------------+
| DateFrom | Completed | EmployeeID |
+----------+-----------+------------+
|2021-01-01|   false   |     1      |
|2021-01-05|   false   |     1      |
|2021-01-09|   false   |     1      |
|2021-01-10|   false   |     1      |
|2021-01-07|   false   |     2      |
|2021-01-15|   false   |     2      |
+----------+-----------+------------+

Expected Result:
2021-01-01 for EmployeeID = 1
2021-01-07 for EmployeeID = 2
  2. Otherwise, return the minimum DateFrom after the last row where Completed is true. [The last period is not yet completed]
+----------+-----------+------------+
| DateFrom | Completed | EmployeeID |
+----------+-----------+------------+
|2021-01-01|   false   |     1      |
|2021-01-05|   true    |     1      |
|2021-01-09|   false   |     1      |
|2021-01-10|   false   |     1      |
|2021-01-07|   true    |     2      |
|2021-01-15|   false   |     2      |
+----------+-----------+------------+

Expected Result:
2021-01-09 for EmployeeID = 1
2021-01-15 for EmployeeID = 2
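  3. If the row with the maximum DateFrom has Completed = true, return the minimum DateFrom that lies before that last Completed = true row and after the previous Completed = true row (if one exists). [The last period was completed over several sub-periods]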
+----------+-----------+------------+
| DateFrom | Completed | EmployeeID |
+----------+-----------+------------+
|2021-01-01|   false   |     1      |
|2021-01-05|   true    |     1      |
|2021-01-09|   false   |     1      |
|2021-01-10|   true    |     1      |
|2021-01-07|   false   |     2      |
|2021-01-15|   true    |     2      |
+----------+-----------+------------+

Expected Result:
2021-01-09 for EmployeeID = 1
2021-01-07 for EmployeeID = 2
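  4. If the row with the maximum DateFrom has Completed = true and either there are no other rows or the row before it also has Completed = true, return the maximum DateFrom. [The last period was completed in a single sub-period]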
+----------+-----------+------------+
| DateFrom | Completed | EmployeeID |
+----------+-----------+------------+
|2021-01-01|   false   |     1      |
|2021-01-05|   false   |     1      |
|2021-01-09|   true    |     1      |
|2021-01-10|   true    |     1      |
|2021-01-07|   true    |     2      |
+----------+-----------+------------+

Expected Result:
2021-01-10 for EmployeeID = 1
2021-01-07 for EmployeeID = 2

I am looking for the most optimized solution.

I tried the following, but I get a NULL value in the third example:

WITH T AS (
    SELECT EmployeeID,MAX(CASE WHEN Completed = 0 THEN NULL ELSE DateFrom END) MaxDateFrom 
    FROM TableDates
    GROUP BY EmployeeID
)
SELECT TableDates.EmployeeID,MIN(TableDates.DateFrom) DateFrom
FROM T
LEFT JOIN TableDates ON T.EmployeeID = TableDates.EmployeeID
    AND (T.MaxDateFrom IS NULL OR TableDates.DateFrom > T.MaxDateFrom)
GROUP BY TableDates.EmployeeID

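Solution

I think you just want conditional aggregation, with a bunch of logic. Assuming you have a row for every day, I think this does what you want: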

select employeeid,
       -- min()/max() do not accept the bit type in SQL Server, hence the casts to int
       (case when -- case 4: every row is completed
                  min(cast(completed as int)) = 1
             then max(datefrom)
             when -- case 1: no row is completed
                  max(cast(completed as int)) = 0
             then min(datefrom)
             when -- case 3: the latest row is completed
                  max(datefrom) = max(case when completed = 'true' then datefrom end)
             then min(case when completed_seqnum = 1 then datefrom end)
             -- case 2: the day after the last completed row (relies on one row per day)
             else dateadd(day, 1, max(case when completed = 'true' then datefrom end))
        end) as datefrom
from (select t.*,
             -- running count of completed rows, counted from the newest row backwards
             sum(case when completed = 'true' then 1 else 0 end) over (partition by employeeid order by datefrom desc) as completed_seqnum
      from t
     ) t
group by employeeid;

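Requiring one row per day is really just a convenience: it lets the code add one day to get the date after a particular "true" value. This can also be done with lead() in a subquery.

Note: This does not handle all conditions (at least not with a non-NULL date). For example, it returns NULL when the data ends with a run of "true" values.

If that is a problem, note that this version of your question has been answered. Ask a new question with appropriate sample data and the desired results. I also think you may be able to explain the problem you are trying to solve and simplify the explanation.

EDIT:

If there are missing dates, you can use: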

select employeeid,
       -- min()/max() do not accept the bit type in SQL Server, hence the casts to int
       (case when -- case 4: every row is completed
                  min(cast(completed as int)) = 1
             then max(datefrom)
             when -- case 1: no row is completed
                  max(cast(completed as int)) = 0
             then min(datefrom)
             when -- case 3: the latest row is completed
                  max(datefrom) = max(case when completed = 'true' then datefrom end)
             then min(case when completed_seqnum = 1 then datefrom end)
             -- case 2: the row that follows the last completed row
             else max(case when completed = 'true' then next_datefrom end)
        end) as datefrom
from (select t.*,
             -- date of the next row for the same employee, whatever the gap
             lead(datefrom) over (partition by employeeid order by datefrom) as next_datefrom,
             -- running count of completed rows, counted from the newest row backwards
             sum(case when completed = 'true' then 1 else 0 end) over (partition by employeeid order by datefrom desc) as completed_seqnum
      from t
     ) t
group by employeeid;
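
Since the question asks for a view, here is a hedged sketch of an alternative that avoids the per-case logic. It is not the query from this answer but a different technique: each row is numbered by how many completed rows precede it, so rows sharing a number belong to the same period, and the earliest row of the highest-numbered period is returned. The view name EmployeeLastPeriodStart and the column names PeriodNo and rn are hypothetical; TableDates is the table name used in the question. Traced by hand against the four examples in the question, it yields the expected dates.

```sql
-- Sketch only: EmployeeLastPeriodStart, PeriodNo and rn are made-up names.
CREATE VIEW EmployeeLastPeriodStart
AS
WITH Numbered AS (
    SELECT EmployeeID,
           DateFrom,
           -- Count of completed rows strictly before this row.
           -- The frame is empty for an employee's first row, so SUM
           -- returns NULL there and COALESCE turns it into 0.
           COALESCE(SUM(CAST(Completed AS int)) OVER (
                        PARTITION BY EmployeeID
                        ORDER BY DateFrom
                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS PeriodNo
    FROM TableDates
),
Ranked AS (
    SELECT EmployeeID,
           DateFrom,
           -- rn = 1 marks the earliest row of the latest period
           ROW_NUMBER() OVER (
               PARTITION BY EmployeeID
               ORDER BY PeriodNo DESC, DateFrom ASC) AS rn
    FROM Numbered
)
SELECT EmployeeID, DateFrom
FROM Ranked
WHERE rn = 1;
```

Because a completed row closes its own period, it keeps the same number as the rows before it, and only the rows after it start a new one. That is why a trailing run of completed rows resolves to the last row's DateFrom.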

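Here is a working query. It may be overly complicated, but I leave the simplification to you.

Handle the scenarios as follows, all partitioned by EmployeeId as required:

  1. When no row with Completed=1 exists: detect this with sum(Completed) over(), then use first_value(DateFrom).

  2. When the last row has Completed=1 and the previous row has Completed=0: detect this with last_value(Completed) and lag(Completed), then use max(case when Completed = 0 then DateFrom else null end).

  3. The tricky case, when a Completed=1 row exists and it is not the last row. In this case, find the DateFrom of the most recent row with Completed=1, then find min(DateFrom) over all rows more recent than that row, up to the preceding Completed=1.

  4. If the last row has Completed=1 and the second-to-last row has Completed=1, use the DateFrom of the last row. The coalesce ensures this when all other options are null.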

insert into @Test (EmployeeId,DateFrom,Completed)
values
-- Scenario 1
(1,'2021-01-01',0),(1,'2021-01-02','2021-01-03',-- Scenario 2
(2,(2,1),'2021-01-04',-- Scenario 3
(3,(3,-- Special case,single row
(4,-- Scenario 4
(5,(5,1);

with cte as (
  select *
    -- First value of DateFrom over all rows (not the default)
    , first_value (DateFrom) over (partition by EmployeeId order by DateFrom asc rows between unbounded preceding and unbounded following) FirstDateFrom
    -- Last value of Completed over all rows (not the default)
    , last_value (Completed) over (partition by EmployeeId order by DateFrom asc rows between unbounded preceding and unbounded following) LastCompleted
    -- Find the Date of the last row with Completed = 1
    , max (case when Completed = 1 then DateFrom else null end) over (partition by EmployeeId order by DateFrom asc rows between unbounded preceding and unbounded following) LastCompletedNew
    -- Regular row number
    , row_number() over (partition by EmployeeId order by DateFrom desc) RowNumber
    -- Total number of rows with Completed = 1
    , sum(convert(int,Completed)) over (partition by EmployeeId) SumOfCompleted
    -- Max value of DateFrom where Completed = 0
    , max(case when Completed = 0 then DateFrom else null end) over (partition by EmployeeId order by DateFrom asc rows between unbounded preceding and unbounded following) MaxDateFrom
    -- Check the lagged complete to see if the last 2 rows are completed = 1
    , lag(Completed) over (partition by EmployeeId order by DateFrom asc) LaggedComplete
    -- Borrowed from Gordon to check which rows are prior to the last Completed = 1 and before the preceding Completed = 1
    , sum(case when completed = 1 then 1 else 0 end) over (partition by employeeid order by datefrom desc) as completed_seqnum
  from @Test
)
select
  EmployeeId
  -- Use the only DateFrom if there is only one
  , coalesce(case
    -- Scenario 1
    when SumOfCompleted = 0 then FirstDateFrom
    when LastCompleted = 1 then
      case
      -- Scenario 4
      when coalesce(LaggedComplete,0) = 1 then DateFrom
      -- Scenario 3
      else Scenario3
      end
    -- Scenario 2
    else ActualResult
    end, DateFrom) FinalResult
  --, * -- Uncomment for working
from (
  select *
    -- Find the lowest DateFrom which is greater than the DateFrom of the last row where Completed = 1
    , min(case when DateFrom > LastCompletedNew then DateFrom else null end) over (partition by EmployeeId) ActualResult
    -- Find the min DateFrom over the rows between the last Completed=1 and the Completed=1 before it (if it exists)
    , min(case when completed_seqnum = 1 then DateFrom else null end) over (partition by EmployeeId order by DateFrom asc rows between unbounded preceding and unbounded following) Scenario3
  from cte
) x
-- Because we have calculated the same result for every row we just take the first
where RowNumber = 1
order by x.EmployeeId asc, x.DateFrom asc;

Note: This assumes there is only one row per date.
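
That assumption can be checked before relying on any of the queries. A minimal sketch, assuming the data lives in the TableDates table named in the question (the alias RowsPerDate is made up for illustration):

```sql
-- Lists every EmployeeID/DateFrom pair that occurs more than once.
-- An empty result means the one-row-per-date assumption holds.
SELECT EmployeeID, DateFrom, COUNT(*) AS RowsPerDate
FROM TableDates
GROUP BY EmployeeID, DateFrom
HAVING COUNT(*) > 1;
```

If the table enforces uniqueness on (EmployeeID, DateFrom) through a key or unique constraint, this query will always come back empty and the check is unnecessary.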
