Filling day gaps in two-dimensional time series data in MongoDB with aggregation


I have a collection of two-dimensional time series data like this:

[
  { 
    "value" : 9,"timestamp" : "2020-12-30T02:06:33.000+0000","recipeId" : 15
  },{ 
    "value" : 2,"timestamp" : "2020-12-30T12:04:23.000+0000","recipeId" : 102
  },{ 
    "value" : 5,"timestamp" : "2020-12-30T15:09:23.000+0000",...
]

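Each record has a recipeId, which is the first-level grouping I am looking for. All values of a recipe for one day should be summed. I want an array of time series, one per recipeId, and days with no data need to be filled with 0. I want to build this structure for a given start and end date range.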

For the date range 2020-12-29 to 2020-12-31, the result should look something like this:

[
 [
  { 
    "sum" : 0,"timestamp" : "2020-12-29",{ 
    "sum" : 9,"timestamp" : "2020-12-30",{ 
    "sum" : 0,"timestamp" : "2020-12-31",...
 ],[
  { 
    "sum" : 0,"recipeId" : 0
  },{ 
    "sum" : 7,...
 ]
]
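
The shape described above can also be written down as a small in-memory reference in plain JavaScript, which may help when checking what a pipeline returns. This is only a sketch, not an aggregation: the function name `fillDailySums` is hypothetical, timestamps are assumed to be strings that `Date` can parse, days are taken in UTC, and records outside the range are ignored.

```javascript
// Hypothetical reference helper (not from the original question).
// Sums `value` per recipeId per UTC day and fills missing days with 0.
function fillDailySums(records, startDay, endDay) {
  const DAY_MS = 24 * 60 * 60 * 1000;

  // Every day in the inclusive range, formatted as YYYY-MM-DD.
  const days = [];
  for (let t = Date.parse(startDay); t <= Date.parse(endDay); t += DAY_MS) {
    days.push(new Date(t).toISOString().slice(0, 10));
  }

  // recipeId -> (day -> summed value)
  const sums = new Map();
  for (const { value, timestamp, recipeId } of records) {
    const day = new Date(timestamp).toISOString().slice(0, 10);
    if (!days.includes(day)) continue; // outside the requested range
    if (!sums.has(recipeId)) sums.set(recipeId, new Map());
    const perDay = sums.get(recipeId);
    perDay.set(day, (perDay.get(day) || 0) + value);
  }

  // One array per recipeId, one entry per day, 0 where nothing was recorded.
  return [...sums.entries()].map(([recipeId, perDay]) =>
    days.map((day) => ({ sum: perDay.get(day) || 0, timestamp: day, recipeId }))
  );
}
```

Calling `fillDailySums(records, "2020-12-29", "2020-12-31")` returns one inner array per recipeId, each with exactly three entries. A recipe with no records inside the range does not appear at all.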

This is what I have so far. It only partially solves my requirements, and I cannot get the last few stages right:

[
  {
    "$match": {
      "timestamp": {
        "$gte": "2020-12-29T00:00:00.000Z","$lte": "2020-12-31T00:00:00.000Z"
      }
    }
  },{
    "$addFields": {
      "timestamp": {
        "$dateFromParts": {
          "year": { "$year": "$timestamp" },"month": { "$month": "$timestamp" },"day": { "$dayOfMonth": "$timestamp" }
        }
      },"dateRange": {
        "$map": {
          "input": {
            "$range": [
              0,{
                "$trunc": {
                  "$divide": [
                    {
                      "$subtract": [
                        "2020-12-31T00:00:00.000Z","2020-12-29T00:00:00.000Z"
                      ]
                    },1000
                  ]
                }
              },86400
            ]
          },"in": {
            "$add": [
              "2020-12-29T00:00:00.000Z",{ "$multiply": ["$$this",1000] }
            ]
          }
        }
      }
    }
  },{ "$unwind": "$dateRange" },{
    "$group": {
      "_id": { "date": "$dateRange","recipeId": "$recipeId" },"count": {
        "$sum": { "$cond": [{ "$eq": ["$dateRange","$timestamp"] },1,0] }
      }
    }
  },{
    "$group": {
      "_id": "$_id.date","total": { "$sum": "$count" },"byRecipeId": {
        "$push": {
          "k": { "$toString": "$_id.recipeId" },"v": { "$sum": "$count" }
        }
      }
    }
  },{ "$sort": { "_id": 1 } },{
    "$project": {
      "_id": 0,"timestamp": "$_id","total": "$total","byRecipeId": {
        "$arrayToObject": {
          "$filter": { "input": "$byRecipeId","cond": "$$this.v" }
        }
      }
    }
  }
]

This results in:

[
    {
        "timestamp": "2020-12-29T00:00:00.000Z","total": 21,"byRecipeId": {}
    },{
        "timestamp": "2020-12-30T00:00:00.000Z","total": 0,"byRecipeId": {
            "15": 9,"102": 7
        }
    },{
        "timestamp": "2020-12-31T00:00:00.000Z","byRecipeId": {}
    }
]

I am of course open to alternative solutions. For example, I have seen this article: https://medium.com/@alexandro.ramr777/fill-missing-values-using-mongodb-aggregation-framework-f011114e83e0 but it does not cover multiple dimensions.

Solution

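You can use the $reduce operator. The code below fills in the minutes of the current day. It should be easy to adapt it to fill in missing days instead.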

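// Assumes each document already holds an array `data` of { timestamp, total }
// entries, and that moment.js is loaded (e.g. const moment = require('moment')).
{
   $addFields: {
      data: {
         $reduce: {
            // One iteration per minute of the day: 0 .. 1439
            input: { $range: [0, 24 * 60] },
            initialValue: [],
            in: {
               $let: {
                  vars: {
                     // Timestamp of the current minute slot
                     ts: {
                        $add: [
                           moment().startOf('day').toDate(),
                           { $multiply: ["$$this", 1000 * 60] }
                        ]
                     }
                  },
                  in: {
                     $concatArrays: [
                        "$$value",
                        [{
                           $cond: {
                              if: { $in: ["$$ts", "$data.timestamp"] },
                              // Keep the existing entry for this slot
                              // ($first as an array operator needs MongoDB 4.4+)
                              then: {
                                 $first: {
                                    $filter: {
                                       input: "$data",
                                       cond: { $eq: ["$$this.timestamp", "$$ts"] }
                                    }
                                 }
                              },
                              // Otherwise emit a zero-filled placeholder
                              else: { timestamp: "$$ts", total: 0 }
                           }
                        }]
                     ]
                  }
               }
            }
         }
      }
   }
}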

In my opinion, $reduce is more elegant than $map, but in my experience the performance of $reduce is much worse.
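
One way the idea might be adapted to days and to one series per recipeId is sketched below. It is an untested-against-a-server sketch under stated assumptions, not the answer's own code: the helper name `buildDailySumPipeline` is hypothetical, `timestamp` is assumed to be stored as a BSON Date (not a string), days are taken in UTC, and the gap filling uses $map with $indexOfArray rather than $reduce.

```javascript
// Hypothetical helper: returns an aggregation pipeline that sums `value`
// per recipeId per UTC day and fills days without data with 0.
// `start` and `end` are JavaScript Dates at 00:00:00 UTC (both inclusive).
function buildDailySumPipeline(start, end) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  const dayCount = Math.round((end.getTime() - start.getTime()) / DAY_MS) + 1;
  const endExclusive = new Date(start.getTime() + dayCount * DAY_MS);
  const toDay = (date) => ({ $dateToString: { format: "%Y-%m-%d", date } });

  return [
    // Keep only records inside the requested range.
    { $match: { timestamp: { $gte: start, $lt: endExclusive } } },
    // Sum the values per recipe and day.
    {
      $group: {
        _id: { recipeId: "$recipeId", day: toDay("$timestamp") },
        sum: { $sum: "$value" }
      }
    },
    // Collect the days that actually have data, one document per recipe.
    {
      $group: {
        _id: "$_id.recipeId",
        data: { $push: { timestamp: "$_id.day", sum: "$sum" } }
      }
    },
    { $sort: { _id: 1 } },
    // Walk over every day in the range and look up its sum, defaulting to 0.
    {
      $project: {
        _id: 0,
        series: {
          $map: {
            input: { $range: [0, dayCount] },
            as: "i",
            in: {
              $let: {
                vars: {
                  day: toDay({ $add: [start, { $multiply: ["$$i", DAY_MS] }] })
                },
                in: {
                  $let: {
                    vars: { idx: { $indexOfArray: ["$data.timestamp", "$$day"] } },
                    in: {
                      sum: {
                        $cond: [
                          { $eq: ["$$idx", -1] },
                          0,
                          { $arrayElemAt: ["$data.sum", "$$idx"] }
                        ]
                      },
                      timestamp: "$$day",
                      recipeId: "$_id"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  ];
}
```

The result of `buildDailySumPipeline(new Date("2020-12-29T00:00:00Z"), new Date("2020-12-31T00:00:00Z"))` can be passed to `aggregate()`; each output document then carries a `series` array with one entry per day. A recipeId with no records in the range produces no document, so such recipes would have to be added by the caller.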
