如何解决用聚合填充 MongoDB 中二维时间序列数据的天空白
我有一个二维时间序列数据的集合如下:
[
{
"value" : 9,"timestamp" : "2020-12-30T02:06:33.000+0000","recipeId" : 15
},{
"value" : 2,"timestamp" : "2020-12-30T12:04:23.000+0000","recipeId" : 102
},{
"value" : 5,"timestamp" : "2020-12-30T15:09:23.000+0000",...
]
记录有一个 recipeId
,这是我正在寻找的第一级分组。应该总结一天食谱的所有 values
。我想要每个 recipeId
的时间序列数组。我需要用 0
填补缺失的日子。我希望为提供的开始和结束日期范围创建此构造。
对于 2020-12-29
到 2020-12-31
的日期范围,有些人喜欢这样:
[
[
{
"sum" : 0,"timestamp" : "2020-12-29",{
"sum" : 9,"timestamp" : "2020-12-30",{
"sum" : 0,"timestamp" : "2020-12-31",...
],[
{
"sum" : 0,"recipeId" : 0
},{
"sum" : 7,...
]
]
这是我目前所拥有的,它只能部分解决我的要求。我无法正确完成最后几个阶段:
[
{
"$match": {
"timestamp": {
"$gte": "2020-12-29T00:00:00.000Z","$lte": "2020-12-31T00:00:00.000Z"
}
}
},{
"$addFields": {
"timestamp": {
"$dateFromParts": {
"year": { "$year": "$timestamp" },"month": { "$month": "$timestamp" },"day": { "$dayOfMonth": "$timestamp" }
}
},"dateRange": {
"$map": {
"input": {
"$range": [
0,{
"$trunc": {
"$divide": [
{
"$subtract": [
"2020-12-31T00:00:00.000Z","2020-12-29T00:00:00.000Z"
]
},1000
]
}
},86400
]
},"in": {
"$add": [
"2020-12-29T00:00:00.000Z",{ "$multiply": ["$$this",1000] }
]
}
}
}
}
},{ "$unwind": "$dateRange" },{
"$group": {
"_id": { "date": "$dateRange","recipeId": "$recipeId" },"count": {
"$sum": { "$cond": [{ "$eq": ["$dateRange","$timestamp"] },1,0] }
}
}
},{
"$group": {
"_id": "$_id.date","total": { "$sum": "$count" },"byRecipeId": {
"$push": {
"k": { "$toString": "$_id.recipeId" },"v": { "$sum": "$count" }
}
}
}
},{ "$sort": { "_id": 1 } },{
"$project": {
"_id": 0,"timestamp": "$_id","total": "$total","byRecipeId": {
"$arrayToObject": {
"$filter": { "input": "$byRecipeId","cond": "$$this.v" }
}
}
}
}
]
导致:
[
{
"timestamp": "2020-12-29T00:00:00.000Z","total": 21,"byRecipeId": {}
},{
"timestamp": "2020-12-30T00:00:00.000Z","total": 0,"byRecipeId": {
"15": 9,"102": 7
}
},{
"timestamp": "2020-12-31T00:00:00.000Z","byRecipeId": {}
}
]
我当然愿意接受替代解决方案。例如,我看到了这篇文章:https://medium.com/@alexandro.ramr777/fill-missing-values-using-mongodb-aggregation-framework-f011114e83e0 但它不涉及多维。
解决方法
您可以使用 $redcue 函数。此代码填补了当天的分钟数。应该很容易调整它以提供缺失的日子。
{
$addFields: {
data: {
$reduce: {
input: { $range: [0,24 * 60] },initialValue: [],in: {
$let: {
vars: {
ts: {
$add: [
moment().startOf('day').toDate(),{ $multiply: ["$$this",1000 * 60] }
]
}
},in: {
$concatArrays: [
"$$value",[{
$cond: {
if: { $in: ["$$ts","$data.timestamp"] },then: {
$first: {
$filter: {
input: "$data",cond: { $eq: ["$$this.timestamp","$$ts"] }
}
}
},else: { timestamp: "$$ts",total: 0 }
}
}]
]
}
}
}
}
}
}
}
在我看来,$reduce
比 $map
更优雅,但根据我的经验,$reduce
的性能要差得多。
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。