MongoDB spatio-temporal queries are slower with a 2dsphere index

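Recently I started investigating the performance of MongoDB with AIS data. I am using a collection of 19 million documents whose fields have the appropriate types as described in the dataset definition. In the same collection I also created a new field, geoloc, of type Point, built from the coordinates (lon, lat).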
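As a rough illustration of the setup described above, the following mongo shell sketch derives a GeoJSON Point field and indexes it. The source field names `lon` and `lat` are assumptions (the post only says the point comes from "coordinates (lon, lat)"), `addGeolocAndIndex` is a hypothetical helper, and the index name `geoloc-field` is taken from the explain output further down. It is not necessarily how the author built the field.

```javascript
// Sketch for the mongo shell; `coll` is a collection handle such as db.nari_dynamic.
// ASSUMPTION: the raw coordinates live in fields named `lon` and `lat`.
function addGeolocAndIndex(coll) {
  // Pipeline-style update (MongoDB 4.2+). GeoJSON order is [longitude, latitude].
  const update = [
    { $set: { geoloc: { type: "Point", coordinates: ["$lon", "$lat"] } } },
  ];
  coll.updateMany({}, update);

  // 2dsphere index on the new field; the name matches the explain output.
  coll.createIndex({ geoloc: "2dsphere" }, { name: "geoloc-field" });
  return update;
}
```

In the shell this would be called as `addGeolocAndIndex(db.nari_dynamic)`.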

The query under investigation is:

db.nari_dynamic.explain('executionStats').aggregate(
  [
    {
      "$match": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "polygon",
              "coordinates": [ [ [ -5.00, 45.00 ], [ +0.00, 50.00 ], [ -5.00, 45.00 ] ] ]
            }
          }
        }
      }
    },
    { "$group": { "_id": "$sourcemmsi", "PointCount": { "$sum": 1 }, "MinDatePoint": { "$min": { "date": "$t3" } }, "MaxDatePoint": { "$max": { "date": "$t3" } } } },
    { "$sort": { "_id": 1 } },
    { "$limit": 100 },
    { "$project": { "_id": 1, "PointCount": 1, "MinDatePoint": 1, "MaxDatePoint": 1 } }
  ],
  { explain: true }
)

During investigation and testing I found the following:

  1. Without any index: 94 s
  2. With the 2dsphere index on geoloc: 280 s
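One way to reproduce this kind of comparison on a single collection, without dropping and recreating the index, is to force each plan with a hint. The sketch below is for the mongo shell and assumes that `aggregate` accepts a `hint` option (available since MongoDB 3.6) and that `{ $natural: 1 }` forces a collection scan. `comparePlans` is a hypothetical helper, and `geoloc-field` is the index name shown in the explain output.

```javascript
// Sketch: run the same pipeline under both plans and return the raw explain results.
// `coll` is a collection handle such as db.nari_dynamic.
function comparePlans(coll, pipeline) {
  const run = (hint) =>
    coll.explain("executionStats").aggregate(pipeline, { hint: hint });

  return {
    collscan: run({ $natural: 1 }),  // forces a collection scan
    indexed: run("geoloc-field"),    // forces the 2dsphere index by name
  };
}
```

The layout of the returned explain documents varies between server versions, so the sketch returns them as-is rather than extracting a timing field.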

Here are the execution stats without an index:

{ stages: 
   [ { '$cursor': 
        { queryPlanner: 
           { plannerVersion: 1,namespace: 'mscdata.nari_dynamic',indexFilterSet: false,parsedQuery: 
              { geoloc: 
                 { '$geoWithin': 
                    { '$geometry': 
                       { type: 'polygon',coordinates: [ [ [ -5,45 ],[ 0,50 ],[ -5,45 ] ] ] } } } },queryHash: '6E2EAB94',planCacheKey: '6E2EAB94',winningPlan: 
              { stage: 'PROJECTION_SIMPLE',transformBy: { sourcemmsi: 1,t3: 1,_id: 0 },inputStage: 
                 { stage: 'COLLSCAN',filter: 
                    { geoloc: 
                       { '$geoWithin': 
                          { '$geometry': 
                             { type: 'polygon',direction: 'forward' } },rejectedplans: [] } } },{ '$group': 
        { _id: '$sourcemmsi',PointCount: { '$sum': { '$const': 1 } },MinDatePoint: { '$min': { date: '$t3' } },MaxDatePoint: { '$max': { date: '$t3' } } } },{ '$sort': { sortKey: { _id: 1 },limit: 100 } },{ '$project': 
        { _id: true,PointCount: true,MaxDatePoint: true,MinDatePoint: true } } ],serverInfo: 
   { host: 'ubuntu16',port: 27017,version: '4.4.1',gitVersion: 'ad91a93a5a31e175f5cbf8c69561e788bbc55ce1' },ok: 1 }

Here are the execution stats with the index:

{ stages: 
   [ { '$cursor': 
        { queryPlanner: 
           { plannerVersion: 1,planCacheKey: 'F35B194B',inputStage: 
                 { stage: 'FETCH',inputStage: 
                    { stage: 'IXSCAN',keyPattern: { geoloc: '2dsphere' },indexName: 'geoloc-field',ismultikey: false,multikeyPaths: { geoloc: [] },isUnique: false,issparse: false,isPartial: false,indexVersion: 2,direction: 'forward',indexBounds: 
                       { geoloc: 
                          [ '[936748722493063168,936748722493063168]','[954763121002545152,954763121002545152]','[959266720629915648,959266720629915648]','[960392620536758272,960392620536758272]','[960674095513468928,960674095513468928]','[960744464257646592,960744464257646592]','[960762056443691008,960762056443691008]','[960766454490202112,960766454490202112]','[960767554001829888,960767554001829888]','[960767828879736832,960767828879736832]','[960767897599213568,960767897599213568]','[960767914779082752,960767914779082752]','[960767919074050048,960767919074050048]','[960767920147791872,960767920147791872]','[960767920416227328,960767920416227328]','[960767920483336192,960767920483336192]','[960767920500113408,960767920500113408]','[960767920504307712,960767920504307712]','[960767920505356288,960767920505356288]','[960767920505618432,960767920505618432]','[960767920505683968,960767920505683968]','[960767920505683969,960767920505716735]','[1345075088707977217,1345075088708009983]','[1345075088708009984,1345075088708009984]','[1345075088708075520,1345075088708075520]','[1345075088708337664,1345075088708337664]','[1345075088709386240,1345075088709386240]','[1345075088713580544,1345075088713580544]','[1345075088730357760,1345075088730357760]','[1345075088797466624,1345075088797466624]','[1345075089065902080,1345075089065902080]','[1345075090139643904,1345075090139643904]','[1345075094434611200,1345075094434611200]','[1345075111614480384,1345075111614480384]','[1345075180333957120,1345075180333957120]','[1345075455211864064,1345075455211864064]','[1345076554723491840,1345076554723491840]','[1345080952770002944,1345080952770002944]','[1345098544956047360,1345098544956047360]','[1345168913700225024,1345168913700225024]','[1345450388676935680,1345450388676935680]','[1346576288583778304,1346576288583778304]','[1351079888211148800,1351079888211148800]','[1369094286720630784,1369094286720630784]','[5116089176692883456,5116089176692883456]','[5170132372221329408,5170132372221329408]','[5179139571476070401,5179702521429491711]','[5179702521429491713,5180265471382913023]','[5180265471382913024,5180265471382913024]','[5183643171103440896,5183643171103440896]','[5187020870823968768,5187020870823968768]','[5187020870823968769,5187583820777390079]','[5187583820777390081,5188146770730811391]','[5188146770730811393,5197153969985552383]','[5206161169240293376,5206161169240293376]','[5218264593238851584,5218264593238851584]','[5218264593238851585,5218405330727206911]','[5218546068215562240,5218546068215562240]','[5218546068215562241,5219109018168983551]','[5219671968122404864,5219671968122404864]','[5220234918075826177,5220797868029247487]','[5220797868029247488,5220797868029247488]','[5220938605517602817,5221079343005958143]','[5221079343005958144,5221079343005958144]','[5260204364768739328,5260204364768739328]' ] } } } },MinDatePoint: true,PointCount: true } } ],ok: 1 }

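Of course, I understand that this case is more complex because the query includes a grouping stage. Still, the expectation is that a query normally runs faster with an index, not slower, unless the index makes the engine sort internally, which would differ from how geoNear behaves.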

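In addition, MongoDB has published a full analysis of how query and index improvements affect geospatial queries, but it says little about geoWithin. MongoDB states that geoWithin does not sort its results, so I have not found the cause of the delay. https://www.mongodb.com/blog/post/geospatial-performance-improvements-in-mongodb-3-2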

Any ideas or comments on why the query is slower with the index?

Solution

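After a lot of investigation, it appears that once a query requests more than about 70% of the dataset (95% in this case), running it with an index is slower than running it without one.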
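A possible workaround that follows from this observation is to bypass the index whenever a query is expected to match most of the collection. The sketch below is only an illustration: `chooseAggregateOptions` is a hypothetical helper, the 0.7 default mirrors the rough figure reported here and is not a MongoDB constant, and the two counts are assumed to be known or estimated by the caller.

```javascript
// Sketch: choose aggregate() options based on the expected fraction of matching documents.
// ASSUMPTION: `matchingCount` and `totalCount` are supplied by the caller.
function chooseAggregateOptions(matchingCount, totalCount, threshold = 0.7) {
  if (totalCount <= 0) return {};
  const fraction = matchingCount / totalCount;
  // Above the threshold, hint a collection scan; otherwise leave the planner alone.
  return fraction > threshold ? { hint: { $natural: 1 } } : {};
}
```

In the shell the result would be passed as the second argument, for example `db.nari_dynamic.aggregate(pipeline, chooseAggregateOptions(matching, total))`.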

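The same behaviour shows up with indexes other than geospatial ones, for example simple indexes on numeric or descriptive fields (ship_name, ship_number, or the timestamp).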

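This happens because the database engine has to read the index keys and then fetch the documents those keys point to, which makes execution take longer.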
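The effect can be illustrated with a deliberately simple toy model. This is not MongoDB's actual cost model: it just counts one unit of work per document read in a collection scan, and one unit per index key plus `fetchCost` units per fetched document on the indexed path. `estimateWork` and `fetchCost` are hypothetical names, and the default of 1 assumes a fetch costs no more than a sequential read, which is probably optimistic.

```javascript
// Toy model: units of work for a collection scan versus index scan plus fetch.
function estimateWork(totalDocs, fraction, fetchCost = 1) {
  const matched = Math.round(totalDocs * fraction);
  return {
    collscan: totalDocs,                    // every document is read once
    indexed: matched + matched * fetchCost, // one key read plus one fetch per match
  };
}

// At 95% selectivity on 19 million documents the indexed path does more work.
const heavy = estimateWork(19000000, 0.95);
console.log(heavy.indexed > heavy.collscan); // true

// At 1% selectivity the index clearly wins.
const light = estimateWork(19000000, 0.01);
console.log(light.indexed < light.collscan); // true
```

Under these assumptions the two paths break even when half of the collection matches. The real crossover depends on storage, caching, and document size.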

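On the other hand, this should not happen: the MongoDB query planner should be able to detect the situation and not select the index, which would keep the number of key accesses low.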

The issue has been opened with MongoDB support and can be found here:

https://jira.mongodb.org/browse/SERVER-53709
