
How can I optimize a Neo4j Cypher query on a very large graph?


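I wrote this query to find the possible paths between two nodes. However, when I try to use more than 3 hops, it never finishes. The graph I am using contains more than 4 million nodes and 49 million relationships.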

match (src:T047 {CUI:"C0030920"}),(trg:T059 {CUI:"C1294944"}),p = (src)-[*..3]-(trg)
where 
      all(relI in relationships(p) 
      where type(relI) in ["RO","CHD","PAR","RB","RL","RO","SIB","RU","SY"])
and
      all(nodeI in nodes(p)
      where labels(nodeI) in ["T004","T005","T007","T016","T017","T018","T019","T020","T021","T022","T023","T024","T025","T026","T028","T029","T030","T031","T032","T033","T034","T037","T038","T039","T040","T041","T042","T043","T045","T046","T047","T048","T049","T053","T054","T055","T056","T057","T059","T060","T061","T074","T080","T081","T098","T099","T100","T101","T103","T109","T114","T116","T121","T123","T125","T126","T127","T129","T131","T168","T184","T190","T191","T195","T196","T197","T200","T201"])
return p

Here is the plan for this query: https://imgur.com/PpWePOz

Is there any way to optimize this query, or at least to estimate how long it will take?

Solution

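First, your query plan shows that you are not using indexes: it uses a NodeByLabelScan for the :T059 nodes and then runs a filter over all of them to find the ones with the matching property. The src node is not found by an index lookup either; instead, the results of the variable-length expansion are filtered on label and property.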

You will need indexes to improve performance here, specifically on :T047(CUI) and :T059(CUI). Make sure these are in place first.
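As a rough sketch, the two indexes could be created as follows. The index names t047_cui and t059_cui are made up for illustration, and the syntax shown assumes Neo4j 4.x or later; on 3.x the older form in the comment applies instead.

```cypher
// Assumes Neo4j 4.x+; the index names are hypothetical.
CREATE INDEX t047_cui FOR (n:T047) ON (n.CUI);
CREATE INDEX t059_cui FOR (n:T059) ON (n.CUI);

// Legacy syntax for Neo4j 3.x (unnamed indexes):
// CREATE INDEX ON :T047(CUI);
// CREATE INDEX ON :T059(CUI);
```

Indexes are populated in the background, so on a graph of this size it may take a while before they come online and the planner can use them.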

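In addition, to force an index lookup (as opposed to a variable-length expand followed by a filter, which is more expensive), you can give the planner index hints.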
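One way to check that the hints take effect, without running the expensive path expansion, is to prefix a cut-down version of the query with EXPLAIN. This is only a sketch: it looks up the two endpoint nodes and nothing else, and it assumes both indexes already exist (a hint that refers to a missing index makes the query fail).

```cypher
// EXPLAIN plans the query without executing it.
// With the indexes in place, the plan should show NodeIndexSeek
// operators for src and trg instead of NodeByLabelScan.
EXPLAIN
MATCH (src:T047 {CUI:"C0030920"}), (trg:T059 {CUI:"C1294944"})
USING INDEX src:T047(CUI)
USING INDEX trg:T059(CUI)
RETURN src, trg
```

EXPLAIN reports estimated row counts per operator rather than a running time, so it gives a feel for how the work grows with path length but not a time estimate. PROFILE reports actual rows and database hits, but it has to run the query to completion.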

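We can also adjust the list predicate on the node labels in the path so that it is evaluated during expansion rather than afterwards. Note that labels() returns a list, so the predicate below compares the node's first label, labels(node)[0], against the allowed list.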

WITH ["T004","T005","T007","T016","T017","T018","T019","T020","T021","T022","T023","T024","T025","T026","T028","T029","T030","T031","T032","T033","T034","T037","T038","T039","T040","T041","T042","T043","T045","T046","T047","T048","T049","T053","T054","T055","T056","T057","T059","T060","T061","T074","T080","T081","T098","T099","T100","T101","T103","T109","T114","T116","T121","T123","T125","T126","T127","T129","T131","T168","T184","T190","T191","T195","T196","T197","T200","T201"] as allowedLabels
MATCH (src:T047 {CUI:"C0030920"}),(trg:T059 {CUI:"C1294944"})
USING INDEX src:T047(CUI)
USING INDEX trg:T059(CUI)
MATCH p = (src)-[*..3]-(trg)
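WHERE
      // every relationship on the path must have one of the allowed types
      all(relI IN relationships(p) WHERE type(relI) IN ["RO","CHD","PAR","RB","RL","SIB","RU","SY"])
      // every node's first (assumed only) label must be in the allowed list
    AND all(node IN nodes(p) WHERE labels(node)[0] IN allowedLabels)
RETURN p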

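This also assumes that every node here has exactly one label rather than several. If nodes can have multiple labels, the query may need to be restructured.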
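If nodes can carry several labels, one possible restructuring is sketched below. It makes two assumptions that are not in the original answer: a node qualifies when at least one of its labels is allowed, and the label list is passed in as a query parameter, here hypothetically named $allowedLabels. The sketch also moves the relationship types into the pattern itself, which lets the expansion skip other relationship types instead of filtering complete paths.

```cypher
// $allowedLabels is a hypothetical parameter holding the list of label names.
MATCH (src:T047 {CUI:"C0030920"}), (trg:T059 {CUI:"C1294944"})
USING INDEX src:T047(CUI)
USING INDEX trg:T059(CUI)
// Relationship types are restricted in the pattern, so only these are traversed.
MATCH p = (src)-[:RO|CHD|PAR|RB|RL|SIB|RU|SY*..3]-(trg)
// A node passes if any one of its labels is in the allowed list.
WHERE all(node IN nodes(p) WHERE any(l IN labels(node) WHERE l IN $allowedLabels))
RETURN p
```

If a node should qualify only when all of its labels are allowed, replace any(...) with all(...) in the inner predicate.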
