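How to find an induced subgraph of a GraphFrames graph in PySpark
Is there a way to find the induced subgraph around a given center node in a GraphFrame graph in PySpark? I tried to build the induced subgraph from motifs, but without success.
I also tried NetworkX's ego graph, which works as expected, but for a large graph (12 million edges) loading the whole graph takes a very long time.
Here is an example with center node 'a':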
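from graphframes import GraphFrame
import pyspark.sql.functions as f

# `sqlc` is an existing SQLContext (or SparkSession).

# Vertex DataFrame
v = sqlc.createDataFrame([
    ("a", "Alice", 34), ("b", "Bob", 36), ("c", "Charlie", 30), ("d", "David", 29),
    ("e", "Esther", 32),
    ("f", "Fanny", ...),  # age is missing in the source
    ("g", "Gabby", 60),
], ["id", "name", "age"])

# Edge DataFrame
e = sqlc.createDataFrame([
    ("a", "b", "friend"),
    # ... the edges between these two are garbled in the source;
    # only the fragments "c", "f", "d", "a" survive ...
    ("a", "e", "friend"),
], ["src", "dst", "relationship"])

# Create a GraphFrame
g = GraphFrame(v, e)


def create_motif(length: int) -> str:
    """Create a motif string.

    Args:
        length (int): number of edges in the path.
    """
    motif_path = "(start)-[edge0]->"
    for i in range(1, length):
        # Close the previous edge at an intermediate node, then open the next edge.
        motif_path += "(n%s);(n%s)-[edge%s]->" % (i - 1, i - 1, i)
    motif_path += "(end)"
    return motif_path


def get_community(G, depth):
    motif_path = create_motif(depth)
    # The source is cut off after the line continuation. The output only
    # contains rows that start at 'a', so a filter on the center node is assumed.
    current_motif = G.find(motif_path)\
        .filter("start.id = 'a'")
    current_motif.select(f.col("start.*"), "*").show()


get_community(g, 1)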
This returns:
+---+-----+---+--------------+--------------+---------------+
| id| name|age|         start|         edge0|            end|
+---+-----+---+--------------+--------------+---------------+
|  a|Alice| 34|  [a,Alice,34]|  [a,e,friend]|  [e,Esther,32]|
|  a|Alice| 34|  [a,Alice,34]|  [a,b,friend]|     [b,Bob,36]|
+---+-----+---+--------------+--------------+---------------+
It should return:
+---+-----+---+--------------+--------------+---------------+
| id| name|age|         start|         edge0|            end|
+---+-----+---+--------------+--------------+---------------+
|  a|Alice| 34|  [a,Alice,34]|  [a,b,friend]|     [b,Bob,36]|
|  a|Alice| 34|  [a,Alice,34]|  [a,d,friend]|   [d,David,29]|
|  b|  Bob| 36|    [b,Bob,36]|       [b,d,…]|   [d,David,29]|
+---+-----+---+--------------+--------------+---------------+
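The expected output contains a row that starts at Bob rather than at the center node, which a motif anchored at `start` cannot produce: an induced subgraph keeps every edge whose two endpoints both lie in the selected node set, not only the edges on paths leaving the center. Below is a minimal plain-Python sketch of that definition, not a GraphFrames API. The function name `induced_ego_edges`, the `radius` parameter, and the `toy_edges` list are hypothetical and chosen for illustration; the toy edges are not the data from the question, and following edges in the src -> dst direction is an assumption.

```python
from collections import deque


def induced_ego_edges(edges, center, radius=1):
    """Edges of the subgraph induced by nodes within `radius` hops of `center`.

    `edges` is an iterable of (src, dst) pairs. Assumption of this sketch:
    edges are followed in the src -> dst direction when expanding.
    """
    edges = list(edges)

    # Adjacency list of outgoing edges.
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)

    # Breadth-first search, recording the hop distance of each reached node.
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand past the radius
        for nxt in out.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    # Induced step: keep every edge whose endpoints were both reached.
    return [(s, d) for s, d in edges if s in dist and d in dist]


# Hypothetical toy graph, chosen so that one kept edge does not start at "a".
toy_edges = [("a", "b"), ("a", "d"), ("b", "d"), ("d", "c"), ("e", "a")]
print(induced_ego_edges(toy_edges, "a", radius=1))
# [('a', 'b'), ('a', 'd'), ('b', 'd')]
```

The edge ("b", "d") survives because both endpoints are neighbors of "a", even though no path of length 1 from "a" uses it. The same two steps (collect the nodes within the radius, then keep the edges with both endpoints in that set) could in principle be expressed as filters and joins on the edge DataFrame instead of Python lists.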