
AWS Glue and Python


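I am trying to extract data from an AWS Neptune database with Python, which requires installing the gremlinpython library. The script below works when I set the job up as a Python shell job. Is there any way to use the same functionality in a PySpark job?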

import boto3
import os
import sys
import site
import json
from setuptools.command import easy_install
from importlib import reload

s3 = boto3.client('s3')
dir_path = os.path.dirname(os.path.realpath(__file__))
#os.path.dirname(sys.modules['__main__'].__file__)

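# Install gremlinpython at runtime into the directory named by the GLUE_INSTALLATION environment variable
install_path = os.environ['GLUE_INSTALLATION']
easy_install.main(["--install-dir", install_path, "gremlinpython"])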

reload(site)

from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

graph = Graph()

remoteConn = DriverRemoteConnection('wss://neptune-test-data-read-1.uiokd9wka3.us-central-1.neptune.amazonaws.com:8182/gremlin','g')
g = graph.traversal().withRemote(remoteConn)

edges = g.E().toList()

ed = []
for e in edges:
    ed.append(str(e))
    
vertices = g.V().valueMap(True).toList()
ve = []
for v in vertices:
    vee = []
    for vt in v:
        if isinstance(v[vt],list):
            vee.append(v[vt][0])
        else:
            vee.append(v[vt])
    ve.append(vee)
        

remoteConn.close()

# Write both result sets to S3 as JSON
s3 = boto3.resource('s3')
s3.Object('s3bucket', 'target/edges.txt').put(Body=json.dumps(ed, indent=2).encode('UTF-8'))
s3.Object('s3bucket', 'target/vertices.txt').put(Body=json.dumps(ve, indent=2).encode('UTF-8'))
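
For reference, the vertex-flattening loop above does not depend on Neptune or on the job type, so it can be isolated and checked on its own. The following is a minimal sketch: the function name `flatten_value_map` and the sample dictionary are hypothetical, and the sample uses plain string keys as a stand-in for the keys a real `valueMap(True)` result would carry.

```python
def flatten_value_map(value_map):
    """Flatten one valueMap(True) result into a plain list of values.

    Property values that arrive as lists are reduced to their first element,
    which mirrors the loop in the script above. Other values are kept as-is.
    """
    row = []
    for key in value_map:
        value = value_map[key]
        if isinstance(value, list):
            row.append(value[0])
        else:
            row.append(value)
    return row


# Hypothetical sample shaped like a single vertex result
sample = {"id": "v1", "label": "person", "name": ["alice"], "age": [30]}
flat = flatten_value_map(sample)
```

Like the original loop, this keeps only the first value of a multi-valued property and would raise an IndexError on an empty list.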

Thanks
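
One possible direction, offered as a sketch rather than a confirmed solution: Glue Spark jobs on Glue version 2.0 and later accept an `--additional-python-modules` job parameter that installs packages with pip when the job starts, which might replace the runtime `easy_install` step. The snippet below only builds the keyword arguments that could be passed to boto3's `glue.create_job`; it does not call AWS. The job name, role ARN, and script location are hypothetical placeholders, and it assumes the job can already reach the Neptune endpoint over the network.

```python
def build_spark_job_definition(name, role_arn, script_location, modules):
    """Build keyword arguments for boto3's glue.create_job (not called here)."""
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "2.0",
        "Command": {
            "Name": "glueetl",  # Spark job type, as opposed to "pythonshell"
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Comma-separated list of modules to install at job start
            "--additional-python-modules": ",".join(modules),
        },
    }


# All values below are hypothetical placeholders
job_definition = build_spark_job_definition(
    name="neptune-extract-spark",
    role_arn="arn:aws:iam::123456789012:role/GlueJobRole",
    script_location="s3://s3bucket/scripts/neptune_extract.py",
    modules=["gremlinpython"],
)
```

With real values, the dictionary could be passed as `boto3.client('glue').create_job(**job_definition)`. The same parameter can also be set by hand in the job's configuration in the console.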
