How to optimize Python code that uses a BAPI
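I have a Python script that uses the BAPI RFC_READ_TABLE to log on to SAP, query the USR02 table, and return the result. The input comes from column A of an Excel sheet and the output is written to column B. The code runs fine, but for 1000 records it takes about 8 minutes. Could you help me optimize it? I am really new to Python; I managed to write this heavy code, but now I am stuck on the optimization part.
It would be great if it could run in 1-2 minutes at most.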
from pyrfc import Connection, ABAPApplicationError, ABAPRuntimeError, LogonError, CommunicationError
from configparser import ConfigParser
from pprint import PrettyPrinter
import openpyxl

ASHOST = '***'
CLIENT = '***'
SYSNR = '***'
USER = '***'
PASSWD = '***'
conn = Connection(ashost=ASHOST, sysnr=SYSNR, client=CLIENT, user=USER, passwd=PASSWD)

try:
    wb = openpyxl.load_workbook('new2.xlsx')
    ws = wb['Sheet1']
    for i in range(1, len(ws['A']) + 1):
        x = ws['A' + str(i)].value
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        pp = PrettyPrinter(indent=4)
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE',
                               QUERY_TABLE='USR02',
                               OPTIONS=options,
                               FIELDS=fields,
                               ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            a = data_result[line]["WA"].strip()
            wb = openpyxl.load_workbook('new2.xlsx')
            ws = wb['Sheet1']
            ws['B' + str(i)].value = a
            wb.save('new2.xlsx')
except CommunicationError:
    print("Could not connect to server.")
    raise
except LogonError:
    print("Could not log in. Wrong credentials?")
    raise
except (ABAPApplicationError, ABAPRuntimeError):
    print("An error occurred.")
    raise
Edit: So here is my updated code. For now, I decided to print the data on the command line only. The output shows where the time goes.
# Fragment: assumes `import time`, `start_time = time.time()` and the
# connection setup from the first listing; the except clauses are omitted.
try:
    output_list = []
    wb = openpyxl.load_workbook('new3.xlsx')
    ws = wb['Sheet1']
    col = ws['A']
    col_lis = [col[x].value for x in range(len(col))]
    length = len(col_lis)
    for i in range(length):
        print("--- %s seconds Start of the loop ---" % (time.time() - start_time))
        x = col_lis[i]
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE', QUERY_TABLE='USR02', OPTIONS=options, FIELDS=fields, ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        print("--- %s seconds in SAP ---" % (time.time() - start_time))
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            a = data_result[line]["WA"]
            output_list.append(a)
    print(output_list)
Solution
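First, I put timing marks at different points in the code and split it into functional parts (SAP processing, Excel processing).
After analyzing the timings, I found that the Excel-writing code consumes most of the runtime. Consider these intervals: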
16:52:37.306272
16:52:37.405006 moment it was fetched from SAP
16:52:37.552611 moment it was pushed to Excel
16:52:37.558631
16:52:37.634395 moment it was fetched from SAP
16:52:37.796002 moment it was pushed to Excel
16:52:37.806930
16:52:37.883724 moment it was fetched from SAP
16:52:38.060254 moment it was pushed to Excel
16:52:38.067235
16:52:38.148098 moment it was fetched from SAP
16:52:38.293669 moment it was pushed to Excel
16:52:38.304640
16:52:38.374453 moment it was fetched from SAP
16:52:38.535054 moment it was pushed to Excel
16:52:38.542004
16:52:38.618800 moment it was fetched from SAP
16:52:38.782363 moment it was pushed to Excel
16:52:38.792336
16:52:38.873119 moment it was fetched from SAP
16:52:39.034687 moment it was pushed to Excel
16:52:39.040712
16:52:39.114517 moment it was fetched from SAP
16:52:39.264716 moment it was pushed to Excel
16:52:39.275649
16:52:39.346005 moment it was fetched from SAP
16:52:39.523721 moment it was pushed to Excel
16:52:39.530741
16:52:39.610487 moment it was fetched from SAP
16:52:39.760086 moment it was pushed to Excel
16:52:39.771057
16:52:39.839873 moment it was fetched from SAP
16:52:40.024574 moment it was pushed to Excel
As you can see, the Excel-writing part takes roughly twice as long as the SAP query part.
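The kind of measurement described above can be sketched with a small context manager that accumulates wall-clock time per section. This is only an illustration: the names `timed`, `totals` and `report` are made up here, and the `time.sleep` calls are stand-ins for the real `conn.call(...)` and the openpyxl write.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named section of the script.
totals = defaultdict(float)


@contextmanager
def timed(section):
    """Add the time spent inside the with-block to totals[section]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[section] += time.perf_counter() - start


def report():
    """Print the sections sorted by total time, slowest first."""
    for section, seconds in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{section:10s} {seconds:8.3f} s")


# Stand-in workload (assumption): in the real script the with-blocks would
# wrap conn.call(...) and the worksheet write/save respectively.
for _ in range(3):
    with timed("sap"):
        time.sleep(0.01)
    with timed("excel"):
        time.sleep(0.02)

report()
```

Summing per section, rather than reading individual timestamps, makes it easier to compare the two parts over a full run.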
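The mistake in the code is that the workbook and worksheet are opened and initialized on every loop iteration. This slows execution down considerably and is redundant, because you can reuse the workbook variables from the top.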
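Another redundant thing is the .strip() call, which trims leading and trailing whitespace (not zeros) from each value; as far as I can see it is unnecessary, because Excel handles this automatically for string data.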
This variant of the code
# Fragment: assumes `from datetime import datetime` plus the imports and
# connection setup from the question.
try:
    wb = openpyxl.load_workbook('new2.xlsx')
    ws = wb['Sheet1']
    print(datetime.now().time())
    for i in range(1, len(ws['A']) + 1):
        x = ws['A' + str(i)].value
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE', QUERY_TABLE='USR02', OPTIONS=options, FIELDS=fields, ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            # Reuse the worksheet opened above instead of reloading the file.
            ws['B' + str(i)].value = data_result[line]["WA"]
        wb.save('new2.xlsx')
    print(datetime.now().time())
except ...
gave me the following timestamps for the program run:
>>> exec(open('RFC_READ_TABLE.py').read())
18:14:03.003174
18:16:29.014373
About 2.5 minutes to fetch 1000 user records, which seems a reasonable price for this kind of processing.
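I think the problem is in the while True loop. I think you need to optimize the query logic (or change it). It is hard to say more without knowing what you are looking for in the database; everything else looks simple and fast.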
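One possible way to change the query logic, which is not taken from the original posts, is to ask for several users per RFC_READ_TABLE call instead of making one call per user. The sketch below rests on several assumptions: the helper names (`chunked`, `build_options`, `fetch_users`) are hypothetical, each OPTIONS row is assumed to be limited to 72 characters, the chunk size of 50 is arbitrary, BNAME is added to FIELDS so rows can be matched back to their user, and the DELIMITER parameter is used on the assumption that no field value contains "|".

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_options(user_names):
    """Build OPTIONS rows that select all given BNAME values with OR.

    Each condition goes on its own row so that no TEXT value comes near
    the 72-character row limit assumed here for RFC_READ_TABLE.
    """
    rows = []
    for index, name in enumerate(user_names):
        quoted = name.replace("'", "''")  # double any single quote (assumed escaping)
        prefix = "" if index == 0 else "OR "
        rows.append({"TEXT": f"{prefix}BNAME = '{quoted}'"})
    return rows


def fetch_users(conn, user_names, chunk_size=50):
    """Return {BNAME: (CLASS, USTYP)} using one RFC call per chunk of names."""
    fields = [{"FIELDNAME": "BNAME"}, {"FIELDNAME": "CLASS"}, {"FIELDNAME": "USTYP"}]
    found = {}
    for chunk in chunked(user_names, chunk_size):
        result = conn.call(
            "RFC_READ_TABLE",
            QUERY_TABLE="USR02",
            DELIMITER="|",
            OPTIONS=build_options(chunk),
            FIELDS=fields,
        )
        for row in result["DATA"]:
            # WA holds the three requested fields separated by the delimiter.
            bname, user_class, ustyp = (part.strip() for part in row["WA"].split("|"))
            found[bname] = (user_class, ustyp)
    return found
```

With 1000 names and a chunk size of 50 this would make 20 round trips instead of 1000. Names missing from USR02 simply do not appear in the returned dictionary, so the caller has to handle that case when filling column B.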
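Something that might help is to avoid repeatedly opening and closing the file: compute the whole "B" column first, then open the file once and paste everything into the xlsx file. It may help (but I am pretty sure the query is the problem).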
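The suggestion above might look roughly like this: load the workbook once, fill column B in memory, and save once at the end. This is a sketch only. The function names are hypothetical, and `lookup` stands for whatever callable returns the SAP value for a user name (for example a wrapper around `conn.call`).

```python
def read_user_names(ws):
    """Return (row_number, value) pairs for every non-empty cell in column A."""
    return [(cell.row, str(cell.value)) for cell in ws["A"] if cell.value is not None]


def fill_column_b(ws, lookup):
    """Write lookup(name) into column B for each name found in column A.

    Only in-memory cells are touched here; nothing is saved to disk.
    """
    written = 0
    for row, name in read_user_names(ws):
        ws.cell(row=row, column=2).value = lookup(name)
        written += 1
    return written


def process_workbook(path, lookup, sheet_name="Sheet1"):
    """Load the workbook once, fill column B, and save once at the end."""
    import openpyxl  # imported here so the helpers above stay dependency-free

    wb = openpyxl.load_workbook(path)
    ws = wb[sheet_name]
    written = fill_column_b(ws, lookup)
    wb.save(path)  # single write instead of one save per row
    return written
```

A call such as `process_workbook('new2.xlsx', lookup)` would then touch the file exactly twice, once to read and once to write, regardless of the number of rows.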
P.S. Maybe you can use a timing library to work out where most of the time is spent.
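As one example of such a library, Python's standard `cProfile` and `pstats` modules can report cumulative time per function. In the sketch below, `query_sap` and `write_excel` are hypothetical stand-ins that only sleep; in the real script they would contain the RFC call and the workbook write.

```python
import cProfile
import io
import pstats
import time


def query_sap():
    """Stand-in for the RFC call; replace with the real conn.call(...)."""
    time.sleep(0.01)


def write_excel():
    """Stand-in for the workbook write and save."""
    time.sleep(0.02)


def main():
    for _ in range(3):
        query_sap()
        write_excel()


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Collect the ten most expensive entries by cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
profile_report = buffer.getvalue()
print(profile_report)
```

The same report is available without changing the script by running it as `python -m cProfile -s cumulative RFC_READ_TABLE.py`.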