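How to resolve the warning "urllib3.connectionpool WARNING - Connection pool is full, discarding connection" during parallel Azure Blob uploads
I need to upload a large number of files (more than 100,000) to Azure Blob Storage, so I wrote a program that uploads them with multiple threads, like this: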
from azure.storage.blob import BlobServiceClient, BlobClient
from itertools import repeat
from concurrent.futures import ThreadPoolExecutor
import os

def upload_single_blob(blob_service_client, blob_path):
    # Create a blob client using the local file name as the name for the blob
    blob_client = blob_service_client.get_blob_client(container='MyContainer', blob=blob_path)
    # Upload the file
    with open(blob_path, "rb") as data:
        blob_client.upload_blob(data)

# make blob service client from connect str
blob_service_client = BlobServiceClient.from_connection_string(connect_str)

# make file path list to upload
blob_path_list = os.listdir("./blob_files/")
blob_path_list = map(lambda x: "./blob_files/" + x, blob_path_list)
blob_path_list = list(blob_path_list)

# multi threading upload to blob
with ThreadPoolExecutor(max_workers=100) as executor:
    executor.map(upload_single_blob, repeat(blob_service_client), blob_path_list)
However, when I run this program on an Azure VM (Ubuntu 18.04), I get many warnings like this:
urllib3.connectionpool WARNING --Connection pool is full,discarding connection: myblobaccount.blob.core.windows.net
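I have not measured it precisely, but even with 100 threads uploading at once, only about 10 connections seem to be in use at the same time.
How can I increase the number of connections?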
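One possible direction, offered as a sketch rather than a confirmed fix: the number 10 matches the default pool size of the `requests` `HTTPAdapter`, and when that pool is full urllib3 still opens extra connections but discards them after use instead of keeping them for reuse, which is what the warning reports. Assuming the SDK is using its default `requests`-based transport, a session with a larger pool can be handed to the client through `RequestsTransport`. The helper names `make_pooled_session` and `make_blob_service_client` and the pool size of 100 are illustrative choices, not part of the original program.

```python
import requests
from requests.adapters import HTTPAdapter


def make_pooled_session(pool_size=100):
    """Return a requests.Session whose adapters keep up to pool_size connections per host."""
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    # Route both schemes through the enlarged pool.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def make_blob_service_client(connect_str, pool_size=100):
    """Build a BlobServiceClient that sends requests through the pooled session.

    Assumption: the installed azure-core / azure-storage-blob versions accept
    a custom transport via the `transport` keyword argument.
    """
    # Imported here so the session helper above works without the Azure SDK installed.
    from azure.core.pipeline.transport import RequestsTransport
    from azure.storage.blob import BlobServiceClient

    # session_owner=False: the transport will not close the session for us.
    transport = RequestsTransport(
        session=make_pooled_session(pool_size), session_owner=False
    )
    return BlobServiceClient.from_connection_string(connect_str, transport=transport)
```

In the program above, `blob_service_client = make_blob_service_client(connect_str, pool_size=100)` would replace the direct `from_connection_string` call. Keeping `pool_size` at least as large as `max_workers` should let every thread return its connection to the pool for reuse.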