Writing a Pandas DataFrame to Azure Blob - Python SDK

I am trying to upload a DataFrame to a blob as a CSV.

Here is my code:

from azure.storage.blob import BlobClient
sas_url = "https://XXX.blob.core.windows.net/YYYY?sp=r&st=2021-04-26T16:21:37Z&se=2021-04-27T00:21:37Z&spr=" \
          "https&sv=2020-02-10&sr=c&sig=lJxx45wdBT%2F5ZJQwPxxxxxxxxx0%3D"
blob_client = BlobClient.from_blob_url(sas_url)
print(blob_client)
blob_client.upload_blob(data=df1.to_csv(index=False))

This is the error I get:

Traceback (most recent call last):
  File "C:\xxx\xxx\PycharmProjects\DIF\venv\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-40ff66c54682>", line 1, in <module>
    runfile('C:/xxx/xxx/PycharmProjects/DIF/venv/Scripts/SF_ADLS.py', wdir='C:/xxx/xxx/PycharmProjects/DIF/venv/Scripts')
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.4\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.4\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/xxx/xxx/PycharmProjects/DIF/venv/Scripts/SF_ADLS.py", line 99, in <module>
    blob_client = BlobClient.from_blob_url(sas_url)
  File "C:\Users\User\PycharmProjects\DIF\venv\lib\site-packages\azure\storage\blob\_blob_client.py", line 246, in from_blob_url
    container_name, blob_name = unquote(path_blob[-2]), unquote(path_blob[-1])
IndexError: list index out of range

Second approach: generating the SAS token in Python code:

from datetime import datetime, timedelta
from azure.storage.blob import BlobServiceClient, generate_account_sas, ResourceTypes, AccountSasPermissions
import pandas as pd

df1 = pd.read_csv(r'C:\ccc\ccc\AppData\Roaming\JetBrains\PyCharmCE2020.1\scratches\sf_Metadata.csv')
sas_token = generate_account_sas(
    account_name="acct",
    account_key="so1uwLUIrFluxxxxxx38MGpL5XKU/yFNIkiyyyyitQPrWQ==",
    resource_types=ResourceTypes(service=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, add=True, create=True, update=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)

blob_service_client = BlobServiceClient(account_url="https://acct.blob.core.windows.net", credential=sas_token)
print(sas_token)
blob_client = blob_service_client.get_blob_client('testfs1', 'one', snapshot=None)
blob_client.upload_blob(data=df1.to_csv(index=False))

This is the error I get:

azure.core.exceptions.HttpResponseError: This request is not authorized to perform this operation using this resource type.
RequestId:03e71e74-601e-0022-2f25-3be77a000000
Time:2021-04-27T05:24:51.5741680Z
ErrorCode:AuthorizationResourceTypeMismatch
Error:None

Can you tell me what I need to change in my code? Thanks.

Solution

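According to the format defined in the official documentation, your sas_url is wrong: it is missing the blob name. The URL passed to BlobClient.from_blob_url must have this form: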

https://<account-name>.blob.core.windows.net/<container-name>/<blob-name>?<sas-token>

You can refer to this example.
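The URL format above can be sketched in code. This is only an illustration under assumptions: the helper names add_blob_name and upload_dataframe_as_csv are hypothetical, the blob name sf_Metadata.csv and the sig value are placeholders, and the SAS is assumed to grant write or create permission (the sp=r in the question is read-only, so an upload would probably still be rejected with that token). The traceback shows from_blob_url indexing the last two path segments, which is why a URL that stops at the container fails.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def add_blob_name(container_sas_url: str, blob_name: str) -> str:
    """Return a blob-level URL by appending blob_name to the container path."""
    parts = urlsplit(container_sas_url)
    # Keep the container path, then append the URL-encoded blob name.
    path = parts.path.rstrip("/") + "/" + quote(blob_name)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def upload_dataframe_as_csv(df, blob_sas_url: str) -> None:
    """Upload df as CSV text; needs azure-storage-blob and a SAS with write access."""
    # Imported lazily so add_blob_name() can be used without the SDK installed.
    from azure.storage.blob import BlobClient

    blob_client = BlobClient.from_blob_url(blob_sas_url)
    # overwrite=True replaces an existing blob of the same name.
    blob_client.upload_blob(df.to_csv(index=False), overwrite=True)


# Placeholder container-level SAS URL (hypothetical permissions and signature).
container_url = "https://XXX.blob.core.windows.net/YYYY?sv=2020-02-10&sr=c&sp=rcw&sig=PLACEHOLDER"
blob_url = add_blob_name(container_url, "sf_Metadata.csv")
print(blob_url)
```

With a real SAS URL, calling upload_dataframe_as_csv(df1, blob_url) would then upload the CSV to the named blob.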

It is best to generate the SAS token here:

[screenshot]

If you generate the SAS token here instead, you may get an authentication failure error:

[screenshot]

====================== Update ======================

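Uploading a blob is an object-level operation, so the account SAS must include the object resource type. Please change

resource_types=ResourceTypes(service=True)

to

resource_types=ResourceTypes(object=True)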
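A minimal sketch of that change, with a small check for diagnosing AuthorizationResourceTypeMismatch. The function names signed_resource_types and make_object_level_sas are hypothetical, and the token strings are shortened placeholders. The check relies on the srt field of an account SAS, which lists the signed resource types as letters (s for service, c for container, o for object).

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs


def signed_resource_types(sas_token: str) -> set:
    """Return the resource-type letters found in the token's srt field."""
    values = parse_qs(sas_token.lstrip("?")).get("srt", [""])
    return set(values[0])


def make_object_level_sas(account_name: str, account_key: str) -> str:
    """Generate a one-hour account SAS that covers blob (object) operations."""
    # Imported lazily so signed_resource_types() works without the SDK installed.
    from azure.storage.blob import AccountSasPermissions, ResourceTypes, generate_account_sas

    return generate_account_sas(
        account_name=account_name,
        account_key=account_key,
        resource_types=ResourceTypes(object=True),  # the fix from the answer
        permission=AccountSasPermissions(read=True, write=True, create=True),
        expiry=datetime.now(timezone.utc) + timedelta(hours=1),
    )


# Hypothetical, shortened tokens for illustration only.
service_only = "se=2021-04-27&sp=rwc&sv=2020-02-10&ss=b&srt=s&sig=PLACEHOLDER"
object_level = "se=2021-04-27&sp=rwc&sv=2020-02-10&ss=b&srt=o&sig=PLACEHOLDER"
print("o" in signed_resource_types(service_only))  # token like the one in the question
print("o" in signed_resource_types(object_level))  # token after the change
```

If the printed sas_token does not contain o in srt, blob uploads with that token can be expected to fail in the way shown in the question.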
