How can I use the CLI to get the full results of an AWS Athena query as a CSV file?

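I need to download the full contents of a table in the AWS Glue catalog using AWS Athena. At the moment I run select * from my_table from the dashboard and save the results locally as CSV, also from the dashboard. Is there a way to get the same result using the AWS CLI?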

From the documentation I can see https://docs.aws.amazon.com/cli/latest/reference/athena/get-query-results.html, but it is not quite what I need.

Solution

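You cannot save the results directly from the AWS CLI, but you can Specify a Query Result Location, and Amazon Athena will automatically save a copy of the query results to the Amazon S3 location you specify.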

You can then use the AWS CLI to download that result file.

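You can run an Athena query from the AWS CLI with the aws athena start-query-execution API call. You then need to poll with aws athena get-query-execution until the query has finished. Once it has, the result of that call also contains the location of the query result on S3, which you can download with aws s3 cp.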

Here is an example script:

#!/usr/bin/env bash

region=us-east-1 # change this to the region you are using
query='SELECT NOW()' # change this to your query
output_location='s3://example/location' # change this to a writable location

query_execution_id=$(aws athena start-query-execution \
  --region "$region" \
  --query-string "$query" \
  --result-configuration "OutputLocation=$output_location" \
  --query QueryExecutionId \
  --output text)

while true; do
  status=$(aws athena get-query-execution \
    --region "$region" \
    --query-execution-id "$query_execution_id" \
    --query QueryExecution.Status.State \
    --output text)
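  # A query starts out as QUEUED before it is RUNNING; keep polling in both states
  if [[ $status != 'QUEUED' && $status != 'RUNNING' ]]; then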
    break
  else
    sleep 5
  fi
done

if [[ $status = 'SUCCEEDED' ]]; then
  result_location=$(aws athena get-query-execution \
    --region "$region" \
    --query-execution-id "$query_execution_id" \
    --query QueryExecution.ResultConfiguration.OutputLocation \
    --output text)
  exec aws s3 cp "$result_location" -
else
  reason=$(aws athena get-query-execution \
    --region "$region" \
    --query-execution-id "$query_execution_id" \
    --query QueryExecution.Status.StateChangeReason \
    --output text)
  echo "Query $query_execution_id failed: $reason" 1>&2
  exit 1
fi

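If your primary workgroup has an output location, or you want to use a different workgroup that has a defined output location, you can modify the start-query-execution call accordingly. Otherwise you probably have an S3 bucket named aws-athena-query-results-NNNNNNN-XX-XXXX-N that Athena created at some point and that is used for output when you use the UI.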
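
As a rough sketch of that modification: assuming a workgroup that already has a query result location configured, the start call can pass --work-group and leave out --result-configuration. The function name start_query_in_workgroup and the workgroup name my-workgroup are hypothetical and only serve as illustration.

```shell
#!/usr/bin/env bash

# Start a query in the given workgroup and print its query execution ID.
# Assumption: the workgroup has an output location configured, so no
# --result-configuration is passed here.
start_query_in_workgroup() {
  local region=$1
  local workgroup=$2
  local query=$3

  aws athena start-query-execution \
    --region "$region" \
    --work-group "$workgroup" \
    --query-string "$query" \
    --query QueryExecutionId \
    --output text
}

# Example call (my-workgroup is a placeholder):
# query_execution_id=$(start_query_in_workgroup us-east-1 my-workgroup 'SELECT NOW()')
```

The rest of the script can stay as it is, since it reads the result location back from get-query-execution rather than from the output_location variable.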
