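The following code example shows one way to keep Athena query metadata out of your S3 bucket: run the query, read the result, and then delete the metadata object that Athena writes next to it.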
import time

import boto3

# Create the S3 resource and the Athena client
s3 = boto3.resource('s3')
athena = boto3.client('athena')

# Set query parameters
database = 'mydatabase'
query = 'SELECT * FROM mytable'

# Execute query
response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={
        'Database': database
    },
    ResultConfiguration={
        'OutputLocation': 's3://mybucket/myquery/'
    }
)

# Poll until the query leaves the QUEUED/RUNNING states
execution_id = response['QueryExecutionId']
execution = athena.get_query_execution(QueryExecutionId=execution_id)['QueryExecution']
status = execution['Status']['State']
while status in ['QUEUED', 'RUNNING']:
    print(f"Query: {query} is {status}")
    time.sleep(10)
    execution = athena.get_query_execution(QueryExecutionId=execution_id)['QueryExecution']
    status = execution['Status']['State']
print(f"Query: {query} is {status}")

if status == 'SUCCEEDED':
    # Download query result from S3 bucket.
    # OutputLocation comes from get_query_execution (start_query_execution only
    # returns the ID); its last path segment is the result file name.
    results_key = execution['ResultConfiguration']['OutputLocation'].split('/')[-1]
    results_object = s3.Object('mybucket', f'myquery/{results_key}')
    results_content = results_object.get()['Body'].read().decode('utf-8')
    print(results_content)

    # Remove metadata from S3 bucket
    del_response = s3.Object('mybucket', f'myquery/{results_key}.metadata').delete()
    print(del_response)
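In this example, we first create an S3 resource and an Athena client. We then set a database and a query, and run the query with athena.start_query_execution(). Once the query has been submitted, we check whether it succeeded and then download the query result from the S3 bucket. Finally, we call s3.Object().delete() to delete the metadata object stored alongside the query result.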
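The example hard-codes the bucket name and the myquery/ prefix when it builds the object keys. As a minimal sketch, the bucket, the result key and the metadata key can instead be derived from the OutputLocation value itself. The helper names split_s3_uri and result_and_metadata_keys, and the sample URI, are hypothetical and used only for illustration. The sketch also assumes the metadata object is named like the result file with a .metadata suffix, as in the code above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into ('bucket', 'path/to/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def result_and_metadata_keys(output_location):
    """Return (bucket, result_key, metadata_key) for a query's OutputLocation.

    Assumption: the metadata object sits next to the result file and is
    named '<result file>.metadata', matching the example above.
    """
    bucket, result_key = split_s3_uri(output_location)
    return bucket, result_key, f"{result_key}.metadata"


# Hypothetical OutputLocation value, for illustration only
bucket, result_key, metadata_key = result_and_metadata_keys(
    "s3://mybucket/myquery/abc123.csv"
)
print(bucket, result_key, metadata_key)
```

With these values, the delete step becomes s3.Object(bucket, metadata_key).delete(), so the bucket and prefix no longer have to be repeated by hand.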