AWS Textract - UnsupportedDocumentException

Founder

2024-11-18 12:00:17

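When a call to the AWS Textract API fails with UnsupportedDocumentException, Textract does not support the format of the input document. Textract accepts JPEG, PNG, PDF, and TIFF files. The following code example shows how to handle the error: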

import boto3

def detect_text_from_document(document_path):
    # Create the Textract client
    client = boto3.client('textract', region_name='us-west-2')

    # Read the document as raw bytes
    with open(document_path, 'rb') as file:
        document = file.read()

    # Send the synchronous text detection request
    try:
        response = client.detect_document_text(Document={'Bytes': document})
    except client.exceptions.UnsupportedDocumentException:
        # Textract rejected the file format; add handling logic here or log the error
        print("Unsupported document type")
        return None

    # Join the text of every LINE block (Blocks[0] is the PAGE block and has no 'Text' key)
    lines = [block['Text'] for block in response['Blocks'] if block['BlockType'] == 'LINE']
    return '\n'.join(lines)

# Specify the document path and call the function
document_path = 'example_document.pdf'
text = detect_text_from_document(document_path)
if text:
    print(text)

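The code above uses the AWS SDK for Python (Boto3) to create a Textract client and reads the contents of the specified document. It then sends a text detection request and extracts the text from the response. If the request fails with UnsupportedDocumentException, the function prints a message saying the document type is not supported and returns None.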

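Note that the code above only demonstrates how to catch the unsupported document exception; it does not handle any particular document type. In practice, you may need to handle different document types in different ways.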
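
One possible way to treat document types differently is to inspect the first bytes of the file before sending it to Textract. The sketch below is only an illustration under assumptions: the names `SIGNATURES`, `sniff_document_format`, and `is_probably_supported` are hypothetical helpers, not part of Boto3 or the Textract API, and the check covers only the well-known file signatures of the four formats Textract accepts (JPEG, PNG, PDF, TIFF).

```python
# Hypothetical pre-flight check; not part of boto3 or the Textract API.
# Maps well-known file signatures ("magic bytes") to format names.
SIGNATURES = {
    b'\x89PNG\r\n\x1a\n': 'png',
    b'\xff\xd8\xff': 'jpeg',
    b'%PDF-': 'pdf',
    b'II*\x00': 'tiff',  # little-endian TIFF
    b'MM\x00*': 'tiff',  # big-endian TIFF
}

def sniff_document_format(data):
    """Return the detected format name, or None if no signature matches."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None

def is_probably_supported(document_path):
    """Read the first bytes of a file and report whether the format looks supported."""
    with open(document_path, 'rb') as file:
        header = file.read(8)  # the longest signature above is 8 bytes
    return sniff_document_format(header) is not None
```

A caller could run `is_probably_supported(document_path)` first and skip or convert files that fail the check. A matching signature does not guarantee that Textract accepts the file (a damaged or password-protected document may still be rejected), so the exception handler is still needed.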
