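When starting an AWS Transcribe job, set "ShowSpeakerLabels" to True inside the job's "Settings" (together with "MaxSpeakerLabels", which the API expects alongside it), and the service will identify the different speakers in the transcript. With this enabled, the identified speakers and their timestamps are available under "results.speaker_labels" in the transcript JSON. The following Python code is an example: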
import json
import time
import urllib.request

import boto3

transcribe = boto3.client("transcribe")

job_name = "test-job"
file_uri = "s3://test-bucket/test-audio-file.mp3"

transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={"MediaFileUri": file_uri},
    MediaFormat="mp3",
    LanguageCode="en-US",
    # Enables speaker identification; MaxSpeakerLabels=2 is an assumed value
    Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
)

# Poll until the job finishes, pausing between requests
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName=job_name)["TranscriptionJob"]
    status = job["TranscriptionJobStatus"]
    if status in ["COMPLETED", "FAILED"]:
        break
    time.sleep(5)

if status == "COMPLETED":
    # The job response only holds a URI to the transcript JSON, so download it
    with urllib.request.urlopen(job["Transcript"]["TranscriptFileUri"]) as response:
        results = json.load(response)["results"]
    # Map each spoken word's start time to its text
    words = {
        item["start_time"]: item["alternatives"][0]["content"]
        for item in results["items"]
        if item["type"] == "pronunciation"
    }
    for segment in results["speaker_labels"]["segments"]:
        text = " ".join(words.get(item["start_time"], "") for item in segment["items"])
        print(f"Speaker {segment['speaker_label']} ({segment['start_time']} - {segment['end_time']}): {text}")
This prints the transcribed text together with the corresponding speaker labels.
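To make the layout of "results.speaker_labels" more concrete, here is a minimal offline sketch that groups words into speaker turns. The helper name speaker_turns and the SAMPLE_RESULTS data are hypothetical and hand-written for illustration; the field names follow the transcript JSON layout as I understand it (segments carrying a speaker_label and per-word items, with the words themselves under results.items), so treat it as an assumption to check against a real transcript.

```python
# Hand-written sample that imitates the "results" object of a transcript
# file produced with speaker labels enabled (illustrative values only).
SAMPLE_RESULTS = {
    "speaker_labels": {
        "speakers": 2,
        "segments": [
            {
                "speaker_label": "spk_0", "start_time": "0.0", "end_time": "0.9",
                "items": [
                    {"speaker_label": "spk_0", "start_time": "0.0", "end_time": "0.4"},
                    {"speaker_label": "spk_0", "start_time": "0.5", "end_time": "0.9"},
                ],
            },
            {
                "speaker_label": "spk_1", "start_time": "1.2", "end_time": "1.6",
                "items": [
                    {"speaker_label": "spk_1", "start_time": "1.2", "end_time": "1.6"},
                ],
            },
        ],
    },
    "items": [
        {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
         "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
        {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
         "alternatives": [{"confidence": "0.98", "content": "there"}]},
        {"type": "punctuation",
         "alternatives": [{"confidence": "0.0", "content": "."}]},
        {"type": "pronunciation", "start_time": "1.2", "end_time": "1.6",
         "alternatives": [{"confidence": "0.97", "content": "Hi"}]},
        {"type": "punctuation",
         "alternatives": [{"confidence": "0.0", "content": "."}]},
    ],
}


def speaker_turns(results):
    """Return a list of (speaker_label, text) tuples, one per speaker turn."""
    # Look up the speaker of each word by the word's start time.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = segment["speaker_label"]

    turns = []
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation carries no timestamps; attach it to the previous word.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"])
        if turns and turns[-1][0] == speaker:
            # Same speaker is still talking: extend the current turn.
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


if __name__ == "__main__":
    for speaker, text in speaker_turns(SAMPLE_RESULTS):
        print(f"{speaker}: {text}")
```

Matching words to speakers by start time keeps the sketch short; consecutive words from the same speaker are merged into one turn so the output reads like a dialogue.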