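To fix AWS BlazingText not recording an objective metric for a supervised training job, work through the following steps:
Make sure the training job is set up correctly and the hyperparameters are valid. In particular, check the following:
The mode hyperparameter (not objective) is set to supervised to enable supervised learning.
The objective metric is not set through an objective_metric hyperparameter. BlazingText is a built-in algorithm and emits its own metrics; in supervised mode these are typically train:accuracy and, only when a validation channel is supplied, validation:accuracy.
Custom metric definitions such as [{"Name": "accuracy", "Regex": "accuracy: ([0-9\\.]+)"}] are not hyperparameters either. They belong in AlgorithmSpecification.MetricDefinitions and are meant for custom training containers; for built-in algorithm images SageMaker defines the metrics itself.
The training job has an output path for storing its output, for example an S3 bucket.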
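The request pieces that matter here can be sketched as plain Python before any AWS call is made. This is a minimal sketch, not the only way to do it: the helper names to_string_hyperparameters and make_channel are hypothetical, and the S3 prefixes are placeholders. It relies on two facts: the HyperParameters map of create_training_job takes string values, and validation accuracy can only be reported if a channel named validation is present.

```python
def to_string_hyperparameters(params):
    """Return a copy of params with every value converted to str.

    The HyperParameters argument of create_training_job is a
    string-to-string map, so numeric values must be converted first.
    """
    return {name: str(value) for name, value in params.items()}


def make_channel(name, s3_uri, content_type="text/plain"):
    """Build one InputDataConfig entry that reads from an S3 prefix."""
    return {
        "ChannelName": name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": content_type,
        "CompressionType": "None",
    }


# Hyperparameters for supervised mode, converted to strings.
hyperparameters = to_string_hyperparameters(
    {"mode": "supervised", "epochs": 10, "learning_rate": 0.01, "vector_dim": 10}
)

# Placeholder S3 prefixes (assumptions); the "validation" channel is what
# allows a validation metric to be reported.
input_data_config = [
    make_channel("train", "s3://your-input-data-path/train"),
    make_channel("validation", "s3://your-input-data-path/validation"),
]
```

The resulting hyperparameters and input_data_config values can be passed as HyperParameters and InputDataConfig to create_training_job.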
The following sample uses the AWS SDK for Python (Boto3) to set up a BlazingText training job and read back the metrics it recorded:
import boto3
# Define the hyperparameters. The API expects every value as a string.
# Metrics are not configured here: objective_metric and metric_definitions
# are not BlazingText hyperparameters.
hyperparameters = {
    "mode": "supervised",
    "epochs": "10",
    "learning_rate": "0.01",
    "vector_dim": "10",
}
# Input path for the training job
input_data_path = "s3://your-input-data-path"
# Output path for the training job
output_data_path = "s3://your-output-data-path"
# Create the BlazingText training job
client = boto3.client("sagemaker")
response = client.create_training_job(
    TrainingJobName="blazing-text-training-job",
    AlgorithmSpecification={
        # Replace with the full, region-specific ECR URI of the BlazingText image;
        # a bare "blazingtext:latest" is not a valid image URI.
        "TrainingImage": "<account-id>.dkr.ecr.<region>.amazonaws.com/blazingtext:latest",
        "TrainingInputMode": "File",
    },
    HyperParameters=hyperparameters,
    # Add a second channel named "validation" to have validation accuracy reported.
    InputDataConfig=[
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_data_path,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
            "ContentType": "text/plain",
            "CompressionType": "None",
        }
    ],
    OutputDataConfig={
        "S3OutputPath": output_data_path,
    },
    ResourceConfig={
        "InstanceType": "ml.c4.2xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 30,
    },
    StoppingCondition={
        "MaxRuntimeInSeconds": 86400,
    },
    # Replace with the ARN of an IAM role that SageMaker can assume
    RoleArn="arn:aws:iam::your-role-arn",
)
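# Wait for the training job to finish. The waiter sleeps between polls
# instead of calling the API in a tight loop.
waiter = client.get_waiter("training_job_completed_or_stopped")
waiter.wait(
    TrainingJobName="blazing-text-training-job",
    # Poll once a minute for up to 24 hours, matching MaxRuntimeInSeconds
    WaiterConfig={"Delay": 60, "MaxAttempts": 1440},
)
response = client.describe_training_job(TrainingJobName="blazing-text-training-job")
status = response["TrainingJobStatus"]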
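# Read the final metrics recorded for the training job.
# The list is missing or empty when no metric was recorded.
final_metrics = response.get("FinalMetricDataList", [])
if not final_metrics:
    print(f"No metrics recorded (job status: {status}); check the job's CloudWatch logs.")
for metric in final_metrics:
    metric_name = metric["MetricName"]
    metric_value = metric["Value"]
    print(f"Final {metric_name}: {metric_value}")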
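Adjust the parameters and paths to match your setup. The code sets up a BlazingText training job and, once training finishes, prints the final metrics recorded for the job.
I hope this sample helps you resolve the issue.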
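To check for one specific metric rather than whatever happens to be first in the list, the response can be searched by name. A minimal sketch, assuming the response has the shape returned by describe_training_job; the helper name find_final_metric is hypothetical, and sample_description is a hand-written stand-in with made-up values, not real job output.

```python
def find_final_metric(job_description, metric_name):
    """Return the value of metric_name from FinalMetricDataList, or None.

    None means the job did not record that metric, which is the
    symptom to investigate.
    """
    for metric in job_description.get("FinalMetricDataList", []):
        if metric.get("MetricName") == metric_name:
            return metric.get("Value")
    return None


# Hand-written stand-in for a describe_training_job response (made-up values).
sample_description = {
    "TrainingJobStatus": "Completed",
    "FinalMetricDataList": [
        {"MetricName": "train:accuracy", "Value": 0.5},
        {"MetricName": "validation:accuracy", "Value": 0.25},
    ],
}

value = find_final_metric(sample_description, "validation:accuracy")
if value is None:
    print("validation:accuracy was not recorded")
else:
    print(f"validation:accuracy = {value}")
```

With a real job, pass the dictionary returned by describe_training_job in place of sample_description. A None result for validation:accuracy usually points to a missing validation channel.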