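How to resolve the error: tabular dataset treated as an image dataset in Vertex AI Pipelines custom training
I am using Vertex AI Pipelines to run custom training on tabular data.
- I ran the Python code below.
- I created a pipeline run from the generated JSON.
- When training started, the following error occurred.
Why is the tabular dataset being treated as an image dataset? What is going wrong?
Environment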
- Python 3.7.3
- kfp==1.6.2
- kfp-pipeline-spec==0.1.7
- kfp-server-api==1.6.0
Error message
Traceback (most recent call last):
  File "/opt/python3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/python3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/python3.7/lib/python3.7/site-packages/google_cloud_pipeline_components/aiplatform/remote_runner.py", line 284, in <module>
    main()
  File "/opt/python3.7/lib/python3.7/site-packages/google_cloud_pipeline_components/aiplatform/remote_runner.py", line 280, in main
    print(runner(args.cls_name, args.method_name, executor_input, kwargs))
  File "/opt/python3.7/lib/python3.7/site-packages/google_cloud_pipeline_components/aiplatform/remote_runner.py", line 236, in runner
    prepare_parameters(serialized_args[METHOD_KEY], method, is_init=False)
  File "/opt/python3.7/lib/python3.7/site-packages/google_cloud_pipeline_components/aiplatform/remote_runner.py", line 205, in prepare_parameters
    value = cast(value, param_type)
  File "/opt/python3.7/lib/python3.7/site-packages/google_cloud_pipeline_components/aiplatform/remote_runner.py", line 176, in cast
    return annotation_type(value)
  File "/opt/python3.7/lib/python3.7/site-packages/google/cloud/aiplatform/datasets/dataset.py", line 82, in __init__
    self._validate_metadata_schema_uri()
  File "/opt/python3.7/lib/python3.7/site-packages/google/cloud/aiplatform/datasets/dataset.py", line 100, in _validate_metadata_schema_uri
    f"{self.__class__.__name__} class can not be used to retrieve "
ValueError: ImageDataset class can not be used to retrieve dataset resource projects/nnnnnnnnnnnn/locations/us-central1/datasets/3781554739456507904, check the dataset type
Python code:
import datetime
from kfp.v2 import dsl, compiler
from kfp.v2.google.client import AIPlatformClient
import google_cloud_pipeline_components.aiplatform as gcc_ai

PROJECT = "my-project"
PIPELINE_NAME = "test-pipeline"
PIPELINE_ROOT_PATH = f"gs://test-pipeline-20210525/{PIPELINE_NAME}"

@dsl.pipeline(name=PIPELINE_NAME, pipeline_root=PIPELINE_ROOT_PATH)
def test_pipeline(display_name: str = f"{PIPELINE_NAME}-2021MMDD-nn"):
    dataset_create_op = gcc_ai.TabularDatasetCreateOp(
        project=PROJECT,
        display_name=display_name,
        gcs_source="gs://used_apartment/datasource/train.csv",
    )
    training_job_run_op = gcc_ai.CustomContainerTrainingJobRunOp(
        project=PROJECT,
        container_uri="us-central1-docker.pkg.dev/my-project/dataops-rc2021/custom-train:latest",
        staging_bucket="vertex_ai_staging_rc2021",
        base_output_dir="gs://used_apartment/cstm_img_scrf/artifact",
        model_serving_container_image_uri="us-central1-docker.pkg.dev/my-project/dataops-rc2021/custom-pred:latest",
        model_serving_container_predict_route="/",
        model_serving_container_health_route="/health",
        model_serving_container_ports=[8080],
        training_fraction_split=0.8,
        validation_fraction_split=0.1,
        test_fraction_split=0.1,
        dataset=dataset_create_op.outputs["dataset"],
    )

def run_pipeline(event=None, context=None):
    # Compile the pipeline using the kfp.v2.compiler.Compiler
    compiler.Compiler().compile(
        pipeline_func=test_pipeline, package_path="test-pipeline.json"
    )

if __name__ == "__main__":
    run_pipeline()
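Solution
This appears to be a bug in the CustomContainerTrainingJobRunOp component code rather than in the pipeline definition. We were able to reproduce the error.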
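For readers wondering how a tabular dataset could end up wrapped in an image class, the traceback shows `cast` calling `annotation_type(value)` and the `ImageDataset` constructor then rejecting the resource. One plausible reading, which the thread does not confirm, is that the cast picked a class from the parameter's type annotation without looking at the dataset's actual type. The sketch below only simulates that failure mode. Every name in it (`FakeDataset`, `FAKE_BACKEND`, `naive_cast`, `tolerant_cast`, `run`) is hypothetical and none of it is the real `remote_runner.py` or `aiplatform` code.

```python
from typing import Optional, Union

# Hypothetical stand-in for the service: maps a resource name to its real type.
FAKE_BACKEND = {"projects/p/locations/us-central1/datasets/1": "tabular"}


class FakeDataset:
    """Hypothetical wrapper that validates the dataset type on construction."""

    supported_schema = ""

    def __init__(self, resource_name: str):
        self.resource_name = resource_name
        if FAKE_BACKEND[resource_name] != self.supported_schema:
            raise ValueError(
                f"{self.__class__.__name__} class can not be used to retrieve "
                f"dataset resource {resource_name}, check the dataset type"
            )


class FakeImageDataset(FakeDataset):
    supported_schema = "image"


class FakeTabularDataset(FakeDataset):
    supported_schema = "tabular"


def run(dataset: Optional[Union[FakeImageDataset, FakeTabularDataset]] = None):
    """Hypothetical method whose parameter accepts several dataset classes."""
    return dataset


def naive_cast(value, annotation):
    """Assumed failure mode: always take the first member of a Union."""
    members = getattr(annotation, "__args__", None)
    target = members[0] if members else annotation
    return target(value)


def tolerant_cast(value, annotation):
    """One possible alternative: try each member until one validates."""
    members = getattr(annotation, "__args__", None) or (annotation,)
    for candidate in members:
        if candidate is type(None):
            continue  # skip the None half of Optional[...]
        try:
            return candidate(value)
        except ValueError:
            continue
    raise ValueError(f"no dataset class accepts {value}")


if __name__ == "__main__":
    name = "projects/p/locations/us-central1/datasets/1"
    annotation = run.__annotations__["dataset"]
    try:
        naive_cast(name, annotation)
    except ValueError as err:
        print("naive cast failed:", err)
    print("tolerant cast gave:", type(tolerant_cast(name, annotation)).__name__)
```

If the component really behaves like `naive_cast`, nothing in the pipeline definition above would be at fault, which is consistent with the answer's conclusion that the fix has to come from the component itself.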