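How to fix a failing AI Platform Pipelines instance deployment
We use AI Platform Pipelines to manage Kubeflow Pipelines installations on GKE clusters. However, the usual deployment process through the UI appears to have stopped working. When I try to deploy a pipelines instance to an existing cluster, I get this error: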
Failed to create CustomResourceDefinition.
{"metadata":{},"status":"Failure","message":"CustomResourceDefinition.apiextensions.k8s.io \"applications.app.k8s.io\" is invalid: [spec.versions[0].schema.openAPIV3Schema: Required value: schemas are required,spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1beta1\",Served:false,Storage:false,Schema:(*apiextensions.CustomResourceValidation)(nil),Subresources:(*apiextensions.CustomResourceSubresources)(nil),AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: must have exactly one version marked as storage version,status.storedVersions: Invalid value: []string(nil): must have at least one stored version,metadata.annotations[api-approved.kubernetes.io]: Required value: protected groups must have approval annotation \"api-approved.kubernetes.io\",see https://github.com/kubernetes/enhancements/pull/1111]","reason":"Invalid","details":{"name":"applications.app.k8s.io","group":"apiextensions.k8s.io","kind":"CustomResourceDefinition","causes":[{"reason":"FieldValueRequired","message":"Required value: schemas are required","field":"spec.versions[0].schema.openAPIV3Schema"},{"reason":"FieldValueInvalid","message":"Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1beta1\",AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: must have exactly one version marked as storage version","field":"spec.versions"},"message":"Invalid value: []string(nil): must have at least one stored version","field":"status.storedVersions"},{"reason":"FieldValueRequired","message":"Required value: protected groups must have approval annotation \"api-approved.kubernetes.io\",see https://github.com/kubernetes/enhancements/pull/1111","field":"metadata.annotations[api-approved.kubernetes.io]"}]},"code":422}
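The payload above is a Kubernetes Status object, and the individual validation failures sit in its `details.causes` list. A minimal sketch for printing them one per line is below. The helper name `summarize_causes` and the `sample` payload are hypothetical: the sample is a hand-trimmed version of the error, with three of its causes and shortened messages, not the full response.

```python
import json


def summarize_causes(status_json: str) -> list[tuple[str, str, str]]:
    """Return (field, reason, message) for each cause in a Kubernetes Status object."""
    status = json.loads(status_json)
    # "details" may be missing or null on some failures, so fall back to an empty dict.
    details = status.get("details") or {}
    causes = details.get("causes") or []
    return [
        (c.get("field", ""), c.get("reason", ""), c.get("message", ""))
        for c in causes
    ]


# Hypothetical, abbreviated sample modeled on the error above (messages shortened).
sample = json.dumps({
    "status": "Failure",
    "reason": "Invalid",
    "code": 422,
    "details": {
        "name": "applications.app.k8s.io",
        "kind": "CustomResourceDefinition",
        "causes": [
            {
                "reason": "FieldValueRequired",
                "message": "Required value: schemas are required",
                "field": "spec.versions[0].schema.openAPIV3Schema",
            },
            {
                "reason": "FieldValueInvalid",
                "message": "must have exactly one version marked as storage version",
                "field": "spec.versions",
            },
            {
                "reason": "FieldValueRequired",
                "message": "protected groups must have approval annotation",
                "field": "metadata.annotations[api-approved.kubernetes.io]",
            },
        ],
    },
})

for field, reason, message in summarize_causes(sample):
    print(f"{field}: {reason}: {message}")
```

Read this way, the error breaks down into separate complaints about the `applications.app.k8s.io` CustomResourceDefinition: a missing schema, no storage version, and a missing approval annotation.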
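I have tried multiple clusters, two separate projects, two separate regions, and three different GKE versions: 1.18.17-gke.700, 1.18.17-gke.1200, and 1.19.9-gke.1900. The same error occurs in every case. The clusters meet the resource requirements outlined in the GCP documentation.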
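There isn't much information to go on here, and I'm not sure how to debug this. If there is other useful information I can collect, please let me know. I can't determine which version of Kubeflow Pipelines is being used; as far as I can tell, that only becomes visible after the instance has been created.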
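Is this something I should raise with GCP support, or should I dig further into the error myself? I searched for some of the specific errors in the failure message above but didn't find much. The pull request it mentions has already been merged: https://github.com/kubernetes/enhancements/pull/1111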