How to solve: K8s SQL2019 HA containers - dude, where's my pod?
New to K8s. So far I have the following:
- docker-ce-19.03.8
- docker-ce-cli-19.03.8
- containerd.io-1.2.13
- kubelet-1.18.5
- kubeadm-1.18.5
- kubectl-1.18.5
- etcd-3.4.10
- Flannel for the pod overlay network
- All host-level prep done (SELinux permissive, swap disabled, etc.)
- All CentOS 7, on-prem in a vSphere environment (6.7U3)
I've built out all the configs and currently have:
- A 3-node external/standalone etcd cluster with encrypted peer and client-server traffic
- A 3-node control-plane cluster - kubeadm init bootstrapped with x509s and targeting the 3 etcd nodes (so stacked etcd never comes into play)
- HAProxy and Keepalived installed on two of the etcd cluster members to load-balance access to the API server endpoints on the control plane (TCP 6443)
- 6 worker nodes
- Storage configured with the in-tree VMware Cloud Provider (I know it's deprecated) - and yes, that is my DEFAULT SC
Status checks:
- kubectl cluster-info reports:
[me@km-01 pods]$ kubectl cluster-info
Kubernetes master is running at https://k8snlb:6443
KubeDNS is running at https://k8snlb:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
- kubectl get all --all-namespaces reports:
[me@km-01 pods]$ kubectl get all --all-namespaces -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ag1 pod/mssql-operator-68bcc684c4-rbzvn 1/1 Running 0 86m 10.10.4.133 kw-02.bogus.local <none> <none>
kube-system pod/coredns-66bff467f8-k6m94 1/1 Running 4 20h 10.10.0.11 km-01.bogus.local <none> <none>
kube-system pod/coredns-66bff467f8-v848r 1/1 Running 4 20h 10.10.0.10 km-01.bogus.local <none> <none>
kube-system pod/kube-apiserver-km-01.bogus.local 1/1 Running 8 10h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-controller-manager-km-01.bogus.local 1/1 Running 2 10h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-7l76c 1/1 Running 0 10h x.x.x..30 kw-01.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-8kft7 1/1 Running 0 10h x.x.x..33 kw-04.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-r5kqv 1/1 Running 0 10h x.x.x..34 kw-05.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-t6xcd 1/1 Running 0 10h x.x.x..35 kw-06.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-vhnx8 1/1 Running 0 10h x.x.x..32 kw-03.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-xdk2n 1/1 Running 0 10h x.x.x..31 kw-02.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-z4kfk 1/1 Running 4 20h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-proxy-49hsl 1/1 Running 0 10h x.x.x..35 kw-06.bogus.local <none> <none>
kube-system pod/kube-proxy-62klh 1/1 Running 0 10h x.x.x..34 kw-05.bogus.local <none> <none>
kube-system pod/kube-proxy-64d5t 1/1 Running 0 10h x.x.x..30 kw-01.bogus.local <none> <none>
kube-system pod/kube-proxy-6ch42 1/1 Running 4 20h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-proxy-9css4 1/1 Running 0 10h x.x.x..32 kw-03.bogus.local <none> <none>
kube-system pod/kube-proxy-hgrx8 1/1 Running 0 10h x.x.x..33 kw-04.bogus.local <none> <none>
kube-system pod/kube-proxy-ljlsh 1/1 Running 0 10h x.x.x..31 kw-02.bogus.local <none> <none>
kube-system pod/kube-scheduler-km-01.bogus.local 1/1 Running 5 20h x.x.x..25 km-01.bogus.local <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
ag1 service/ag1-primary NodePort 10.104.183.81 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:30405/TCP 85m role.ag.mssql.microsoft.com/ag1=primary,type=sqlservr
ag1 service/ag1-secondary NodePort 10.102.52.31 x.x.x..30,x.x.x..35 1433:30713/TCP 85m role.ag.mssql.microsoft.com/ag1=secondary,type=sqlservr
ag1 service/mssql1 NodePort 10.96.166.108 x.x.x..30,x.x.x..35 1433:32439/TCP 86m name=mssql1,type=sqlservr
ag1 service/mssql2 NodePort 10.109.146.58 x.x.x..30,x.x.x..35 1433:30636/TCP 86m name=mssql2,type=sqlservr
ag1 service/mssql3 NodePort 10.101.234.186 x.x.x..30,x.x.x..35 1433:30862/TCP 86m name=mssql3,type=sqlservr
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h <none>
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 20h k8s-app=kube-dns
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/kube-flannel-ds-amd64 7 7 7 7 7 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-amd64 app=flannel
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-arm app=flannel
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-arm64 app=flannel
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-ppc64le app=flannel
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-s390x app=flannel
kube-system daemonset.apps/kube-proxy 7 7 7 7 7 kubernetes.io/os=linux 20h kube-proxy k8s.gcr.io/kube-proxy:v1.18.7 k8s-app=kube-proxy
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
ag1 deployment.apps/mssql-operator 1/1 1 1 86m mssql-operator mcr.microsoft.com/mssql/ha:2019-CTP2.1-ubuntu app=mssql-operator
kube-system deployment.apps/coredns 2/2 2 2 20h coredns k8s.gcr.io/coredns:1.6.7 k8s-app=kube-dns
NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
ag1 replicaset.apps/mssql-operator-68bcc684c4 1 1 1 86m mssql-operator mcr.microsoft.com/mssql/ha:2019-CTP2.1-ubuntu app=mssql-operator,pod-template-hash=68bcc684c4
kube-system replicaset.apps/coredns-66bff467f8 2 2 2 20h coredns k8s.gcr.io/coredns:1.6.7 k8s-app=kube-dns,pod-template-hash=66bff467f8
On to the problem: there are plenty of articles out there covering the SQL2019 HA build. As far as I can tell, every one of them is in the cloud, whereas mine is on-prem in a vSphere environment. They make it look pretty simple: run 3 scripts in this order: operator.yaml, sql.yaml, and ag-service.yaml.
In the blogs that actually screenshot the environment afterwards, there should be at least 7 pods (1 Operator, 3 SQL Init, 3 SQL). If you look at my get all --all-namespaces output above, I have everything (and in a Running state), but no pods beyond the running Operator...???
I actually collapsed the control plane back down to a single node just to isolate the logs. /var/log/containers/* and /var/log/pods/* contain nothing of value that points to a storage problem or any other reason the pods don't exist. It's probably also worth noting that I started out on the latest sql2019 tag, 2019-latest, but when I hit the same behavior there I decided to try the older bits, since so many blogs are based on CTP 2.1.
I can create PVs and PVCs via the VCP storage provisioner. I have my secrets and can see them in the Secrets store.
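For what it's worth, those two prerequisites can be sanity-checked up front. A minimal sketch - the sql-secrets name and sapassword key match the secretKeyRef in the manifests below, and the password value is only a placeholder:

```shell
# Create the SA password secret the SqlServer specs reference
# (name sql-secrets / key sapassword, as in the manifests; value is a placeholder)
kubectl create secret generic sql-secrets -n ag1 \
  --from-literal=sapassword='MyStr0ng-Placeh0lder!'

# Confirm the secret landed in the ag1 namespace
kubectl get secret sql-secrets -n ag1

# Confirm the default StorageClass exists and that claims bind against it
kubectl get storageclass
kubectl get pv,pvc -n ag1
```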
I'm at a loss to explain why the pods are missing, or where to look beyond journalctl, the daemons themselves, and /var/log; I see no sign that there was ever even an attempt to create them - the kubectl apply -f mssql-server2019.yaml that I adapted runs to completion with no errors, indicating that 3 sql objects and 3 sql services were created. But in any case, here is the file, targeting CTP2.1:
cat << EOF > mssql-server2019.yaml
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql1, type: sqlservr}
  name: mssql1
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql1, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql1, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql2, type: sqlservr}
  name: mssql2
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql2, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql2, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql3, type: sqlservr}
  name: mssql3
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql3, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql3, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
EOF
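When the apply succeeds but no pods ever appear, the trail usually lives in the namespace events and the operator's own log rather than in /var/log on the nodes. A few hedged checks, with the resource and namespace names taken from the output above:

```shell
# Did the SqlServer custom resources actually get created?
kubectl get sqlservers.mssql.microsoft.com -n ag1

# Namespace events often record spec rejections that never reach node logs
kubectl get events -n ag1 --sort-by=.lastTimestamp

# The operator's log is where pod-creation attempts (or failures) show up
kubectl logs -n ag1 deploy/mssql-operator
```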
Edit 1: kubectl logs -n ag1 mssql-operator-*:
[sqlservers] 2020/08/14 14:36:48 Creating custom resource definition
[sqlservers] 2020/08/14 14:36:48 Created custom resource definition
[sqlservers] 2020/08/14 14:36:48 Waiting for custom resource definition to be available
[sqlservers] 2020/08/14 14:36:49 Watching for resources...
[sqlservers] 2020/08/14 14:37:08 Creating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql1 in namespace ag1 ...
[sqlservers] 2020/08/14 14:37:08 Creating ConfigMap ag1
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error creating ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field,parsing 627 ...:{},"k:{\"... at {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"ag1","namespace":"ag1","selfLink":"/api/v1/namespaces/ag1/configmaps/ag1","uid":"33af6232-4464-4290-bb14-b21e8f72e361","resourceVersion":"314186","creationTimestamp":"2020-08-14T14:37:08Z","ownerReferences":[{"apiVersion":"mssql.microsoft.com/v1","kind":"ReplicationController","name":"mssql1","uid":"e71a7246-2776-4d96-9735-844ee136a37d","controller":false}],"managedFields":[{"manager":"mssql-server-k8s-operator","operation":"Update","time":"2020-08-14T14:37:08Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"e71a7246-2776-4d96-9735-844ee136a37d\"}":{".":{},"f:apiVersion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}}}}]}}
[sqlservers] 2020/08/14 14:37:08 Updating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql2 in namespace ag1 ...
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error getting ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field,"f:uid":{}}}}}}]}}
[sqlservers] 2020/08/14 14:37:08 Updating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql3 in namespace ag1 ...
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error getting ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field,"f:uid":{}}}}}}]}}
I've gone over my operator and mssql2019 yamls (particularly around the SqlServer kind, since that appears to be where it's failing) and can't identify any obvious inconsistency or difference.
Solution

Your operator is running:

ag1 pod/mssql-operator-68bcc684c4-rbzvn 1/1 Running 0 86m 10.10.4.133 kw-02.bogus.local <none> <none>

I would start by looking at the logs there:
kubectl -n ag1 logs pod/mssql-operator-68bcc684c4-rbzvn
Most likely it wants to interact with a cloud provider (i.e. Azure) and VMware is not supported, but check what shows up in the logs.
Update:

Judging from the logs you posted, it looks like you are running K8s 1.18 and the operator is not compatible with it. It is trying to create a ConfigMap with a spec that the kube-apiserver rejects.
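Some context on why 1.18 matters here: Kubernetes 1.18 attaches a metadata.managedFields section (server-side apply tracking) to every object, and the JSON in the operator's error fails exactly while parsing that section - the map keys look like k:{"uid":...} rather than plain field names, which the CTP-era operator's generated decoder predates. One way to see the field it chokes on (jq assumed to be available):

```shell
# Show the managedFields block K8s 1.18 adds to the ConfigMap -
# the 'k:{"uid":...}' keys inside it are what the old parser trips over
kubectl get configmap ag1 -n ag1 -o json | jq '.metadata.managedFields'
```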
✌️
"run 3 scripts in this order: operator.yaml, sql.yaml and ag-service.yaml"

I just ran this on my GKE cluster and got a similar result when I tried to run only those three files.

If you run them without preparing the PVs and PVCs first (.././sample-deployment-script/templates/pv*.yaml):
$ git clone https://github.com/microsoft/sql-server-samples.git
$ cd sql-server-samples/samples/features/high\ availability/Kubernetes/sample-manifest-files/
$ kubectl create -f operator.yaml
namespace/ag1 created
serviceaccount/mssql-operator created
clusterrole.rbac.authorization.k8s.io/mssql-operator-ag1 created
clusterrolebinding.rbac.authorization.k8s.io/mssql-operator-ag1 created
deployment.apps/mssql-operator created
$ kubectl create -f sqlserver.yaml
sqlserver.mssql.microsoft.com/mssql1 created
service/mssql1 created
sqlserver.mssql.microsoft.com/mssql2 created
service/mssql2 created
sqlserver.mssql.microsoft.com/mssql3 created
service/mssql3 created
$ kubectl create -f ag-services.yaml
service/ag1-primary created
service/ag1-secondary created
you will end up with:
kubectl get pods -n ag1
NAME READY STATUS RESTARTS AGE
mssql-initialize-mssql1-js4zc 0/1 CreateContainerConfigError 0 6m12s
mssql-initialize-mssql2-72d8n 0/1 CreateContainerConfigError 0 6m8s
mssql-initialize-mssql3-h4mr9 0/1 CreateContainerConfigError 0 6m6s
mssql-operator-658558b57d-6xd95 1/1 Running 0 6m33s
mssql1-0 1/2 CrashLoopBackOff 5 6m12s
mssql2-0 1/2 CrashLoopBackOff 5 6m9s
mssql3-0 0/2 Pending 0 6m6s
I found that the failing mssql<N> pods are part of statefulset.apps/mssql<N>, while the mssql-initialize-mssql<N> pods are part of job.batch/mssql-initialize-mssql<N>.
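That ownership chain can be read straight off a pod's ownerReferences; a quick sketch, using the pod names from the listing above:

```shell
# Which controller owns a given pod? Prints e.g. "StatefulSet/mssql1"
kubectl get pod mssql1-0 -n ag1 \
  -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}{"\n"}'

# Same question for an init pod - expected to point at a Job
kubectl get pod mssql-initialize-mssql1-js4zc -n ag1 \
  -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}{"\n"}'
```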
After adding the PVs and PVCs, it looks as follows:
$ kubectl get all -n ag1
NAME READY STATUS RESTARTS AGE
mssql-operator-658558b57d-pgx74 1/1 Running 0 20m
plus 3 sqlservers.mssql.microsoft.com objects:
$ kubectl get sqlservers.mssql.microsoft.com -n ag1
NAME AGE
mssql1 64m
mssql2 64m
mssql3 64m
Which is exactly why it looks just like what's specified in the aforementioned files.
However, if you run:
sql-server-samples/samples/features/high availability/Kubernetes/sample-deployment-script/$ ./deploy-ag.py deploy --dry-run
the configs will be generated automatically.

Without --dry-run, and with the configs applied (and the PVs + PVCs set up properly), it gives you the 7 pods.

It would be useful to compare the auto-generated configs with the ones you have (and to compare running only the subset of 3 files against what deploy-ag.py does).
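One hedged way to do that comparison, assuming the dry run writes its rendered manifests somewhere inspectable (the output path below is illustrative - check where the script actually writes):

```shell
# Render the configs without applying them
./deploy-ag.py deploy --dry-run

# Diff a generated manifest against the hand-adapted one
# (the generated path is illustrative; adjust to the dry run's real output)
diff -u generated-output/sqlserver.yaml mssql-server2019.yaml
```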
P.S.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.11-dispatcher"
Server Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.12-gke.2"