nodeAffinity and nodeAntiAffinity are being ignored

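I've run into a problem: I'm trying to restrict a Deployment so that it avoids a specific node pool, but nodeAffinity and nodeAntiAffinity don't seem to have any effect.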

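We are running DOKS (DigitalOcean Managed Kubernetes) v1.19.3.
We have two node pools, infra and clients, and the nodes in each pool are labeled accordingly.
In this case, we want to avoid deploying to nodes labeled "infra".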

For whatever reason, no matter what configuration I use, Kubernetes seems to schedule the pod randomly across both node pools.

See the configuration below, followed by the scheduling results.

deployment.yaml snippet

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  namespace: "test"
  labels:
    app: wordpress
    client: "test"
    product: hosted-wordpress
    version: v1
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: wordpress
      client: "test"
  template:
    metadata:
      labels:
        app: wordpress
        client: "test"
        product: hosted-wordpress
        version: v1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                - key: doks.digitalocean.com/node-pool
                  operator: NotIn
                  values:
                  - infra

Node description snippet. Note the label "doks.digitalocean.com/node-pool=infra".

kubectl describe node infra-3dmga

Name:               infra-3dmga
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=s-2vcpu-4gb
                    beta.kubernetes.io/os=linux
                    doks.digitalocean.com/node-id=67d84a52-8d08-4b19-87fe-1d837ba46eb6
                    doks.digitalocean.com/node-pool=infra
                    doks.digitalocean.com/node-pool-id=2e0f2a1d-fbfa-47e9-9136-c897e51c014a
                    doks.digitalocean.com/version=1.19.3-do.2
                    failure-domain.beta.kubernetes.io/region=tor1
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=infra-3dmga
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=s-2vcpu-4gb
                    region=tor1
                    topology.kubernetes.io/region=tor1
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.137.0.230
                    csi.volume.kubernetes.io/nodeid: {"dobs.csi.digitalocean.com":"222551559"}
                    io.cilium.network.ipv4-cilium-host: 10.244.0.139
                    io.cilium.network.ipv4-health-ip: 10.244.0.209
                    io.cilium.network.ipv4-pod-cidr: 10.244.0.128/25
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 20 Dec 2020 20:17:20 -0800
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  infra-3dmga
  AcquireTime:     <unset>
  RenewTime:       Fri, 12 Feb 2021 08:04:09 -0800

Sometimes this results in

kubectl get po -n test -o wide

NAME                         READY   STATUS    RESTARTS   AGE   IP             NODE          NOMINATED NODE   READINESS GATES
wordpress-5bfcb6f44b-2j7kv   5/5     Running   0          1h    10.244.0.107   infra-3dmga   <none>           <none>

Other times it results in

kubectl get po -n test -o wide

NAME                         READY   STATUS    RESTARTS   AGE   IP             NODE            NOMINATED NODE   READINESS GATES
wordpress-5bfcb6f44b-b42wj   5/5     Running   0          5m    10.244.0.107   clients-3dmem   <none>           <none>

I have also tried using nodeAntiAffinity, with similar results.

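Finally, I even tried creating test labels instead of using the built-in labels from DigitalOcean, and I got the same result (affinity just doesn't seem to work for me at all).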

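I'm hoping someone can help me resolve this, or even point out a silly mistake in my config, because trying to solve this issue has been driving me crazy (and it is a useful feature, when it works).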

Thanks,

Solution

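Great news!

I have finally resolved this issue.

The problem was, of course, "user error".

There was an extra spec line further down in the config, and it was very well hidden.

Originally, before switching to StatefulSets, we were using Deployments, and I had a pod spec hostname entry that was overriding the spec at the top of the file.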
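
For anyone hitting the same symptom, here is a minimal sketch of what this kind of mistake can look like. It is a hypothetical reconstruction, not the actual file: the hostname value and the exact position of the second spec key are assumptions. Some YAML parsers accept a repeated key in the same mapping and silently keep only the last occurrence, which would match the behavior described here: the affinity block is dropped and the scheduler never sees it.

```yaml
# BROKEN (hypothetical): two `spec` keys under the same `template` mapping.
template:
  spec:                   # first pod spec, holds the affinity rule
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: doks.digitalocean.com/node-pool
                  operator: NotIn
                  values:
                    - infra
  # other fields omitted; much further down the file:
  spec:                   # duplicate key; a last-key-wins parser keeps only this block
    hostname: wordpress   # assumed value, for illustration only
---
# FIXED: a single `spec` that carries both settings.
template:
  spec:
    hostname: wordpress   # assumed value, for illustration only
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: doks.digitalocean.com/node-pool
                  operator: NotIn
                  values:
                    - infra
```

Both documents are fragments rather than complete manifests. A quick way to confirm this class of problem is to inspect the object stored in the cluster (for example with kubectl get ... -o yaml) and check whether the affinity block is actually present in the pod template.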

Thanks to @WytrzymałyWiktor and @Manjul for the suggestions!

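In the Deployment file, you used operator: NotIn, which acts as anti-affinity.

Please use operator: In to achieve node affinity. For example, if we want the pods to run on nodes that have the clients label: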

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  namespace: "test"
  labels:
    app: wordpress
    client: "test"
    product: hosted-wordpress
    version: v1
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: wordpress
      client: "test"
  template:
    metadata:
      labels:
        app: wordpress
        client: "test"
        product: hosted-wordpress
        version: v1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                - key: "doks.digitalocean.com/node-pool"
                  operator: In
                  values: ["clients"]  # use the label value that matches your node pool
