Permissions error trying to run Prometheus on AWS EKS (Fargate only) with EFS

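I have an EKS cluster that is Fargate-only. I really don't want to manage instances myself. I'd like to deploy Prometheus to it, which requires a persistent volume. As of two months ago this should be possible with EFS (a managed NFS share). I feel I am almost there, but I cannot figure out what the current issue is.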

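What I have done:

  • Set up an EKS Fargate cluster and a suitable Fargate profile
  • Set up an EFS file system with an appropriate security group
  • Installed the CSI driver and validated the EFS, following the AWS walkthrough

All good so far.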

I set up the persistent volumes behind the claims (which, as I understand it, must be provisioned statically) with:

kubectl apply -f pvc/

where

tree pvc/
pvc/
├── two_pvc.yml
└── ten_pvc.yml

and

cat pvc/*

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-two
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-ten
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234

then

helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
    --namespace prometheus \
    --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"

What happens?

The Prometheus alertmanager pod finds its PVC fine. So do the other pods for this deployment, but the Prometheus server goes into CrashLoopBackOff with

invalid capacity 0 on filesystem

Diagnostics

kubectl get pv -A
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                  STORAGECLASS   REASON   AGE
efs-pv-ten   8Gi        RWO            Retain           Bound    prometheus/myrelease-helm-02-prometheus-server         efs-sc                  11m
efs-pv-two   2Gi        RWO            Retain           Bound    prometheus/myrelease-helm-02-prometheus-alertmanager   efs-sc                  11m

and

kubectl get pvc -A
NAMESPACE    NAME                                        STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus   myrelease-helm-02-prometheus-alertmanager   Bound    efs-pv-two   2Gi        RWO            efs-sc         12m
prometheus   myrelease-helm-02-prometheus-server         Bound    efs-pv-ten   8Gi        RWO            efs-sc         12m

The pod just shows an error.

Lastly, a colleague got output that suggests a permissions problem.

Other than it appearing to be a permissions issue, I am baffled. I know the storage works and is accessible, since the other pod in the deployment seems happy with it, but not this one.

Solution

Working now - writing it up here for the common good. Thanks to /u/EmiiKhaos on reddit for the suggestions.

Problem:

The EFS share is owned by root:root only, and Prometheus forbids running its pods as root.
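
To see why a non-root pod cannot write here, the sketch below applies a simplified POSIX permission check. It is an illustration only: the function name `can_write` is hypothetical, the directory mode 0755 is an assumption about the share, and supplementary groups, ACLs, and NFS-specific behaviour are ignored.

```python
def can_write(uid, gid, owner_uid, owner_gid, mode):
    """Simplified POSIX check: may (uid, gid) create files in a directory?

    Ignores supplementary groups, ACLs, and capabilities. Root (uid 0) is
    treated as always allowed.
    """
    if uid == 0:
        return True
    if uid == owner_uid:
        bits = (mode >> 6) & 0o7   # owner rwx bits
    elif gid == owner_gid:
        bits = (mode >> 3) & 0o7   # group rwx bits
    else:
        bits = mode & 0o7          # "other" rwx bits
    # Creating a file in a directory needs write (2) and search (1).
    return (bits & 0o3) == 0o3


# Directory owned by root:root, assumed mode 0755.
print("root on root-owned dir:   ", can_write(0, 0, 0, 0, 0o755))
print("uid 500 on root-owned dir:", can_write(500, 500, 0, 0, 0o755))
# The same uid succeeds once the directory is owned by it.
print("uid 500 on its own dir:   ", can_write(500, 500, 500, 500, 0o755))
```

Under these assumptions only root, or the uid that owns the directory, gets write access, which matches the symptom of one non-root process failing to open its data directory.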

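Solution:

  • Create an EFS access point for each pod that needs a persistent volume, so that a specified user is granted access.
  • Specify these access points in the persistent volumes
  • Apply a suitable security context so the pods run as the matching user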

Method:

Create two EFS access points, something like:

{
    "Name": "prometheuserver",
    "AccessPointId": "fsap-<hex01>",
    "FileSystemId": "fs-ec0e1234",
    "PosixUser": {"Uid": 500, "Gid": 500, "SecondaryGids": [2000]},
    "RootDirectory": {
        "Path": "/prometheuserver",
        "CreationInfo": {"OwnerUid": 500, "OwnerGid": 500, "Permissions": "0755"}
    }
},
{
    "Name": "prometheusalertmanager",
    "AccessPointId": "fsap-<hex02>",
    "FileSystemId": "fs-ec0e1234",
    "PosixUser": {"Uid": 501, "Gid": 501},
    "RootDirectory": {
        "Path": "/prometheusalertmanager",
        "CreationInfo": {"OwnerUid": 501, "OwnerGid": 501, "Permissions": "0755"}
    }
}
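
The JSON above looks like the description of existing access points. One way to create them is to build the request body for `aws efs create-access-point --cli-input-json`. The sketch below is a minimal example under that assumption: the helper name `access_point_request` is hypothetical, the Name is sent as a tag, and the uid/gid/path values are the ones from this write-up.

```python
import json


def access_point_request(file_system_id, name, uid, gid, path, permissions="0755"):
    """Build the JSON input for `aws efs create-access-point`.

    The POSIX user enforced on the access point is also made the owner of
    the root directory that EFS creates for it.
    """
    return {
        "FileSystemId": file_system_id,
        "Tags": [{"Key": "Name", "Value": name}],
        "PosixUser": {"Uid": uid, "Gid": gid},
        "RootDirectory": {
            "Path": path,
            "CreationInfo": {
                "OwnerUid": uid,
                "OwnerGid": gid,
                "Permissions": permissions,
            },
        },
    }


payloads = [
    access_point_request("fs-ec0e1234", "prometheuserver", 500, 500, "/prometheuserver"),
    access_point_request("fs-ec0e1234", "prometheusalertmanager", 501, 501,
                         "/prometheusalertmanager"),
]
for payload in payloads:
    print(json.dumps(payload, indent=2))
```

Each printed document could be saved to a file and passed as `--cli-input-json file://<name>.json`. The access point IDs returned by the call are what go into the persistent volumes.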

Update my persistent volumes:

kubectl apply -f pvc/

to something like:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusalertmanager
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex02>
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusserver
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex01>
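
The only change from the original volumes is the `volumeHandle`, which now has the form `<file system id>::<access point id>`. A minimal sketch for generating both manifests from one template follows. The helper name `efs_pv` is hypothetical, and the `fsap-<hex..>` values are the placeholders used above, not real IDs. kubectl accepts JSON as well as YAML, so the output is printed as a JSON `List`.

```python
import json


def efs_pv(name, file_system_id, access_point_id, size, storage_class="efs-sc"):
    """Return a statically provisioned PersistentVolume bound to an access point."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class,
            "csi": {
                "driver": "efs.csi.aws.com",
                # Empty middle field: no sub-path, only the access point.
                "volumeHandle": f"{file_system_id}::{access_point_id}",
            },
        },
    }


manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        efs_pv("prometheusalertmanager", "fs-ec0e1234", "fsap-<hex02>", "2Gi"),
        efs_pv("prometheusserver", "fs-ec0e1234", "fsap-<hex01>", "8Gi"),
    ],
}
print(json.dumps(manifest, indent=2))
```

Once the placeholders are replaced with real access point IDs, the output can be piped into `kubectl apply -f -`.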

Re-install Prometheus as before:

helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
    --namespace prometheus \
    --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"

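Make an educated guess from

kubectl describe pod myrelease-helm-02-prometheus-server -n prometheus

kubectl describe pod myrelease-helm-02-prometheus-alert-manager -n prometheus

as to which container needs to be specified when setting the security context. Then apply a security context to run the pods with the appropriate uid:gid, e.g. with

kubectl apply -f setpermissions/

where

cat setpermissions/*

gives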

apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-alertmanager
spec:
  securityContext:
    runAsUser: 501
    runAsGroup: 501
    fsGroup: 501
  volumes:
    - name: prometheusalertmanager
  containers:
    - name: prometheusalertmanager
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 501
        allowPrivilegeEscalation: false
---
apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-server
spec:
  securityContext:
    runAsUser: 500
    runAsGroup: 500
    fsGroup: 500
  volumes:
    - name: prometheusserver
  containers:
    - name: prometheusserver
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 500
        allowPrivilegeEscalation: false
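
An alternative that may be easier to keep in sync with the release is to pass the same uid:gid through the chart's values instead of separate Pod manifests. The sketch below is an assumption rather than what was done here: the helper `security_values` is hypothetical, and the `<component>.securityContext` key path should be checked against the values.yaml of the chart version in use, since it has differed between versions (notably for alertmanager).

```python
import json


def security_values(uid_by_component):
    """Build a Helm values fragment with one securityContext per component.

    uid_by_component maps a chart component name (assumed key path
    `<component>.securityContext`) to the uid that owns its access point.
    The gid and fsGroup are set to the same number, as in the manifests above.
    """
    values = {}
    for component, uid in uid_by_component.items():
        values[component] = {
            "securityContext": {
                "runAsUser": uid,
                "runAsGroup": uid,
                "fsGroup": uid,
                "runAsNonRoot": True,
            }
        }
    return values


# uid 500 owns the server access point in this write-up.
print(json.dumps(security_values({"server": 500}), indent=2))
```

Helm reads values files as YAML, and JSON is valid YAML, so the output can be saved as `values.json` and added to the `helm upgrade --install` command with `-f values.json`.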
