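How do I resolve a permissions error when trying to run Prometheus on Fargate-only AWS EKS with EFS?
I have an EKS cluster that is Fargate-only. I really don't want to manage instances myself. I'd like to deploy Prometheus to it, which requires a persistent volume. As of two months ago this should be possible with EFS (a managed NFS share). I feel that I am almost there, but I cannot figure out what the current issue is.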
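What I have done:
- Set up an EKS Fargate cluster and a suitable Fargate profile
- Set up an EFS file system with an appropriate security group
- Installed the CSI driver and validated the EFS as per the AWS walkthrough
All good so far.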
I have set up the persistent volume claims (which I understand must be done statically) with:
kubectl apply -f pvc/
where
tree pvc/
pvc/
├── two_pvc.yml
└── ten_pvc.yml
and
cat pvc/*
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-two
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-ten
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234
then
helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
  --namespace prometheus \
  --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"
What happens?
The Prometheus alertmanager comes up fine with its PVC. So do the other pods for this deployment, but the Prometheus server goes into CrashLoopBackOff with
invalid capacity 0 on filesystem
Diagnostics
kubectl get pv -A
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                  STORAGECLASS   REASON   AGE
efs-pv-ten   8Gi        RWO            Retain           Bound    prometheus/myrelease-helm-02-prometheus-server         efs-sc                  11m
efs-pv-two   2Gi        RWO            Retain           Bound    prometheus/myrelease-helm-02-prometheus-alertmanager   efs-sc                  11m
and
kubectl get pvc -A
NAMESPACE    NAME                                        STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus   myrelease-helm-02-prometheus-alertmanager   Bound    efs-pv-two   2Gi        RWO            efs-sc         12m
prometheus   myrelease-helm-02-prometheus-server         Bound    efs-pv-ten   8Gi        RWO            efs-sc         12m
The pod itself just shows 'Error'.
Lastly, this (from a colleague):
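Other than it looking like a permissions problem, I'm baffled. I know the storage works and is accessible, because the other pod in the deployment seems happy with it, but not this one.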
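When a crash-looping pod only reports 'Error', the logs from the container's previous (crashed) run are usually more telling. A minimal sketch, assuming the server container is called prometheus-server (an assumption about the chart; confirm it with kubectl describe pod). The helper name previous_logs is made up for illustration:

```shell
#!/usr/bin/env bash
# Print the logs of the previous (crashed) run of one container in a pod.
# Arguments: namespace, pod name, container name.
previous_logs() {
  local namespace="$1" pod="$2" container="$3"
  kubectl logs "$pod" --namespace "$namespace" --container "$container" --previous
}

# Example (the pod name carries a generated suffix, so look it up first):
# kubectl get pods --namespace prometheus
# previous_logs prometheus <server-pod-name> prometheus-server   # container name is an assumption
```

A "permission denied" on the data directory in that output would point at the volume's ownership rather than at the mount itself.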
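Answer
Working now; writing it up here for the common good. Thanks to /u/EmiiKhaos on reddit for the suggestions.
Problem:
The EFS share is root:root only, and Prometheus forbids running pods as root.
Solution:
- Create an EFS access point for each pod requiring a persistent volume, to permit access for a specified user.
- Specify these access points for the persistent volumes
- Apply a suitable security context to run the pods as the matching user
Method:
Create 2 EFS access points, something like: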
{
  "Name": "prometheuserver",
  "AccessPointId": "fsap-<hex01>",
  "FileSystemId": "fs-ec0e1234",
  "PosixUser": {
    "Uid": 500,
    "Gid": 500,
    "SecondaryGids": [2000]
  },
  "RootDirectory": {
    "Path": "/prometheuserver",
    "CreationInfo": {"OwnerUid": 500, "OwnerGid": 500, "Permissions": "0755"}
  }
},
{
  "Name": "prometheusalertmanager",
  "AccessPointId": "fsap-<hex02>",
  "FileSystemId": "fs-ec0e1234",
  "PosixUser": {
    "Uid": 501,
    "Gid": 501
  },
  "RootDirectory": {
    "Path": "/prometheusalertmanager",
    "CreationInfo": {"OwnerUid": 501, "OwnerGid": 501, "Permissions": "0755"}
  }
}
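One way to create access points like these is the AWS CLI. The following is a sketch under assumptions, not necessarily how the ones above were made: the helper names root_directory_arg and create_access_point are hypothetical, the mode is fixed at 0755 to match the listing, and the secondary group 2000 is left out.

```shell
#!/usr/bin/env bash
# Sketch: create one EFS access point per pod that needs persistent storage.

# Build the --root-directory shorthand. If the directory does not exist yet,
# EFS creates it owned by the given uid:gid with mode 0755.
root_directory_arg() {
  local path="$1" uid="$2" gid="$3"
  printf 'Path=%s,CreationInfo={OwnerUid=%s,OwnerGid=%s,Permissions=0755}' \
    "$path" "$uid" "$gid"
}

# Arguments: file system id, access point name, uid, gid.
# The access point's root directory is "/<name>", as in the listing above.
create_access_point() {
  local fs_id="$1" name="$2" uid="$3" gid="$4"
  aws efs create-access-point \
    --file-system-id "$fs_id" \
    --posix-user "Uid=${uid},Gid=${gid}" \
    --root-directory "$(root_directory_arg "/${name}" "$uid" "$gid")" \
    --tags "Key=Name,Value=${name}"
}

# Example calls (they need AWS credentials, so they are left commented out):
# create_access_point fs-ec0e1234 prometheuserver 500 500
# create_access_point fs-ec0e1234 prometheusalertmanager 501 501
```

Each call returns a description of the new access point, including the AccessPointId that the persistent volumes need.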
Update my persistent volumes:
kubectl apply -f pvc/
to something like:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusalertmanager
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex02>
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheusserver
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-ec0e1234::fsap-<hex01>
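The volumeHandle in these manifests is the file system ID and the access point ID joined by a double colon. A small sketch of how that value could be assembled; the helper names volume_handle and access_point_id are hypothetical, and the lookup assumes the access points carry the Name values shown earlier:

```shell
#!/usr/bin/env bash
# Join a file system id and an access point id into a CSI volumeHandle.
volume_handle() {
  local fs_id="$1" access_point_id="$2"
  printf '%s::%s\n' "$fs_id" "$access_point_id"
}

# Look up an access point id by its Name (needs AWS credentials).
access_point_id() {
  local fs_id="$1" name="$2"
  aws efs describe-access-points \
    --file-system-id "$fs_id" \
    --query "AccessPoints[?Name=='${name}'].AccessPointId" \
    --output text
}

# Example (commented out because it calls AWS):
# volume_handle fs-ec0e1234 "$(access_point_id fs-ec0e1234 prometheuserver)"
```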
Reinstall Prometheus as before:
helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
  --namespace prometheus \
  --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc"
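Take an educated guess from
kubectl describe pod myrelease-helm-02-prometheus-server -n prometheus
and
kubectl describe pod myrelease-helm-02-prometheus-alertmanager -n prometheus
as to which container needs specifying when setting the security context. Then apply a security context to run the pods with the appropriate uid:gid, e.g. with
kubectl apply -f setpermissions/
where
cat setpermissions/*
gives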
apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-alertmanager
spec:
  securityContext:
    runAsUser: 501
    runAsGroup: 501
    fsGroup: 501
  volumes:
    - name: prometheusalertmanager
  containers:
    - name: prometheusalertmanager
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 501
        allowPrivilegeEscalation: false
apiVersion: v1
kind: Pod
metadata:
  name: myrelease-helm-02-prometheus-server
spec:
  securityContext:
    runAsUser: 500
    runAsGroup: 500
    fsGroup: 500
  volumes:
    - name: prometheusserver
  containers:
    - name: prometheusserver
      image: jimmidyson/configmap-reload:v0.4.0
      securityContext:
        runAsUser: 500
        allowPrivilegeEscalation: false
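Another route to the same uid:gid match might be to set the security context through the chart's values at install time. This is only a sketch: it assumes the chart version in use exposes server.securityContext and alertmanager.securityContext (these keys vary between chart versions, so check helm show values prometheus-community/prometheus first), and the helper name security_context_sets is hypothetical.

```shell
#!/usr/bin/env bash
# Build the "--set" pairs that pin one chart component to a uid/gid.
# Arguments: component key in the chart values (e.g. "server"), uid, gid.
security_context_sets() {
  local component="$1" uid="$2" gid="$3"
  printf '%s.securityContext.runAsUser=%s,' "$component" "$uid"
  printf '%s.securityContext.runAsGroup=%s,' "$component" "$gid"
  printf '%s.securityContext.fsGroup=%s\n' "$component" "$gid"
}

# Example (commented out because it changes the running release; the value
# keys are assumptions about the chart):
# helm upgrade --install myrelease-helm-02 prometheus-community/prometheus \
#   --namespace prometheus \
#   --set alertmanager.persistentVolume.storageClass="efs-sc",server.persistentVolume.storageClass="efs-sc" \
#   --set "$(security_context_sets server 500 500)" \
#   --set "$(security_context_sets alertmanager 501 501)"
```

The uid and gid passed here should match the PosixUser of the access point behind each volume.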