
kubelet service cannot reach kube-apiserver over https on port 6443 due to the error net/http: TLS handshake timeout


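I am provisioning a workload cluster with one control plane node and one worker node on top of OpenStack via Cluster API. However, the Kubernetes control plane fails to start properly on the control plane node.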

I can see that kube-apiserver keeps exiting and being recreated:

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ sudo crictl --runtime-endpoint /run/containerd/containerd.sock ps -a
CONTAINER           IMAGE               CREATED              STATE               NAME                      ATTEMPT             POD ID
a729fdd387b0a       90d27391b7808       About a minute ago   Running             kube-apiserver            74                  88de61a0459f6
38b54a71cb0aa       90d27391b7808       3 minutes ago        Exited              kube-apiserver            73                  88de61a0459f6
24573a1c5adc5       b0f1517c1f4bb       18 minutes ago       Running             kube-controller-manager   4                   cc113aaae13b5
a2072b64cca1a       b0f1517c1f4bb       29 minutes ago       Exited              kube-controller-manager   3                   cc113aaae13b5
f26a531972518       d109c0821a2b9       5 hours ago          Running             kube-scheduler            1                   df1d15fd61a8f
a91b4c0ce9e27       303ce5db0e90d       5 hours ago          Running             etcd                      1                   16e1f0f5bb543
1565a1a7dedec       303ce5db0e90d       5 hours ago          Exited              etcd                      0                   16e1f0f5bb543
35ae23eb64f11       d109c0821a2b9       5 hours ago          Exited              kube-scheduler            0                   df1d15fd61a8f
ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$

From the logs of the kube-apiserver container I can see "http: TLS handshake error from 172.24.4.159:50812: EOF":

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ sudo crictl --runtime-endpoint /run/containerd/containerd.sock logs -f a729fdd387b0a
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I0416 20:32:25.730809       1 server.go:596] external host was not specified, using 10.6.0.9
I0416 20:32:25.744220       1 server.go:150] Version: v1.17.3
......
......
I0416 20:33:46.816189       1 dynamic_cafile_content.go:166] Starting request-header::/etc/kubernetes/pki/front-proxy-ca.crt
I0416 20:33:46.816832       1 dynamic_cafile_content.go:166] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt
I0416 20:33:46.833031       1 dynamic_serving_content.go:129] Starting serving-cert::/etc/kubernetes/pki/apiserver.crt::/etc/kubernetes/pki/apiserver.key
I0416 20:33:46.853958       1 secure_serving.go:178] Serving securely on [::]:6443
......
......
I0416 20:33:51.784715       1 log.go:172] http: TLS handshake error from 172.24.4.159:60148: EOF
I0416 20:33:51.786804       1 log.go:172] http: TLS handshake error from 172.24.4.159:60150: EOF
I0416 20:33:51.788984       1 log.go:172] http: TLS handshake error from 172.24.4.159:60158: EOF
I0416 20:33:51.790695       1 log.go:172] http: TLS handshake error from 172.24.4.159:60210: EOF
I0416 20:33:51.792577       1 log.go:172] http: TLS handshake error from 172.24.4.159:60214: EOF
I0416 20:33:51.793861       1 log.go:172] http: TLS handshake error from 172.24.4.159:60202: EOF
I0416 20:33:51.805506       1 log.go:172] http: TLS handshake error from 10.6.0.9:35594: EOF
I0416 20:33:51.806056       1 log.go:172] http: TLS handshake error from 172.24.4.159:60120: EOF
......

From syslog I can see that the apiserver serving cert is signed for the IP 172.24.4.159:

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ grep "apiserver serving cert is signed for DNS names" /var/log/syslog 
Apr 16 15:25:56 ubu1910-medflavor-nolb3-control-plane-nh4hf cloud-init[652]: [certs] apiserver serving cert is signed for DNS names [ubu1910-medflavor-nolb3-control-plane-nh4hf kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.6.0.9 172.24.4.159]

From syslog I can also see that the kubelet service cannot reach the apiserver because of "net/http: TLS handshake timeout":

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ tail -F /var/log/syslog 
Apr 16 19:36:18 ubu1910-medflavor-nolb3-control-plane-nh4hf kubelet[1504]: E0416 19:36:18.596206    1504 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.RuntimeClass: Get https://172.24.4.159:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 16 19:36:19 ubu1910-medflavor-nolb3-control-plane-nh4hf containerd[568]: time="2021-04-16T19:36:19.202346090Z" level=error msg="Failed to load cni configuration" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Apr 16 19:36:19 ubu1910-medflavor-nolb3-control-plane-nh4hf kubelet[1504]: E0416 19:36:19.274089    1504 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Apr 16 19:36:20 ubu1910-medflavor-nolb3-control-plane-nh4hf kubelet[1504]: W0416 19:36:20.600457    1504 status_manager.go:530] Failed to get status for pod "kube-apiserver-ubu1910-medflavor-nolb3-control-plane-nh4hf_kube-system(24ec7abb1b94172adb053cf6fdd1648c)": Get https://172.24.4.159:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-ubu1910-medflavor-nolb3-control-plane-nh4hf: net/http: TLS handshake timeout
Apr 16 19:36:24 ubu1910-medflavor-nolb3-control-plane-nh4hf containerd[568]: time="2021-04-16T19:36:24.336699210Z" level=error msg="Failed to load cni configuration" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Apr 16 19:36:24 ubu1910-medflavor-nolb3-control-plane-nh4hf kubelet[1504]: E0416 19:36:24.379374    1504 controller.go:135] failed to ensure node lease exists, will retry in 7s, error: Get https://172.24.4.159:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ubu1910-medflavor-nolb3-control-plane-nh4hf?timeout=10s: context deadline exceeded
......
......

I also tried to reach the apiserver with curl, and this is what I see:

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ curl http://172.24.4.159:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-ubu1910-medflavor-nolb3-control-plane-nh4hf
Client sent an HTTP request to an HTTPS server.

ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$ curl https://172.24.4.159:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-ubu1910-medflavor-nolb3-control-plane-nh4hf
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
ubuntu@ubu1910-medflavor-nolb3-control-plane-nh4hf:~$

Is there something wrong with the kube-apiserver certificate? Any idea how to continue troubleshooting?

Solution

If you want to see the details of the kube-api SSL certificate, you can use curl -k -v https://172.24.4.159:6443 or openssl s_client -connect 172.24.4.159:6443.
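
As a rough sketch of the inspection suggested above, the two helpers below fetch the certificate a server presents and print its subject, issuer, and validity dates. The function names fetch_server_cert and show_cert_summary are made up for this example; only the openssl subcommands are real.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of any Kubernetes tooling.

# fetch_server_cert HOST:PORT
# Prints the PEM certificate presented by the server. Redirecting stdin from
# /dev/null makes s_client exit once the handshake has finished.
fetch_server_cert() {
  openssl s_client -connect "$1" </dev/null 2>/dev/null | openssl x509
}

# show_cert_summary
# Reads one PEM certificate on stdin and prints subject, issuer and dates.
show_cert_summary() {
  openssl x509 -noout -subject -issuer -dates
}
```

With the address from the question, this would be run as fetch_server_cert 172.24.4.159:6443 | show_cert_summary. Swapping the summary for openssl x509 -noout -text also lists the Subject Alternative Name entries, which should include the IP that kubelet connects to.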

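You didn't mention how you provisioned the certificates. SSL in Kubernetes is a complex beast, and setting up the certificates and all the communication by hand can be very painful. That is why people use kubeadm these days.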

TL;DR: you have to make sure that all certificates are signed by /etc/kubernetes/pki/ca.crt.
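
One way to check that claim for a given certificate is openssl verify. The helper below is only a sketch, and the name verify_signed_by_ca is invented for this example. It matches on the "OK" line that openssl verify prints rather than relying on the exit code, since some older OpenSSL releases may not return a non-zero status on failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# verify_signed_by_ca CA_FILE CERT_FILE
# Returns 0 only if `openssl verify` reports that CERT_FILE chains to CA_FILE.
verify_signed_by_ca() {
  openssl verify -CAfile "$1" "$2" 2>/dev/null | grep -q ': OK$'
}
```

For the paths that appear in the question's logs, a call could look like verify_signed_by_ca /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/apiserver.crt && echo signed-by-ca.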

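Since you mentioned "single node", I assume the kubelet runs on the same server as a systemd unit? How is the kube-api container started? By the kubelet process itself, because you have the pod definition in /etc/kubernetes/manifests?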

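There are in fact two ways of communication between kubelet and kube-api, and both are used at the same time:

  1. kubelet connects and authenticates to kube-api using the information from its parameter --kubeconfig=/etc/kubernetes/kubelet.conf (you can check it with ps -aux | grep kubelet). In that file you will see the connection string, the CA certificate, and the client certificate + key. Kubelet presents the client certificate from the file and validates the kube-api server certificate against the CA from the same file. kube-api validates the client certificate against the CA defined in its own option --client-ca-file.
  2. kube-api connects to kubelet using the options --kubelet-client-certificate and --kubelet-client-key. This is probably not where the problem is.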
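
To find out which files those options actually point to, the values can be read from the static pod manifest. The sketch below assumes the flags are written in --flag=value form, one per line, as kubeadm-generated manifests usually do. The helper name flag_value and the manifest file name kube-apiserver.yaml are assumptions, not something stated in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# flag_value FLAG FILE
# Prints the value of the first --FLAG=value argument found in FILE.
flag_value() {
  grep -o -- "--$1=[^ \"']*" "$2" | head -n 1 | cut -d= -f2-
}
```

For example, flag_value client-ca-file /etc/kubernetes/manifests/kube-apiserver.yaml would print the CA that kube-api uses to validate client certificates, and the same call with kubelet-client-certificate or kubelet-client-key covers the second direction.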

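Because you can see the SSL error on the kube-api side and not on the kubelet side, I think the problem is in the communication described in point 1: kubelet connects to kube-api and authenticates against it. The error is in the kube-api log, so I would say kube-api has a problem validating the kubelet client certificate. So check the certificate inside the file given by --kubeconfig=/etc/kubernetes/kubelet.conf. You can base64-decode it and display its details with openssl or with some online SSL certificate checker. The most important part is that it must be signed by the CA file defined in the kube-api option --client-ca-file.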
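
The base64 decoding step could look roughly like the sketch below. It assumes the kubeconfig embeds the certificate as a client-certificate-data field on a single line. Some setups reference a file through client-certificate instead, in which case the helper prints nothing and the referenced file can be inspected directly. The name kubeconfig_client_cert is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# kubeconfig_client_cert FILE
# Prints the PEM client certificate embedded in a kubeconfig as base64
# `client-certificate-data`. Prints nothing if the field is absent.
kubeconfig_client_cert() {
  awk '$1 == "client-certificate-data:" { print $2 }' "$1" | base64 -d
}
```

A possible follow-up, with the paths taken from the text above:

    kubeconfig_client_cert /etc/kubernetes/kubelet.conf > /tmp/kubelet-client.crt
    openssl x509 -noout -subject -issuer -dates -in /tmp/kubelet-client.crt
    openssl verify -CAfile /etc/kubernetes/pki/ca.crt /tmp/kubelet-client.crt

The -CAfile argument should be whatever --client-ca-file is set to on the kube-api side; /etc/kubernetes/pki/ca.crt is the value shown in the question's apiserver log.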

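All of this takes a lot of effort and, honestly, the easiest way to go is to throw everything away and use kubeadm to bootstrap a single-node cluster:

  1. Clean up your server
  2. Install kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
  3. Create the cluster with kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/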
