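kubeadm creates a set of internal certificates when it initializes a cluster.
The CA root certificates are valid for 10 years by default, while the ordinary server certificates are valid for 1 year,
so expired certificates are something everyone runs into sooner or later.
The design is meant to encourage timely cluster upgrades: when kubeadm upgrades a cluster,
the server certificates are renewed automatically as part of the upgrade.
In practice, though, many intranet k8s clusters are set up once and then left without upgrades for years.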
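Common workarounds
- Deploy from pure binaries and set a very long validity period when generating the certificates.
- Deploy with kubeadm and renew the certificates every few months, which is easy to forget.
Long-lived certificate support
Starting with v1.31, kubeadm can also issue long-lived certificates. The default is still 1 year, but a longer validity period can be set at initialization or renewal time.
Example: renewing a cluster to long-lived certificates
Check the current certificate information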
# Check the version
wait@ub05:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
h21 Ready <none> 86d v1.33.3
h22 Ready control-plane 109d v1.33.3
h23 Ready <none> 109d v1.33.3
# Check certificate expiration
# The root certificates have 9 years left; the ordinary certificates have 255 days left
root@h22:~# kubeadm certs check-expiration
[check-expiration] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[check-expiration] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W1112 06:33:53.477114 1594017 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" not found
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Jul 25, 2026 17:13 UTC 255d ca no
apiserver Jul 25, 2026 17:13 UTC 255d ca no
apiserver-kubelet-client Jul 25, 2026 17:13 UTC 255d ca no
controller-manager.conf Jul 25, 2026 17:13 UTC 255d ca no
front-proxy-client Jul 25, 2026 17:13 UTC 255d front-proxy-ca no
scheduler.conf Jul 25, 2026 17:13 UTC 255d ca no
super-admin.conf Jul 25, 2026 17:13 UTC 255d ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Jul 23, 2035 17:13 UTC 9y no
front-proxy-ca Jul 23, 2035 17:13 UTC 9y no
Upgrading the master node
Prepare a certificate configuration file
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
certificateValidityPeriod: 43800h # validity of newly issued certificates, set to 5 years here
caCertificateValidityPeriod: 87600h
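The validity periods in the configuration are written in hours, which is easy to get wrong by hand. The following is a minimal sketch of the conversion; `years_to_hours` is a hypothetical helper, not part of kubeadm, and it assumes 365-day years, which is how 5 years becomes 43800h.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a kubeadm command): convert whole years into the
# hour-based duration string used by certificateValidityPeriod.
# Assumption: a year is counted as 365 days.
years_to_hours() {
  years="$1"
  echo "$(( years * 365 * 24 ))h"
}

years_to_hours 5    # prints 43800h
years_to_hours 10   # prints 87600h
```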
Run the certificate renewal, then restart the control plane components
kubeadm certs renew all --config kubeadm-cert.yaml
Check the certificate expiration again; the renewed server certificates are now all valid for 5 years
root@h22:~# kubeadm certs check-expiration
[check-expiration] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[check-expiration] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W1112 06:38:17.906072 1594209 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" not found
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Nov 11, 2030 06:36 UTC 4y364d ca no
apiserver Nov 11, 2030 06:36 UTC 4y364d ca no
apiserver-kubelet-client Nov 11, 2030 06:36 UTC 4y364d ca no
controller-manager.conf Nov 11, 2030 06:36 UTC 4y364d ca no
front-proxy-client Nov 11, 2030 06:36 UTC 4y364d front-proxy-ca no
scheduler.conf Nov 11, 2030 06:36 UTC 4y364d ca no
super-admin.conf Nov 11, 2030 06:36 UTC 4y364d ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Jul 23, 2035 17:13 UTC 9y no
front-proxy-ca Jul 23, 2035 17:13 UTC 9y no
Restart the control plane static pods
crictl pods --namespace kube-system --name 'kube-scheduler-*|kube-controller-manager-*|kube-apiserver-*|etcd-*' -q | xargs crictl rmp -f
root@h22:~# crictl pods --namespace kube-system
POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME
218bdea9630ba 12 seconds ago Ready kube-apiserver-h22 kube-system 0 (default)
1fbdd48b11317 12 seconds ago Ready kube-controller-manager-h22 kube-system 0 (default)
2e4a6e66cd61a 12 seconds ago Ready kube-scheduler-h22 kube-system 0 (default)
c522cb6428a6f 3 months ago Ready cilium-operator-6745cd9b78-z8ssj kube-system 0 (default)
d8f8a581dd657 3 months ago Ready cilium-envoy-8jvxh kube-system 0 (default)
ed13af2891c7e 3 months ago Ready cilium-2hh5n
systemctl restart kubelet
# Check the certificate validity
root@h22:~# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep Not
Not Before: Nov 12 06:31:43 2025 GMT
Not After : Nov 11 06:36:43 2030 GMT
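To turn the two `openssl` dates into a number that is easy to compare against the configured period, a small helper can do the subtraction. This is a sketch only: `days_between` is a hypothetical name, and it relies on GNU `date -d` being able to parse the timestamp format that `openssl x509` prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole days between two timestamps in the format
# printed by `openssl x509 -dates`. Assumes GNU date (for the -d option).
days_between() {
  from_epoch=$(date -u -d "$1" +%s)
  to_epoch=$(date -u -d "$2" +%s)
  echo $(( (to_epoch - from_epoch) / 86400 ))
}

# Validity window of the renewed apiserver certificate shown above.
days_between "Nov 12 06:31:43 2025 GMT" "Nov 11 06:36:43 2030 GMT"   # prints 1825
```

1825 days is exactly 43800 hours, matching the configured `certificateValidityPeriod`. Passing `now` as the first argument and the output of `openssl x509 -noout -enddate` (with the `notAfter=` prefix cut off) as the second should give the remaining days on a live node.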
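Worker nodes
Worker nodes only hold the CA root certificate, /etc/kubernetes/pki/ca.crt, which does not need to be renewed.
Automatic rotation of the kubelet client certificate
The default validity is 1 year; kubelet periodically requests rotation through the CSR API on its own.
This requires the --rotate-certificates=true and --rotate-kubelet-client-certificate=true flags,
or rotateCertificates: true in the kubelet configuration.
When about 30% of the certificate's lifetime remains, kubelet automatically generates a new private key and submits a CSR;
once the Controller Manager approves it, the new certificate is issued without manual intervention.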
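As a rough illustration of when that rotation would start, the date at which a given percentage of the lifetime remains can be estimated from the certificate's two timestamps. This is a sketch under assumptions: `rotation_date` is a hypothetical helper, the 30% figure is taken from the text above, GNU `date` is required, and the real kubelet adds some randomness to its deadline, so the result is only an approximation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the UTC date on which a certificate has
# remaining_pct percent of its lifetime left.
# Arguments: notBefore, notAfter (openssl date format), remaining_pct.
rotation_date() {
  start=$(date -u -d "$1" +%s)
  end=$(date -u -d "$2" +%s)
  lifetime=$(( end - start ))
  elapsed=$(( lifetime * (100 - $3) / 100 ))
  date -u -d "@$(( start + elapsed ))" +%Y-%m-%d
}

# A one-year kubelet client certificate issued on Jul 25, 2025.
rotation_date "Jul 25 17:50:07 2025 GMT" "Jul 25 17:50:07 2026 GMT" 30   # prints 2026-04-07
```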
Here we try renewing manually ahead of time
# Check the certificate validity
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates
notBefore=Jul 25 17:50:07 2025 GMT
notAfter=Jul 25 17:50:07 2026 GMT # not yet due, so automatic rotation is not triggered
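The advice commonly found online is to delete the old certificate and restart kubelet, which is then supposed to be issued a new one automatically. In my test this did not work:
a bootstrap-kubeconfig file also has to be prepared.
Delete the old kubelet certificate files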
rm -f /var/lib/kubelet/pki/kubelet-client*
systemctl restart kubelet
# Watch for certificate issuance
# If a new CSR marked Approved/Issued appears within 1-2 minutes, automatic rotation is working
kubectl get csr -w
At this point the worker node logs errors saying that neither the old certificate nor a bootstrap-kubeconfig can be found
Nov 12 06:59:28 h23 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 21.
Nov 12 06:59:28 h23 systemd[1]: Started kubelet.service - Kubernetes Kubelet.
Nov 12 06:59:28 h23 kubelet[1733209]: I1112 06:59:28.696569 1733209 server.go:530] "Kubelet version" kubeletVersion="v1.33.3"
Nov 12 06:59:28 h23 kubelet[1733209]: I1112 06:59:28.696614 1733209 server.go:532] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Nov 12 06:59:28 h23 kubelet[1733209]: I1112 06:59:28.696752 1733209 server.go:956] "Client rotation is on, will bootstrap in background"
Nov 12 06:59:28 h23 kubelet[1733209]: E1112 06:59:28.697022 1733209 bootstrap.go:241] "Unhandled Error" err="unable to read existing bootstrap client config from /etc/kubernetes/kubelet.conf: invalid configuration: [unable to read client-cert /var/lib/kubelet/pki/kubelet-client-current.pem for default-auth due to open /var/lib/kubelet/pki/kubelet-client-current.pem: no such file or directory, unable to read client-key /var/lib/kubelet/pki/kubelet-client-current.pem for default-auth due to open /var/lib/kubelet/pki/kubelet-client-current.pem: no such file or directory]" logger="UnhandledError"
Nov 12 06:59:28 h23 kubelet[1733209]: E1112 06:59:28.698145 1733209 run.go:72] "command failed" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
Nov 12 06:59:28 h23 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 06:59:28 h23 systemd[1]: kubelet.service: Failed with result 'exit-code'.
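The log shows kubelet falling back to /etc/kubernetes/bootstrap-kubelet.conf once the client certificate is gone. A guard along these lines would have avoided the crash loop. It is a sketch only: `backup_kubelet_client_certs` and the backup directory are hypothetical, and it moves the files aside instead of deleting them so they can be restored.

```shell
#!/usr/bin/env bash
# Hypothetical guard: only move the kubelet client certificate files out of
# the way when a bootstrap kubeconfig exists for kubelet to fall back on.
# Arguments: pki directory, bootstrap kubeconfig path, backup directory.
backup_kubelet_client_certs() {
  pki_dir="$1"
  bootstrap_conf="$2"
  backup_dir="$3"
  if [ ! -f "$bootstrap_conf" ]; then
    echo "missing $bootstrap_conf, refusing to move certificates" >&2
    return 1
  fi
  mkdir -p "$backup_dir"
  # Move rather than delete so the old certificate can be put back.
  mv "$pki_dir"/kubelet-client* "$backup_dir"/
}
```

On a worker it could be called as `backup_kubelet_client_certs /var/lib/kubelet/pki /etc/kubernetes/bootstrap-kubelet.conf /root/kubelet-pki-backup`, where the last path is an arbitrary choice.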
So simply deleting the certificate files does not work; a temporary workaround was used to recover the node
root@h22:~# kubeadm token create --ttl=24h --print-join-command
kubeadm join 192.168.5.22:6443 --token wo2zzp.su66fv01zssxafx9 \
--discovery-token-ca-cert-hash sha256:6e104c8b9b24ee8a34157dcbb7f83d484a5e46ef7a46a57832b5a6a5f6054c5a
# Generate bootstrap-kubelet.conf on the failed worker
#!/usr/bin/env bash
set -e
API_SERVER="https://192.168.5.22:6443" # replace with your VIP or LB address
TOKEN="wo2zzp.su66fv01zssxafx9" # token from the previous step
CA_HASH="sha256:6e104c8b9b24ee8a34157dcbb7f83d484a5e46ef7a46a57832b5a6a5f6054c5a" # hash from the previous step (not used below; the CA file is embedded instead)
# Generate the temporary kubeconfig file
kubectl config --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
  set-cluster kubernetes \
  --server="$API_SERVER" \
  --certificate-authority=/etc/kubernetes/pki/ca.crt \
  --embed-certs=true
kubectl config --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
  set-credentials kubelet-bootstrap \
  --token="$TOKEN"
kubectl config --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
  set-context kubelet-bootstrap@kubernetes \
  --cluster=kubernetes \
  --user=kubelet-bootstrap
kubectl config --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
  use-context kubelet-bootstrap@kubernetes
chmod 600 /etc/kubernetes/bootstrap-kubelet.conf
systemctl restart kubelet
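Because the token and hash are pasted by hand from the join command, a quick format check before writing the kubeconfig can catch copy errors. The sketch below uses two hypothetical helpers; it only checks the shape of the values (a bootstrap token is 6 characters, a dot, then 16 characters; the hash is `sha256:` followed by 64 hex digits), not whether the API server accepts them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: shape checks for values copied from
# `kubeadm token create --print-join-command`.

# Succeeds if the argument looks like a bootstrap token.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Succeeds if the argument looks like a discovery CA certificate hash.
is_ca_cert_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

if is_bootstrap_token "wo2zzp.su66fv01zssxafx9"; then
  echo "token format ok"
fi
```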
The resulting bootstrap-kubelet.conf file
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: xxxxx
    server: https://192.168.5.22:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet-bootstrap
  name: kubelet-bootstrap@kubernetes
current-context: kubelet-bootstrap@kubernetes
kind: Config
preferences: {}
users:
- name: kubelet-bootstrap
  user:
    token: wo2zzp.su66fv01zssxafx9
The new certificate has now been approved and issued automatically
wait@ub05:~$ kubectl get csr -w
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-2gmzv 24s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:wo2zzp <none> Approved,Issued
wait@ub05:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
h21 Ready <none> 86d v1.33.3
h22 Ready control-plane 109d v1.33.3
h23 Ready <none> 109d v1.33.3
# Check the validity of the new certificate
# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates
notBefore=Nov 12 06:59:59 2025 GMT
notAfter=Nov 12 06:59:59 2026 GMT
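When several workers are recovered this way, reading the CSR table by eye gets tedious. A small filter over the `kubectl get csr` output can do the check; this is a sketch with a hypothetical function name, and it assumes the default column layout shown above (signer name in the third column, condition in the last).

```shell
#!/usr/bin/env bash
# Hypothetical check: reads `kubectl get csr` table output on stdin and
# succeeds only if at least one kubelet client CSR is listed and every
# such CSR has the condition Approved,Issued.
all_kubelet_csrs_issued() {
  awk '
    $3 == "kubernetes.io/kube-apiserver-client-kubelet" {
      seen = 1
      if ($NF != "Approved,Issued") bad = 1
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}
```

It could be used as `kubectl get csr | all_kubelet_csrs_issued && echo "all issued"`.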