KB: Fixing a Kubernetes Cluster After Expired Certificates

Fixing Kubernetes API Server and Kubelet Issues After Expired Certificates

  1. Backup Existing Kubernetes Configuration

    • Create a backup of Kubernetes certificates and configuration files
    • Copy /etc/kubernetes/pki/ and /etc/kubernetes/*.conf to a backup directory
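The backup step could be scripted roughly as follows. This is a minimal sketch, not a definitive procedure: `backup_k8s_config` is a hypothetical helper name, and the default destination under /root is an assumption; only the source paths come from the step above.

```shell
# Sketch: back up kubeadm certificates and kubeconfig files before renewing anything.
# backup_k8s_config is a hypothetical helper; the default destination is an assumption.
backup_k8s_config() {
  local src="${1:-/etc/kubernetes}"
  local dest="${2:-/root/k8s-backup-$(date +%Y%m%d-%H%M%S)}"

  mkdir -p "$dest" || return 1
  # -a preserves ownership, permissions, and timestamps on keys and certificates
  cp -a "$src/pki" "$dest/" || return 1
  # copy the kubeconfig files that sit next to the pki directory
  cp -a "$src"/*.conf "$dest/" || return 1
  # print the backup location so the caller can record it
  echo "$dest"
}
```

Calling `backup_k8s_config` with no arguments as root uses the defaults; passing two paths overrides the source and destination.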
  2. Renew Expired Certificates

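    • Check which certificates have expired using kubeadm certs check-expiration
    • Renew all kubeadm-managed certificates with kubeadm certs renew all
    • Restart the kubelet and the container runtime using systemctl restart kubelet and systemctl restart containerd (or systemctl restart docker, depending on the runtime in use)
    • Note that kube-apiserver, kube-controller-manager, kube-scheduler, and etcd only pick up the renewed certificates once their containers have restarted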
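The renewal sequence above might be wrapped in one function, as in the sketch below. `renew_k8s_certs` is a hypothetical name, and the runtime argument defaulting to containerd is an assumption; the commands themselves are the ones listed in this step.

```shell
# Sketch: renew kubeadm-managed certificates, then restart the services listed in this step.
# renew_k8s_certs is a hypothetical helper; pass "docker" if that is the runtime in use.
renew_k8s_certs() {
  local runtime="${1:-containerd}"

  kubeadm certs check-expiration             # show the current expiry dates
  kubeadm certs renew all || return 1        # stop here if renewal fails
  systemctl restart kubelet || return 1
  systemctl restart "$runtime" || return 1
  kubeadm certs check-expiration             # confirm the new expiry dates
}
```

Run it as root on the affected control plane node, for example `renew_k8s_certs` or `renew_k8s_certs docker`.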
  3. Regenerate bootstrap-kubelet.conf if Missing

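    • A kubelet error about a missing bootstrap-kubelet.conf typically means the client certificate in /etc/kubernetes/kubelet.conf is expired or invalid, so the kubelet falls back to the bootstrap file
    • Run kubeadm init phase kubelet-start to rewrite the kubelet configuration and start the kubelet again
    • If the kubelet still cannot authenticate, a stopgap is to copy admin.conf to kubelet.conf; this gives the kubelet cluster-admin credentials, so treat it as temporary and replace it with a properly generated kubelet.conf afterwards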
    • Restart kubelet using systemctl restart kubelet
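Before regenerating or copying anything, it can help to see which kubeconfig files are actually absent. The sketch below is one way to check; `missing_kubeconfigs` is a hypothetical helper, and the file list is an assumption based on a typical kubeadm control plane node, so adjust it as needed.

```shell
# Sketch: print the kubeconfig files that are missing from a kubeadm node.
# missing_kubeconfigs is a hypothetical helper; the file list below is an assumption.
missing_kubeconfigs() {
  local dir="${1:-/etc/kubernetes}"
  local f

  for f in admin.conf kubelet.conf controller-manager.conf scheduler.conf; do
    # print only the names that do not exist as regular files
    [ -f "$dir/$f" ] || echo "$f"
  done
}
```

An empty result from `missing_kubeconfigs` suggests the files exist and the problem lies in their contents, for example an expired embedded client certificate.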
  4. Fix Unauthorized Errors

    • Ensure kubectl is using the correct kubeconfig
    • Copy /etc/kubernetes/admin.conf to ~/.kube/config and set proper ownership and permissions
    • Check kubelet logs for authentication errors with journalctl -u kubelet --no-pager | tail -50
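The kubeconfig copy in this step could look like the following sketch. `install_kubeconfig` is a hypothetical helper, and the use of sudo assumes admin.conf is readable only by root; mode 600 is one reasonable reading of "proper permissions".

```shell
# Sketch: install admin.conf as the current user's kubeconfig, readable by the owner only.
# install_kubeconfig is a hypothetical helper; sudo is assumed because admin.conf is root-owned.
install_kubeconfig() {
  local src="${1:-/etc/kubernetes/admin.conf}"
  local dest="${2:-$HOME/.kube/config}"

  mkdir -p "$(dirname "$dest")" || return 1
  sudo cp "$src" "$dest" || return 1
  # hand the copy back to the invoking user
  sudo chown "$(id -u):$(id -g)" "$dest" || return 1
  chmod 600 "$dest"
}
```

After `install_kubeconfig`, a quick `kubectl get nodes` shows whether the Unauthorized error is gone.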
  5. Use crictl to Check and Troubleshoot Running Containers

    • Check running containers: crictl ps
    • Check all containers including stopped ones: crictl ps -a
    • View logs of a specific container: crictl logs <container-id>
    • Inspect a container: crictl inspect <container-id>
    • If the API server container is missing, restart kubelet and check logs
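The crictl checks above can be combined to pull recent API server logs in one call. This is a sketch under assumptions: `apiserver_logs` is a hypothetical helper, it assumes the container name contains kube-apiserver, and it simply takes the first container ID that crictl lists.

```shell
# Sketch: find the kube-apiserver container and print the tail of its logs.
# apiserver_logs is a hypothetical helper; it uses the first ID that crictl returns.
apiserver_logs() {
  local lines="${1:-50}"
  local id

  # -a includes exited containers, --name filters by container name, -q prints IDs only
  id=$(crictl ps -a --name kube-apiserver -q | head -n 1)
  if [ -z "$id" ]; then
    echo "no kube-apiserver container found; restart kubelet and check its logs" >&2
    return 1
  fi
  crictl logs --tail "$lines" "$id"
}
```

For example, `apiserver_logs 100` prints the last 100 log lines, where certificate errors usually appear if the API server is crash-looping.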
  6. Rejoin the Master Node if Needed

    • On a control plane node with a working API server, create a new bootstrap token using kubeadm token create --print-join-command
    • Run the generated kubeadm join command on the node that needs to reconnect
    • Restart kubelet using systemctl restart kubelet
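Token creation might be wrapped as below so that the bootstrap token is short-lived. `new_join_command` is a hypothetical helper and the 1h default TTL is an assumption; the printed command as-is targets a worker join, and joining as an additional control plane node requires further flags that this sketch does not cover.

```shell
# Sketch: print a kubeadm join command backed by a short-lived bootstrap token.
# new_join_command is a hypothetical helper; the 1h default TTL is an assumption.
new_join_command() {
  local ttl="${1:-1h}"

  # --ttl limits how long the token stays valid; --print-join-command prints the full join line
  kubeadm token create --ttl "$ttl" --print-join-command
}
```

Run `new_join_command` on the healthy control plane node, then paste its output on the node being rejoined.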
  7. Final Checks

    • Verify that all Kubernetes nodes report Ready using kubectl get nodes
    • Check API server status with kubectl get pods -n kube-system
    • Review kubelet logs with journalctl -u kubelet --no-pager | tail -50
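    • Check control plane component health using kubectl get componentstatuses (deprecated since Kubernetes v1.19; on newer clusters, kubectl get --raw='/readyz?verbose' reports API server health instead)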
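The final checks can be partly automated, as in this sketch. Both function names are hypothetical, and `not_ready_nodes` assumes the default column layout of kubectl get nodes (NAME first, STATUS second). Because componentstatuses is deprecated since Kubernetes v1.19, the second helper queries the API server's readyz endpoint instead.

```shell
# Sketch: list nodes whose STATUS column does not start with "Ready".
# not_ready_nodes is a hypothetical helper; it assumes the default kubectl column order.
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}

# Sketch: ask the API server for its per-check readiness report.
# apiserver_readyz is a hypothetical helper.
apiserver_readyz() {
  kubectl get --raw='/readyz?verbose'
}
```

An empty result from `not_ready_nodes` means every node reports Ready; cordoned nodes shown as Ready,SchedulingDisabled are treated as Ready here.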

This process helps Kubernetes control plane (master) nodes recover from expired certificates, missing kubelet configuration, and the API server failures that result from them.
