Wednesday, October 04, 2017

upgrading kubernetes - container pods stuck in state 'unknown'

I deleted an old pod that had been stuck in our cluster without explanation, and it went into state 'Unknown'. Getting logs from the Node.js apps became impossible, and 'kubectl exec' simply hung the SSH session. I remembered seeing errors like these (pods refusing to be deleted) back when GKE was waiting for a Kubernetes upgrade, so I upgraded the cluster and the issue was resolved.
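If you want to confirm you are hitting the same situation before upgrading, a quick diagnosis looks roughly like this (the pod name below is a placeholder):

# check pod status and which node each pod is scheduled on
kubectl get pods -o wide
# inspect recent events for the stuck pod (name is a placeholder)
kubectl describe pod my-app-pod-12345
# check whether the node hosting the pod is NotReady
kubectl get nodes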
# temporarily allow access from 0.0.0.0/0 (anywhere) to the protected services the pods connect to (node external IPs will change during the upgrade)
# check cluster version
gcloud container clusters list
# switch to the specific project
gcloud config set project my-project
gcloud container clusters get-credentials my-project-cluster --zone us-east1-b --project my-project
# check available versions
gcloud container get-server-config
# upgrade the cluster master. Note that the master can only go up one minor version at a time, for example 1.5.7 must first be upgraded to 1.6.7 before it can go to 1.7.2
gcloud container clusters upgrade my-project-cluster --master --cluster-version=1.7.6-gke.1
# upgrade cluster nodes
gcloud container clusters upgrade my-project-cluster --cluster-version=1.7.6
# list instances external IPs
gcloud compute instances list
# remove access for the old node external IPs, add access for the new external IPs using /32 CIDRs
# remove temporary access from 0.0.0.0/0 (anywhere)
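To finish up, here is a minimal sketch of how the verification and re-scoping might look, assuming the protected service is a Cloud SQL instance with an IP allowlist; the instance name and IPs are placeholders:

# confirm master and node versions after the upgrade
kubectl get nodes
gcloud container clusters describe my-project-cluster --zone us-east1-b
# confirm the previously stuck pods are back to Running
kubectl get pods --all-namespaces
# re-scope access, e.g. for a Cloud SQL instance (instance name and IPs are placeholders)
gcloud sql instances patch my-sql-instance --authorized-networks=203.0.113.10/32,203.0.113.11/32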
