As an agile open source project, Kubernetes continues to evolve, as does the cloud computing landscape. Keeping up with the latest versions isn’t practical for many organizations, and there are good reasons not to keep up with the very latest version, particularly in the first few weeks after a release. Nevertheless, it’s not a great idea to get too far behind, not only because you may miss out on important security, compatibility, and performance updates, but also because support for older versions ends.
If you’re using Amazon Elastic Kubernetes Service (EKS), for example, each minor version has a defined lifecycle: a standard support period followed, if enabled, by extended support at an additional cost. Always verify your current version’s dates in the EKS Kubernetes version lifecycle documentation.
EKS is a managed Kubernetes service from Amazon that many organizations use to deploy, manage, and scale containerized applications. This guide walks through the steps you’ll need to take to upgrade your EKS clusters. It includes guidance on when and how to complete these upgrades as well as tools that can make it easier for you to upgrade safely and securely.
The Kubernetes community follows an approximate N-2 support policy, which means that they provide security fixes and bug patches for the three most recent minor versions. They release new minor versions roughly 3 times a year, and in Amazon EKS, each Kubernetes minor version is under standard support for 14 months after release. After that, the minor release enters extended support for the next 12 months at an additional cost per cluster hour. This gives you a total of 26 months of support per minor version.
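To make the arithmetic concrete, here is a minimal sketch that adds the 14-month standard window and the 26-month total window to a release month. The helper name `add_months` and the sample release month are illustrative assumptions, not EKS data; take real dates from the EKS lifecycle documentation.

```shell
# add_months YYYY-MM N: print the month that falls N months after YYYY-MM.
# Hypothetical helper for illustration; it is not part of any AWS tooling.
add_months() {
  am_year=${1%-*}
  am_month=${1#*-}
  am_month=${am_month#0}            # strip a leading zero so "08" is not read as octal
  am_total=$(( am_year * 12 + am_month - 1 + $2 ))
  printf '%04d-%02d\n' $(( am_total / 12 )) $(( am_total % 12 + 1 ))
}

release="2024-04"                   # placeholder release month, not a real EKS date
echo "standard support ends: $(add_months "$release" 14)"
echo "extended support ends: $(add_months "$release" 26)"
```

Working at month granularity keeps the sketch portable across shells; for planning purposes, the exact day matters less than knowing which quarter a version drops out of standard support.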
Once the extended support period ends, your EKS control plane is automatically upgraded to the earliest Kubernetes version that’s still supported, while your node groups and add-ons remain on their existing versions until you update them. This scenario is far from ideal, because it gives you little control over the timing of the upgrade and you’ll still need to coordinate node group and add-on upgrades soon after.
Before each upgrade cycle, review the EKS Kubernetes version lifecycle documentation to see which versions are in standard support, which are in extended support, and when support ends for each version.
Most organizations should expect to assess new Kubernetes releases regularly, and many teams run different versions in different environments. For example, you may test out a new version in your development environment for at least a week or two, and follow that process for test and staging environments. Before pushing a new version to production, make sure you have at least a week of data from staging, so you know you won’t run into unexpected snags on go-live.
An EKS cluster has a control plane and a data plane; make sure that both are running the same Kubernetes minor version whenever possible. Kubernetes allows some skew between versions, but the supported amount varies by Kubernetes component and by cluster development tool. Amazon EKS also has its own limits on version skew between the control plane and managed node groups or Fargate nodes, so review the EKS upgrade best practices before planning your rollout.
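As a rough illustration of a skew check, the sketch below compares the minor version of the control plane with that of a node. The function names `minor_of` and `skew_ok` are hypothetical, and the maximum skew is passed in as a parameter rather than hard-coded, because the permitted value depends on your Kubernetes version and on current EKS guidance.

```shell
# minor_of VERSION: print the minor number of a version such as "1.30"
# or "v1.29.3-eks-1a2b3c".
minor_of() {
  mo_v=${1#v}                        # drop an optional leading "v"
  mo_v=${mo_v#*.}                    # drop the major version and the first dot
  mo_v=${mo_v%%.*}                   # drop the patch version and anything after it
  printf '%s\n' "${mo_v%%[!0-9]*}"   # drop trailing non-digits such as "+"
}

# skew_ok CONTROL_PLANE NODE MAX_SKEW: succeed when the node is no newer than
# the control plane and at most MAX_SKEW minor versions behind it.
skew_ok() {
  so_diff=$(( $(minor_of "$1") - $(minor_of "$2") ))
  [ "$so_diff" -ge 0 ] && [ "$so_diff" -le "$3" ]
}

# Sample values; in practice they could come from
#   aws eks describe-cluster --name "$CLUSTER" --query cluster.version --output text
#   kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.kubeletVersion}'
if skew_ok "1.30" "v1.29.3-eks-1a2b3c" 1; then
  echo "skew within limit"
else
  echo "skew too large"
fi
```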
For upgrade purposes, the correct order is development first, then staging, then production. However, we recommend that your dev/stage/test clusters all look as close to production as possible for typical day-to-day operations so upgrade tests accurately reflect production behavior.
You'll want to upgrade your development environment first. This ensures you are keeping up with the latest K8s updates. If you encounter critical issues with the latest version, you can identify problems quickly and find solutions before pushing the latest EKS version to staging.
The next environment to upgrade is often your staging environment. This is where any remaining issues that haven't been fixed in the development environment should be caught. This is the last step before production, so it's often best to allow a "soak time" for changes here — at Fairwinds, that’s typically one to two weeks.
The goal is to keep your staging version aligned as closely to production as possible. This makes your developers’ lives easier because they don’t need to worry about maintaining code for too many versions. After the agreed upon "soak time," there should be very little risk in upgrading the production environment, so upgrade it promptly. Don't fall into the trap of not completing the upgrade cycle because you're worried about moving it to production.
Some practitioners recommend not installing the latest minor version until at least patch .2. In other words, they might suggest waiting to install the latest Kubernetes version, such as 1.30.0, until 1.30.2 is available, especially for mission critical workloads. From there, you can begin the upgrade process, moving from dev to staging and then to production as usual.
This recommendation stems from years of experience — by the .2 version, extensive testing is complete and many major issues have already been discovered and resolved. Often, once you have completed the dev upgrade and rolled it out to staging, the .3 release is available. Treat this as a rule of thumb rather than a hard rule, and balance it against the 14‑month standard‑support window and 26‑month total lifecycle for each EKS Kubernetes version.
EKS customers are responsible for initiating upgrades for both the cluster control plane and the data plane. Once you start a control plane upgrade, AWS carries it out; the data plane remains your responsibility, including managed node groups, Fargate pods, self‑managed node groups, and add‑ons.
EKS supports in-place cluster upgrades, which preserve resource and configuration consistency. They minimize user disruption and retain information about existing workloads and resources. You can only upgrade one minor version at a time.
If you need to make multiple version updates, you’ll have to do sequential upgrades. This approach can increase the risk of downtime, so plan your upgrade windows early within each version’s lifecycle. Consider evaluating a blue/green cluster upgrade strategy in this case, where one environment (blue) runs the current Kubernetes version and another environment (green) runs the new Kubernetes version.
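Because each hop is a separate upgrade, it can help to list the hops before you schedule anything. The sketch below prints the sequence of minor versions between a current and a target version, along with the `aws eks update-cluster-version` command for each. The helper `upgrade_path`, the cluster name, and the version numbers are placeholders for illustration.

```shell
# upgrade_path CURRENT TARGET: print each minor version to step through, one per line.
# Assumes both versions use major version 1 and look like "1.27".
upgrade_path() {
  up_minor=${1#1.}
  up_target=${2#1.}
  while [ "$up_minor" -lt "$up_target" ]; do
    up_minor=$(( up_minor + 1 ))
    echo "1.$up_minor"
  done
}

# Print the command for each hop rather than running it. CLUSTER is a placeholder.
CLUSTER="my-cluster"
for version in $(upgrade_path 1.27 1.30); do
  echo "aws eks update-cluster-version --name $CLUSTER --kubernetes-version $version"
done
```

The commands are only printed here on purpose: between hops you still need to wait for the control plane update to finish and bring node groups and add-ons up to a compatible version before moving on.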
AWS manages the EKS control plane upgrade process to ensure a seamless transition from one Kubernetes version to the next. These are the steps AWS goes through to upgrade the EKS control plane:
To upgrade an EKS cluster, we recommend you go through the following steps:
EKS Kubernetes version documentation provides a detailed list of changes for each version, which you should use to build a checklist for each upgrade. For guidance on specific EKS version upgrades, check the documentation to identify important changes and considerations for each version.
Before you begin a cluster upgrade, make sure you understand what versions of Kubernetes components are in use. Inventory your cluster components, particularly the ones that interact with the Kubernetes API directly. Your typical cluster includes multiple workloads that rely on the Kubernetes API, which provide important functionalities.
These cluster components typically include:
Make sure you check for any other workloads or add-ons that interact directly with the Kubernetes API. You can sometimes identify critical cluster components by looking at namespaces that end in *-system. Next, refer to the documentation of those critical components to evaluate version compatibility and whether there are any prerequisites for upgrading. Some components may require you to make updates or adjust your configuration before you upgrade your cluster.
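One way to start that inventory is to filter the namespace list for the `-system` suffix. The following is a small sketch; the function name `system_namespaces` and the sample namespaces are made up for illustration.

```shell
# system_namespaces: read namespace names on stdin (optionally prefixed with
# "namespace/", as printed by `kubectl get namespaces -o name`) and print
# those ending in "-system".
system_namespaces() {
  sed 's|^namespace/||' | grep -- '-system$'
}

# Sample input; against a live cluster you would pipe in
#   kubectl get namespaces -o name
printf '%s\n' namespace/default namespace/kube-system \
  namespace/cert-manager namespace/gatekeeper-system |
  system_namespaces
```

Treat the result as a starting point only. As the sample shows, a component such as cert-manager can live in a namespace that does not follow the naming pattern, so review the remaining namespaces by hand.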
Here are some common add-ons (linked to upgrade documentation):
Some add-ons, such as the VPC CNI plugin and kube-proxy, can be installed via Amazon EKS Add-ons, which lets you manage add-ons through the EKS API. Consider managing those add-ons this way, because it enables you to update add-on versions with a single command. For example:
aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni --addon-version version-number \
    --service-account-role-arn arn:aws:iam::111122223333:role/role-name --configuration-values '{}' --resolve-conflicts PRESERVE
To check whether you have any EKS Add-ons, type:
aws eks list-addons --cluster-name <cluster name> --output table
------------------
|   ListAddons   |
+----------------+
||    addons    ||
|+--------------+|
||  coredns     ||
||  kube-proxy  ||
||  vpc-cni     ||
|+--------------+|
Note: Amazon does not automatically upgrade EKS Add-ons during a control plane upgrade. You must initiate EKS add-on updates and select the version you want to update to. Make sure you pick a compatible version from all available versions using this guidance on add-on version compatibility. Remember, you can only upgrade Amazon EKS Add-ons one minor version at a time, subject to the current compatibility rules in the EKS documentation.
AWS requires several specific resources in your account to upgrade a control plane, including:
CLUSTER="<cluster name>"   # replace with the name of your cluster
# List each cluster subnet with its Availability Zone and free IP address count.
aws ec2 describe-subnets --subnet-ids \
  $(aws eks describe-cluster --name "${CLUSTER}" \
    --query 'cluster.resourcesVpcConfig.subnetIds' \
    --output text) \
  --query 'Subnets[*].[SubnetId,AvailabilityZone,AvailableIpAddressCount]' \
  --output table
----------------------------------------------------
| DescribeSubnets |
+---------------------------+--------------+-------+
| subnet-0ce25bacdb030ce4f | us-west-2a | 8136 |
| subnet-0c173097d592e96e4 | us-west-2c | 8051 |
| subnet-06a36d93ad471d420 | us-west-2b | 8127 |
+---------------------------+--------------+-------+
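If you run the same query with `--output text`, you can flag subnets that are short on free addresses automatically. The sketch below is one way to do that. The function name `low_ip_subnets`, the sample subnet IDs, and the default threshold of 5 are assumptions for illustration; confirm the current requirement in the EKS documentation.

```shell
# low_ip_subnets MIN: read "subnet-id az available-ips" rows on stdin and print
# the rows with fewer than MIN free addresses. MIN defaults to 5 here, which is
# an assumed threshold rather than a value taken from this guide.
low_ip_subnets() {
  awk -v min="${1:-5}" '$3 + 0 < min + 0 { print $1, $2, $3 }'
}

# Sample rows in the shape produced by the describe-subnets query with --output text.
printf '%s\n' \
  'subnet-aaaa us-west-2a 8136' \
  'subnet-bbbb us-west-2b 3' |
  low_ip_subnets 5
```

An empty result means every subnet has at least the minimum number of free addresses.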
The cloud native ecosystem continues to expand and mature, so it’s unsurprising that there are a lot of open source tools available to help teams navigate Kubernetes upgrades. Here are a few options you can use to help you with your EKS upgrades, with some examples and descriptions.
Pluto is an open source tool from Fairwinds that looks for the use of deprecated apiVersions. Pluto supports scanning a live cluster, manifest files, and Helm charts. It also provides a GitHub Action that you can include in your CI process. Pluto reports whether your configuration or Helm charts call deprecated or removed API versions, so you know whether it is safe to upgrade. You can run Pluto against local files using the command:
pluto detect-files
You can also check Helm using the command:
pluto detect-helm -owide
It’s pretty easy to add this to CI; this is helpful for people who manage many clusters.
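As a sketch of what a CI gate might look like, the function below turns Pluto's exit status into a pass, warn, or fail decision. The mapping used here (0 for clean, 2 for deprecated apiVersions, 3 for removed apiVersions) is assumed from Pluto's documentation, so verify it against the version you run. The function name `pluto_gate`, the manifest directory, and the target version are placeholders.

```shell
# pluto_gate EXIT_CODE: translate a Pluto exit status into a CI decision.
# Assumed mapping: 0 = clean, 2 = deprecated apiVersions found,
# 3 = removed apiVersions found; anything else is treated as a failure.
pluto_gate() {
  case "$1" in
    0) echo "pass: no deprecated or removed apiVersions found" ;;
    2) echo "warn: deprecated apiVersions found" ;;
    3) echo "fail: removed apiVersions found"; return 1 ;;
    *) echo "fail: pluto exited with status $1"; return 1 ;;
  esac
}

# In a pipeline this might be wired up as follows (placeholders throughout):
#   pluto detect-files -d ./manifests --target-versions k8s=v1.30.0
#   pluto_gate $?
pluto_gate 2
```

Treating deprecations as a warning and removals as a failure lets teams keep shipping while still blocking changes that would break on the target version. Tighten the policy if you prefer to fail on deprecations too.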
$ pluto detect-all-in-cluster -o wide 2>/dev/null
NAME NAMESPACE KIND VERSION REPLACEMENT DEPRECATED DEPRECATED IN REMOVED REMOVED IN
testing/viahelm viahelm Ingress networking.k8s.io/v1beta1 networking.k8s.io/v1 true v1.19.0 true v1.22.0
webapp default Ingress networking.k8s.io/v1beta1 networking.k8s.io/v1 true v1.19.0 true v1.22.0
eks.privileged PodSecurityPolicy policy/v1beta1 true v1.21.0 false v1.25.0
This combines all available in-cluster detections, showing results from Helm releases and API resources.
NAME KIND VERSION REPLACEMENT REMOVED DEPRECATED REPL AVAIL
eks.privileged PodSecurityPolicy policy/v1beta1 false true true
Once you identify which workloads and manifests need updating, you may find that you need to change the resource version in your manifest files (for example, change networking.k8s.io/v1beta1 to networking.k8s.io/v1). This may require you to update the resource specification as well. You may need to do additional research, depending on which resource you are replacing.
If a resource type is remaining the same and only the API version needs to be updated, you can use the kubectl-convert command to convert your manifest files automatically. For example, if you want to convert an older Deployment to apps/v1, type the command:
kubectl-convert -f <file> --output-version <group>/<version>
Refer to install kubectl convert plugin on the Kubernetes website if you would like more information.
Nova is another open source utility from Fairwinds that helps you check your Helm releases to see if there are upgrades needed. Typically, the CNI and other dependencies are installed with Helm. Nova is a fast method you can use to ensure you are running the latest version. As always, check the patch notes to verify support for the version you are targeting.
Install the Go binary and run it against your cluster.
$ go install github.com/fairwindsops/nova@latest
$ nova find
Release Name Installed Latest Old Deprecated
============ ========= ====== === ==========
cert-manager v0.11.0 v0.15.2 true false
insights-agent 0.21.0 0.21.1 true false
grafana 2.1.3 3.1.1 true false
metrics-server 2.8.8 2.11.1 true false
nginx-ingress 1.25.0 1.40.3 true false
To check for outdated container images instead of Helm releases:
$ nova find --containers
Container Name Current Version Old Latest Latest Minor Latest Patch
============== =============== === ====== ============= =============
k8s.gcr.io/coredns/coredns v1.8.0 true v1.8.6 v1.8.6 v1.8.6
k8s.gcr.io/etcd 3.4.13-0 true 3.5.3-0 3.4.13-0 3.4.13-0
k8s.gcr.io/kube-apiserver v1.21.1 true v1.23.6 v1.23.6 v1.21.12
k8s.gcr.io/kube-controller-manager v1.21.1 true v1.23.6 v1.23.6 v1.21.12
k8s.gcr.io/kube-proxy v1.21.1 true v1.23.6 v1.23.6 v1.21.12
k8s.gcr.io/kube-scheduler v1.21.1 true v1.23.6 v1.23.6 v1.21.12
Officially called KubePug/Deprecations, this open source tool checks whether objects in your K8s cluster or manifest files use APIs that are deprecated or removed in a target Kubernetes version. It functions as a kubectl plugin and includes these capabilities:
Run the following command to install kubepug as a Krew plugin:
kubectl krew install deprecations
eksup is a command line interface (CLI) that is designed to provide users with comprehensive information and tools to prepare clusters for an upgrade. It can help streamline the upgrade process by providing relevant insights and actions.
A CLI to aid in upgrading Amazon EKS clusters
Usage: eksup <COMMAND>
Commands:
analyze Analyze an Amazon EKS cluster for potential upgrade issues
create Create artifacts using the analysis data
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
GoNoGo is another open source tool from Fairwinds. It helps you define and discover whether an add-on installed with Helm is safe to upgrade.
gonogo --help
The Kubernetes Add-On Upgrade Validation Bundle is a spec that can be used to define and then discover if an add-on upgrade is safe to perform.
Usage:
gonogo [flags]
gonogo [command]
Available Commands:
check Check for Helm releases that can be updated
completion Generate the autocompletion script for the specified shell
help Help about any command
version Prints the current version of the tool.
Flags:
-h, --help help for gonogo
-v, --v Level number for the log level verbosity
Use "gonogo [command] --help" for more information about a command.
Another community-supported open source tool you can use is Velero, which enables you to create backups of existing clusters and then apply the backups to a new cluster. AWS resources, including IAM, are not included in a Velero backup, so you will need to recreate them.
To make sure your workloads remain available during a data plane upgrade, you need to configure PodDisruptionBudgets and topologySpreadConstraints appropriately. Keep in mind that not all workloads demand the same level of availability, so assess your workload’s scale and requirements.
If workloads are distributed across multiple Availability Zones and hosts with topology spreads, that improves the likelihood that migrations to the new data plane will happen without disruptions.
This is an example of a workload configuration that keeps at least 80% of replicas available during voluntary disruptions such as node drains, and spreads replicas across zones and hosts:
# Source: basic-demo/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-basic-demo
  labels:
    app.kubernetes.io/name: basic-demo
    app.kubernetes.io/instance: demo
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: basic-demo
      app.kubernetes.io/instance: demo
  template:
    metadata:
      labels:
        app.kubernetes.io/name: basic-demo
        app.kubernetes.io/instance: demo
    spec:
      topologySpreadConstraints:
        # Prefer one replica per host, but still schedule if that is not possible.
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          # The selector tells the scheduler which pods to count when spreading.
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: basic-demo
              app.kubernetes.io/instance: demo
        # Require an even spread across Availability Zones (well-known zone label).
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: basic-demo
              app.kubernetes.io/instance: demo
      containers:
        - name: basic-demo
          image: "quay.io/fairwinds/docker-demo:1.2.0"
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
---
# Source: basic-demo/templates/pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: demo-basic-demo
spec:
  minAvailable: 80%
  selector:
    matchLabels:
      app.kubernetes.io/name: basic-demo
      app.kubernetes.io/instance: demo
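It is worth checking what a percentage-based budget means for your actual replica count. Kubernetes rounds the required number of available pods up, so the sketch below computes how many pods could be evicted at once when all replicas are healthy. The helper name `allowed_disruptions` and the replica counts are illustrative.

```shell
# allowed_disruptions REPLICAS PERCENT: pods that may be evicted at once under a
# PodDisruptionBudget with minAvailable set to PERCENT%, assuming every replica
# is healthy. The required count is rounded up, hence the ceiling below.
allowed_disruptions() {
  ad_min=$(( ($1 * $2 + 99) / 100 ))   # ceil(REPLICAS * PERCENT / 100)
  echo $(( $1 - ad_min ))
}

for replicas in 3 5 10; do
  echo "$replicas replicas, minAvailable 80%: $(allowed_disruptions "$replicas" 80) evictable"
done
```

With three replicas the result is zero, which means a node drain would wait indefinitely on this budget. For small deployments, consider a lower percentage or an absolute `maxUnavailable` value so that upgrades can make progress.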
The AWS Resilience Hub now includes EKS as a supported resource. This provides a single place where you can define, validate, and track the resilience of your applications. This helps you avoid unnecessary downtime caused by infrastructure, software, or operational disruptions.
Managed Node Groups and Karpenter both simplify node upgrades, taking different approaches. Managed node groups automate node provisioning and lifecycle management, which means you can create, automatically update, or terminate nodes with a single operation.
Karpenter creates new nodes automatically using the latest compatible EKS Optimized Amazon Machine Image (AMI). When EKS releases updated EKS Optimized AMIs or you upgrade the cluster, Karpenter starts using these images automatically. It also uses Node Expiry to update nodes. You can configure Karpenter to use custom AMIs, but keep in mind that if you do, you’re responsible for the version of kubelet.
Self-managed node groups are Amazon Elastic Compute Cloud (EC2) instances that were deployed in your account and attached to the cluster outside of the EKS service. Usually, these node groups are deployed and managed by some form of automation tooling, such as eksctl, kOps, and EKS Blueprints. Refer to your tools’ documentation to upgrade self-managed node groups.
Unsurprisingly, new versions of Kubernetes introduce significant changes to your Amazon EKS cluster. Remember that once you upgrade a cluster, you can’t downgrade it. And you can only create new clusters for Kubernetes versions that are currently supported by EKS. If you are concerned about this risk, you may want to consider backing up the cluster before an upgrade.
Although you may feel like you only have time to focus on the current version of Kubernetes, it’s important to monitor for new releases and identify significant changes. For example, the most important change for migrating from 1.23 to 1.24 was the removal of the Dockershim from the kubelet. Dockershim was an adapter of sorts between Kubernetes and Docker.
This code was embedded in the kubelet so that the kubelet could talk to the Docker daemon, which does not implement the Container Runtime Interface (CRI). It was removed in 1.24, so the kubelet now communicates directly with a CRI-compatible container runtime when launching and managing containers on the nodes. EKS AMIs only have containerd as the runtime as of version 1.24. Preparing for substantial changes like these requires additional time and planning.
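For a change like this one, a quick pre-upgrade check is to look at the runtime each node reports. The sketch below filters a node list for runtimes that start with `docker://`. The function name `docker_nodes` and the sample node names are hypothetical.

```shell
# docker_nodes: read "node-name runtime-version" rows on stdin and print the
# names of nodes whose runtime string starts with "docker://".
docker_nodes() {
  awk '$2 ~ /^docker:\/\// { print $1 }'
}

# Sample rows; against a live cluster the same shape can be produced with
#   kubectl get nodes --no-headers \
#     -o custom-columns=NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion
printf '%s\n' \
  'ip-10-0-1-10.ec2.internal docker://20.10.17' \
  'ip-10-0-2-20.ec2.internal containerd://1.6.6' |
  docker_nodes
```

Any node this prints would need to move to a containerd-based AMI before the cluster goes to 1.24 or later.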
Review all of the documented modifications for the version you plan to upgrade to, noting any required upgrade procedures. Make sure you also pay attention to requirements or processes tailored specifically to Amazon EKS managed clusters (check the Kubernetes changelog and the EKS Kubernetes version release notes for your target version). This approach will help you have a smoother upgrade process and minimize potential disruptions.
Below is a list of some of the most well-known changes (many of which are breaking) in Kubernetes versions, starting with v1.24. This is not a complete list; always refer to the upstream release notes and the EKS Kubernetes version documentation for any version you plan to run.
Kubernetes v1.24
Kubernetes v1.25
Kubernetes v1.26
Kubernetes v1.27
Kubernetes v1.28
Kubernetes v1.29
Kubernetes v1.30
Kubernetes v1.31 and later
Before targeting any of these versions on EKS, review the official Kubernetes release notes and the EKS “Kubernetes versions on standard support” page to identify breaking changes, deprecations, and EKS‑specific considerations for your target minor version.
Hopefully, the information outlined in this guide is useful to you. Consistently upgrading Kubernetes requires research and effort; you need to ensure that you have time to test your environments with each minor release. If you follow these steps, you should be in good shape to undertake upgrading EKS clusters. If you need help with your next EKS upgrade, reach out. Our team has the Kubernetes expertise to make the upgrade easy for you and make your Kubernetes infrastructure more efficient at the same time, saving you time and money.
Originally published April 22, 2024 and updated to reflect changes.