- Managed Kubernetes
Kubernetes is the undisputed standard for cloud native deployments, requiring platform engineers and developers alike to understand the complexities that come along with it. Kubernetes offers a lot of flexibility, allowing users to adjust it to meet the unique business needs of their organization. But with that flexibility comes potential risk, and it can be difficult to know the best way to build apps and services. Anyone who uses or manages Kubernetes needs a basic understanding of these fourteen policies to use it efficiently, reliably, and securely.
Kubernetes manages a cluster of nodes, and each one has limited resources, including CPU, memory, and storage. It’s important to ensure that each of your applications has sufficient resources. Optimizing your resource usage can help you avoid over-provisioning and improve cost effectiveness, while also ensuring scalability without compromising performance.
Make sure you understand and follow these four best practices for setting resource requests and limits appropriately.
If you don’t set your CPU requests, a single pod can consume all of the available CPU resources on a node, which may cause other pods to become starved for resources. Setting your resource requests appropriately increases the reliability of your apps and services because it guarantees that each pod is scheduled onto a node with the resources it needs.
Setting CPU limits is also important, because without them, pods can consume more CPU resources than they need, potentially starving other pods of the CPU resources required. This can lead to resource contention, performance degradation, and application slowdowns. Without CPU limits, it can be difficult to troubleshoot performance issues or identify resource bottlenecks.
It’s also important to set memory requests; when set too low, it becomes difficult to ensure application reliability. When set too high, you are paying for unused memory, which can increase your cloud computing costs. It can also make it difficult for the Kubernetes scheduler to identify suitable nodes to place pods, resulting in scheduling delays, inefficient resource allocation, and potential resource contention. This can also artificially limit the cluster’s capacity and make it more difficult to scale applications efficiently.
Similar to missing CPU limits, missing memory limits allow pods to consume more memory resources than needed, potentially starving other pods of the memory resources they require and leading to resource contention, performance degradation, and application slowdowns. If applications consume excessive memory resources, they may exceed the available memory on a node, leading to OOM (out-of-memory) errors and causing applications to crash abruptly. It may also affect other pods on the same node.
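As a sketch of how the four settings above fit together, each container in a pod spec carries a resources block with requests and limits; the names and values here are purely illustrative and should be tuned to your workload:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app        # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:
          cpu: 250m        # CPU guaranteed at scheduling time
          memory: 128Mi    # memory guaranteed at scheduling time
        limits:
          cpu: 500m        # hard cap; the container is throttled beyond this
          memory: 256Mi    # hard cap; the container is OOM-killed beyond this
```

Requests inform the scheduler where a pod can fit; limits cap what it can consume once running.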
Creating and adhering to policies that check that these requests and limits have been set is an important way to ensure efficient resource utilization and improve the efficiency and performance of applications.
Check out more of the efficiency policies available in Polaris.
Reliability is a critical aspect of Kubernetes deployments, ensuring that applications are consistently available, performant, and resilient to failures. Aligning with the following five policies helps you make sure your workloads are always available and reliable.
Liveness probes indicate whether the container is running or not. If a liveness probe moves into a failing state, Kubernetes automatically restarts the container to try to restore the service to an operational state. If containers in the pod do not have liveness probes, faulty or non-functioning pods continue to run indefinitely, using valuable resources unnecessarily and potentially causing application errors.
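A minimal liveness probe on a container might look like the following; the /healthz path and port are illustrative assumptions about the application:

```yaml
# Container-level fragment; assumes the app serves a health endpoint.
livenessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 10   # give the process time to start first
  periodSeconds: 15         # how often to check
  failureThreshold: 3       # restart after three consecutive failures
```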
Readiness probes determine whether an application has reached a "ready" state. Often, there’s a period of time between when a web server process starts and when it is ready to receive traffic. Readiness probes ensure that traffic is not sent to a pod until it is actually ready to receive it. A policy that checks for missing readiness probes is one way to improve overall reliability.
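A readiness probe is declared the same way as a liveness probe but controls whether the pod receives traffic rather than whether it is restarted; the /ready path here is an illustrative assumption:

```yaml
# Container-level fragment; the pod is removed from Service endpoints
# whenever this probe fails, instead of being restarted.
readinessProbe:
  httpGet:
    path: /ready            # hypothetical readiness endpoint
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```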
Relying on cached versions of a Docker image can expose you to security vulnerabilities. By default, an image will be pulled, but only if it isn't already cached on the node attempting to run it. That means you could have multiple versions of an image running per node, or you could be using a vulnerable image. It’s usually best to ensure that a pod has imagePullPolicy: Always specified. This ensures that images are always pulled directly from their source and that you’re always using the most up-to-date version of the image. This policy can enhance reliability by reducing the risk of running outdated or corrupted images, which can lead to application failures or unexpected behavior.
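In a pod spec, the full field name is imagePullPolicy and it is set per container; the image name here is illustrative:

```yaml
# Container-level fragment; "Always" makes the kubelet check the
# registry on every pod start instead of trusting the node's cache.
containers:
  - name: web
    image: registry.example.com/myapp:1.4.2   # illustrative image
    imagePullPolicy: Always
```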
Deployments are designed to maintain a specified number of replicas to help maintain the stability and high availability of applications. Without replicas in place, the ability for applications to handle traffic and requests is limited. A policy that checks to make sure replicas are available for deployment can help minimize downtime and potential service disruptions resulting from insufficient replicas.
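As a sketch, the replica count lives on the Deployment spec; three replicas is an illustrative choice, not a recommendation:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3              # keep three pods running for availability
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
```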
Pod disruption budgets (PDBs) define limits on the number of pods that can be disrupted during deployments, upgrades, or other maintenance operations. By enforcing PDBs, you can prevent widespread application disruptions and ensure that critical services remain available even during planned maintenance activities. Having a policy that checks for missing pod disruption budgets can help you ensure the stability and reliability of your applications.
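A PDB is a separate resource that selects pods by label; the names and the minAvailable value here are illustrative:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2          # keep at least 2 pods up during voluntary disruptions
  selector:
    matchLabels:
      app: web             # must match the pods the budget protects
```

During node drains and similar voluntary disruptions, Kubernetes refuses evictions that would violate this budget.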
Check out more of the reliability policies available in Polaris.
Implementing policy security checks is critical for safeguarding your Kubernetes deployments against unauthorized access, vulnerabilities, and potential breaches. Here are some of the most important policy security checks to understand and implement in your Kubernetes environment.
Some Linux capabilities are enabled for Kubernetes workloads by default, even though most workloads don’t need these capabilities. This can potentially compromise the security of the cluster and the applications running within it by allowing containers to perform actions beyond their intended scope, increasing the attack surface and exposing the system to potential exploits. A policy that checks to ensure that you are not granting containers unnecessary capabilities can help you reduce security risks.
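One common pattern is to drop all default capabilities and add back only what the workload genuinely needs; NET_BIND_SERVICE here is an illustrative example of such an exception:

```yaml
# Container-level fragment.
securityContext:
  capabilities:
    drop:
      - ALL                  # remove all default Linux capabilities
    add:
      - NET_BIND_SERVICE     # re-add only what is truly required (example)
```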
Kubernetes workloads do not set readOnlyRootFilesystem to true by default, which means it is necessary to explicitly change this setting to increase your security in Kubernetes. The setting prevents a container from writing to its filesystem, thereby preventing malicious actors from tampering with an application or writing foreign executables to disk. Identifying whether this is allowed is an important way to improve security.
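A sketch of the setting, with an illustrative emptyDir volume for applications that still need somewhere writable:

```yaml
# Container-level fragment; the tmp volume is an assumption for apps
# that need scratch space despite the read-only root filesystem.
securityContext:
  readOnlyRootFilesystem: true
volumeMounts:
  - name: tmp
    mountPath: /tmp
```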
In some configurations, a container may be able to escalate its privileges. This can introduce significant security risks and increase the likelihood of cyberattacks. Because Kubernetes does not set allowPrivilegeEscalation to false by default, having a policy that checks for it is vital to understanding what privileges you are allowing and when.
Similar to the previous policy check, runAsPrivileged determines whether any container in a pod can enable privileged mode. By default, a container does not have permission to access any devices on the host; however, a privileged container has access to all devices on the host. This allows the container nearly the same access as processes running on the host, which can be useful for containers that need to manipulate the network stack or access devices, but is excessive for other containers. Restricting the number of containers with the ability to run as privileged is another way to manage security effectively.
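The two settings above sit side by side in the container security context; a minimal sketch:

```yaml
# Container-level fragment.
securityContext:
  allowPrivilegeEscalation: false   # block processes from gaining extra privileges
  privileged: false                 # no access to host devices
```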
Running containers as root is not secure. In fact, the National Security Agency’s Kubernetes Hardening Guide advises organizations to use containers that have been built to run applications as non-root users. Unfortunately, many container services run as the privileged root user, despite not requiring privileged execution. Following a policy that identifies which containers are allowed to run as root and minimizing the number of containers that have that capability can help you improve security in Kubernetes.
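Enforcing a non-root user can also be declared in the security context; the UID here is an illustrative value, and the image must actually support running as that user:

```yaml
# Container-level fragment; the kubelet refuses to start the
# container if it would run as UID 0.
securityContext:
  runAsNonRoot: true
  runAsUser: 1000            # illustrative non-root UID
```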
Check out more of the security policies available in Polaris.
One of the cornerstones of the effective adoption of Kubernetes is putting policies in place that govern how Kubernetes is configured, managed, and secured. This enables consistency, predictability, and repeatability and helps developers and platform engineering teams ensure that they are able to deploy applications more frequently without putting efficiency, reliability, and security at risk.
Read A Platform Engineer's Guide to Kubernetes to learn how to automate Kubernetes best practice enforcement.