At Fairwinds, we’ve managed hundreds of Kubernetes clusters across dozens of organizations. In our efforts to keep our customers’ environments running securely, efficiently, and reliably, we’ve found a lot of benefit in leveraging a wide range of open source auditing tools, like Trivy (for container scanning) and kube-hunter (for network sniffing), as well as our own contributions, like Polaris (for configuration validation) and Goldilocks (for auditing resource requests and limits). We built Fairwinds Insights as a way of managing, tracking, and triaging all the disparate findings of these tools, and it’s been a big help.
But these tools are generally one-size-fits-all — they check for things that everyone running Kubernetes should care about. Polaris, for instance, runs a number of checks for “best practices,” like ensuring that you’ve set memory limits for each of your workloads.
Many organizations have more specific requirements. For instance, they might:
be very security conscious, and want every container to drop all Linux capabilities
be sensitive to Docker registry outages, and want to ensure that every image is being pulled from their internal registry
have an internal naming scheme they want to enforce
want particular labels, like a cost center code, to be attached to every workload
need every Deployment to have a vertical or horizontal pod autoscaler attached
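To make a few of these requirements concrete, here is a minimal Python sketch of what such checks look at in a workload manifest. It is only an illustration of the logic, not Rego and not how Polaris or Insights implement their checks. The registry hostname, the label key, and all function names are hypothetical.

```python
# Hypothetical values, chosen only for illustration.
INTERNAL_REGISTRY = "registry.internal.example.com/"
REQUIRED_LABEL = "cost-center"


def containers_of(manifest: dict) -> list:
    """Return the containers of a workload manifest such as a Deployment."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return pod_spec.get("containers", [])


def drops_all_capabilities(manifest: dict) -> list:
    """Every container must list ALL under securityContext.capabilities.drop."""
    problems = []
    for c in containers_of(manifest):
        drop = c.get("securityContext", {}).get("capabilities", {}).get("drop", [])
        if "ALL" not in drop:
            problems.append(f"container {c['name']} does not drop all capabilities")
    return problems


def uses_internal_registry(manifest: dict) -> list:
    """Every image must be pulled from the (hypothetical) internal registry."""
    return [
        f"container {c['name']} pulls {c.get('image', '')} from an external registry"
        for c in containers_of(manifest)
        if not c.get("image", "").startswith(INTERNAL_REGISTRY)
    ]


def has_cost_center_label(manifest: dict) -> list:
    """The workload must carry the (hypothetical) cost center label."""
    labels = manifest.get("metadata", {}).get("labels", {})
    return [] if REQUIRED_LABEL in labels else [f"missing label {REQUIRED_LABEL}"]


POLICIES = [drops_all_capabilities, uses_internal_registry, has_cost_center_label]


def audit(manifest: dict) -> list:
    """Run every policy and collect the human-readable problems."""
    return [problem for policy in POLICIES for problem in policy(manifest)]
```

Calling `audit` on a parsed Deployment returns an empty list when all three checks pass, and one message per violation otherwise.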
While organizations could use Polaris’ custom check syntax to enforce many of these policies, the Kubernetes community has settled on a powerful open standard for creating configuration policies: Open Policy Agent, or OPA. We’re excited to announce that we’ve added support for OPA policies to every part of Fairwinds Insights, including CI/CD pipelines, the admission controller, and the in-cluster agent.
By using Fairwinds Insights and OPA together, organizations can proactively align their Kubernetes clusters with both best practices and internal policies. Furthermore, the ability to run OPA at each step in the development and deployment process helps surface issues early on, before they make it into the cluster, leading to an easier hand-off between Dev and Ops.
OPA is a framework for validating structured data. It nudges users toward writing policy-as-code, extending the community’s successful push toward infrastructure-as-code. While OPA can validate any kind of structured data, including Terraform, HTTP requests, and Dockerfiles, it is most often thought of in conjunction with Kubernetes manifests.
The folks responsible for Open Policy Agent maintain a whole suite of tools for configuration validation, but the core technology is a domain-specific language known as Rego.
Rego can be a bit daunting at first, as the syntax is unlike a typical programming language, but like any good domain-specific language (DSL), it’s incredibly powerful once you get the hang of it. This article provides a great overview of how to start thinking in Rego.
We’ve integrated OPA into Fairwinds Insights in three major ways:
As a CI/CD hook, auditing Infrastructure-as-Code as part of the code review process
As an Admission Controller (aka Validating Webhook), which will stop problematic resources from entering the cluster
As an in-cluster agent, repeatedly scanning for problematic resources
Even better, Insights can take the same OPA policies and federate them out to all three contexts, and to as many clusters as you’d like!
OPA doesn’t always have to run inside a cluster; it can also run against YAML that’s checked into your Infrastructure-as-Code repository. This is a great way to detect problems early on in the process, while engineers are still fine-tuning their code.
If you’re just running OPA via the In-Cluster Agent and Admission Controller, and not taking advantage of the Continuous Integration feature, you’ll probably end up with some frustrated developers. They’ll merge their changes into the main branch, only to find that the deployment process has broken because the Admission Controller rejects their changes. Or maybe an Ops engineer comes knocking a couple of weeks later, because the In-Cluster Agent has found some new Action Items caused by their changes. But if teams incorporate the same OPA policies into their CI process, they’ll be prevented from merging their changes until the problem is fixed.
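The idea of a CI gate can be sketched in a few lines of Python: run every policy over every manifest in the change, and fail the build if anything is reported. This is a simplified illustration under our own assumptions, not the Insights CI integration. The function names are hypothetical, and the memory limit check is borrowed from the Polaris example mentioned earlier.

```python
import sys


def memory_limits_set(manifest: dict) -> list:
    """Example policy: every container must declare a memory limit."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return [
        f"{manifest['metadata']['name']}/{c['name']}: no memory limit"
        for c in pod_spec.get("containers", [])
        if "memory" not in c.get("resources", {}).get("limits", {})
    ]


def ci_gate(manifests: list, policies: list) -> int:
    """Return a process exit code: 0 if every manifest passes, 1 otherwise."""
    failures = [msg for m in manifests for policy in policies for msg in policy(m)]
    for msg in failures:
        # Report each violation so it shows up in the CI job log.
        print(f"POLICY VIOLATION: {msg}", file=sys.stderr)
    return 1 if failures else 0
```

In a pipeline, a wrapper script would parse the changed manifests, pass them to `ci_gate`, and hand the result to `sys.exit`, so that a non-zero code blocks the merge.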
In Kubernetes, an Admission Controller (or Validating Webhook) listens to the Kubernetes API for new resources entering the cluster. Whenever a user, bot, or in-cluster operator wants to create or update a resource, the Admission Controller gets a chance to examine it first. If the resource violates the Admission Controller’s policy, it can reject the API request, reporting any details back to the caller.
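The request and response cycle described above can be sketched as follows. The Kubernetes API server sends the webhook an `AdmissionReview` object, and the webhook answers with another `AdmissionReview` whose `response` carries the same `uid`, an `allowed` flag, and an optional `status` message. The naming policy, its pattern, and the function names below are hypothetical; a real webhook would also need an HTTPS server and would have to handle operations such as DELETE, where no new object is present.

```python
import re

# Hypothetical naming scheme: "<team>-<service>", lowercase only.
NAME_PATTERN = re.compile(r"[a-z0-9]+-[a-z0-9-]+")


def follows_naming_scheme(obj: dict) -> list:
    """Example policy: resource names must match the hypothetical scheme."""
    name = obj.get("metadata", {}).get("name", "")
    if NAME_PATTERN.fullmatch(name):
        return []
    return [f"name {name!r} does not follow the naming scheme"]


def review(admission_review: dict, policies: list) -> dict:
    """Build an AdmissionReview response for a create or update request."""
    request = admission_review["request"]
    problems = [msg for policy in policies for msg in policy(request["object"])]

    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": not problems}
    if problems:
        # The message is reported back to the caller, e.g. kubectl.
        response["status"] = {"code": 403, "message": "; ".join(problems)}

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```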
This is a great place to put hard-line policies. For instance, if you want to be sure that none of the workloads in your cluster pull images from quay.io or Docker Hub, Admission Control is the natural place to enforce that policy. Of course, an administrator can make exemptions for particular resources and namespaces if need be.
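As a rough sketch of that registry policy, the Python below works out which registry an image reference points at and flags images from blocked registries, unless the workload lives in an exempt namespace. The exempt namespace and the function names are assumptions made for illustration. The sketch inspects only `containers` (not `initContainers`) and does not recognize Docker Hub aliases such as `index.docker.io`.

```python
BLOCKED_REGISTRIES = {"quay.io", "docker.io"}
EXEMPT_NAMESPACES = {"kube-system"}  # hypothetical exemption


def registry_of(image: str) -> str:
    """Return the registry host of an image reference.

    Follows the Docker convention: the first path component names a registry
    only if it contains "." or ":" or equals "localhost". Anything else,
    such as "nginx" or "library/nginx", is pulled from Docker Hub.
    """
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def blocked_images(manifest: dict) -> list:
    """List the images in a workload that come from a blocked registry."""
    if manifest.get("metadata", {}).get("namespace") in EXEMPT_NAMESPACES:
        return []  # the administrator has exempted this namespace
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return [
        c["image"]
        for c in pod_spec.get("containers", [])
        if registry_of(c["image"]) in BLOCKED_REGISTRIES
    ]
```

Treating bare names like `nginx` as Docker Hub images matters here: a check that only looked for the literal string `docker.io` would miss the most common way of referencing Docker Hub.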
Fairwinds Insights can run OPA as a CronJob inside of a running Kubernetes cluster. In this context, the agent will search for any resources in the cluster which match your policies, and will create an Action Item inside the Fairwinds Insights dashboard.
For each OPA policy, you can customize the following fields in the Insights UI:
This is a great, non-invasive way to get started with OPA. Rather than enforcing policies that you have yet to tune, you can have them run passively on your existing resources to see what they might affect. This is also a great way to run nice-to-have policies — things that shouldn’t necessarily prevent a deployment, but that you still want to have eyes on.
The Open Policy Agent team provides some great open source solutions for running Rego policies in a variety of contexts, like Gatekeeper (their Admission Controller) and Conftest (a CLI for checking YAML files). But a SaaS platform like Fairwinds Insights can provide a much more seamless, unified experience for managing OPA at scale.
One major advantage is the ability to federate out your OPA policies to every cluster in your fleet. With the open source tooling, you’ll need to kubectl apply every new policy to every cluster. Maybe that’s not too much of a strain if you have a couple of clusters, but for organizations with dozens to hundreds of clusters, it can become a nightmare. How can you be sure that they’re all running the latest version of every policy? With Fairwinds Insights, you simply upload your policy via the API, and it automatically gets distributed to every cluster in your fleet.
Another advantage is the ability to run the same policies across different contexts. Different open source tools in the OPA suite expect different input formats, so it can be difficult to reuse the same policies in CI that you’re using in Admission Control. But Fairwinds Insights has designed its OPA integration to be consistent across all environments, so if your CI checks pass, you can be sure the Admission Controller won’t complain.
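To illustrate the input format problem: an admission-style tool such as Gatekeeper typically hands a policy the resource wrapped in a review (read in Rego as `input.review.object`), whereas a file-checking tool such as Conftest exposes the parsed file as the input itself. Treat those two shapes as assumptions to verify against each tool's documentation. The Python sketch below shows one way a single policy could serve both contexts by normalizing the input first. It is not how Insights implements its integration, and the function names are hypothetical.

```python
def extract_manifest(policy_input: dict) -> dict:
    """Return the Kubernetes object from either input shape."""
    review = policy_input.get("review")
    if isinstance(review, dict) and "object" in review:
        return review["object"]  # admission-style wrapper
    return policy_input  # raw manifest, as parsed from a file


def missing_labels(policy_input: dict, required: set) -> set:
    """One policy, written once, usable in CI and in admission control."""
    manifest = extract_manifest(policy_input)
    labels = manifest.get("metadata", {}).get("labels", {})
    return required - set(labels)
```

Because the policy only ever sees the normalized manifest, a resource that passes in CI is judged by the same logic when it later reaches admission control.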
We’re constantly iterating on the Insights platform to provide new features, and we’ve got some great ideas for how to improve our OPA integration in the coming months, especially with respect to dashboarding and alerting.
I’m really excited about the direction Fairwinds Insights is taking by incorporating OPA. It reaffirms our commitment to open standards and to the Kubernetes community, and greatly expands the functionality of our product. When combined with Fairwinds Insights, OPA will give cluster administrators the confidence that their internal policies are being enforced throughout the development and deployment process.