Introducing Astro: Managing Monitors in a Dynamic Environment

In a Kubernetes environment, one challenge we see is keeping monitoring accurate so that it reflects the state of clusters and workloads. Monitoring is often an afterthought: as workloads change, monitors are seldom updated to match. Existing tools rely on manually configuring monitors, which creates toil for SRE teams. All of this puts availability and performance at risk, because monitors may be missing or too inaccurate to fire on changes in KPIs (Key Performance Indicators). The result can be SLA breaches because issues weren’t detected, or noisy pagers and pager fatigue because monitors weren’t set correctly.

To tackle this problem, Fairwinds has introduced a new open source project called Astro. Astro is a Kubernetes operator that watches objects in your cluster for defined patterns and manages Datadog monitors based on their state. Astro provides three key features that greatly simplify monitor management:

Automated management of the lifecycle of Datadog monitors for workloads running in Kubernetes: Given configuration parameters, Astro automatically manages the defined monitors for all relevant objects within the Kubernetes cluster. As objects change, monitors are updated to reflect that state.
Correlation between logically bound objects: For example, since a namespace is a logical boundary, Astro can manage monitors for all objects within the namespace.
Templating of values from Kubernetes objects into managed monitors: Any data from a managed Kubernetes object can be inserted into a managed monitor. This makes alerts more informative and monitors more context-specific.
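
To make the lifecycle idea concrete, here is a minimal sketch in Go of how desired monitors could be compared against existing ones to decide what to create, update, or delete. It is not Astro's actual implementation: the Monitor and Plan types, the reconcile function, and the placeholder query strings are all hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Monitor is a hypothetical, minimal monitor definition keyed by name.
// The Query values used below are placeholders, not valid Datadog queries.
type Monitor struct {
	Name  string
	Query string
}

// Plan lists the monitor names that must change so that the existing
// monitors match the desired ones.
type Plan struct {
	Create []string
	Update []string
	Delete []string
}

// reconcile compares the monitors that should exist (derived from the
// current cluster objects) with the monitors that do exist.
func reconcile(desired, existing map[string]Monitor) Plan {
	var p Plan
	for name, want := range desired {
		have, ok := existing[name]
		switch {
		case !ok:
			p.Create = append(p.Create, name) // object is new: monitor missing
		case have != want:
			p.Update = append(p.Update, name) // definition drifted: refresh it
		}
	}
	for name := range existing {
		if _, ok := desired[name]; !ok {
			p.Delete = append(p.Delete, name) // object is gone: remove monitor
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(p.Create)
	sort.Strings(p.Update)
	sort.Strings(p.Delete)
	return p
}

func main() {
	desired := map[string]Monitor{
		"web-replicas": {Name: "web-replicas", Query: "placeholder <= 0"},
		"api-replicas": {Name: "api-replicas", Query: "placeholder <= 1"},
	}
	existing := map[string]Monitor{
		"api-replicas": {Name: "api-replicas", Query: "placeholder <= 0"},
		"old-replicas": {Name: "old-replicas", Query: "placeholder <= 0"},
	}
	p := reconcile(desired, existing)
	fmt.Println("create:", p.Create)
	fmt.Println("update:", p.Update)
	fmt.Println("delete:", p.Delete)
}
```

With the sample data, the plan creates web-replicas, updates api-replicas, and deletes old-replicas. A real operator would run a comparison like this whenever watched objects change, rather than once.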

Astro is available to the public at https://github.com/fairwindsops/astro.

Getting Started

You can run Astro in your cluster with two simple steps: configuring your desired monitors, and deploying the Helm chart.

Monitors are configured via one or more YAML configuration files. Each file can be referenced by a local file path or fetched remotely via URL. The files are reloaded periodically, so any changes you make will eventually propagate to your monitors. Here’s an example configuration:

In this example, a monitor would be created for any Kubernetes deployment that has the annotations listed in match_annotations (in this case, astro/owner: astro). This particular monitor will alert you when a deployment has no pods available. Astro uses Go templating, so you can insert any variables from the Kubernetes object or from the cluster_variables section of your config file into monitors. Specific details about the config file syntax are available in Astro’s README.
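
The matching and templating steps described above can be sketched with Go's standard text/template package. This is an illustration under stated assumptions, not Astro's code: the ObjectMeta and Deployment structs are simplified stand-ins for the real Kubernetes types, the matches and render helpers are hypothetical, and requiring every listed annotation to match is an assumption about the matching rule.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// ObjectMeta and Deployment are simplified, hypothetical stand-ins for the
// Kubernetes object a monitor template is rendered against.
type ObjectMeta struct {
	Name        string
	Namespace   string
	Annotations map[string]string
}

type Deployment struct {
	ObjectMeta ObjectMeta
}

// matches reports whether obj carries every required annotation with the
// expected value (assumption: all listed annotations must match).
func matches(obj Deployment, required map[string]string) bool {
	for key, want := range required {
		if got, ok := obj.ObjectMeta.Annotations[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// render fills a Go template with values taken from the object.
func render(text string, obj Deployment) (string, error) {
	tmpl, err := template.New("monitor").Parse(text)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, obj); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	required := map[string]string{"astro/owner": "astro"}
	nameTmpl := `Deployment Replica Alert - {{ .ObjectMeta.Namespace }}/{{ .ObjectMeta.Name }}`
	msgTmpl := `No pods available for {{ .ObjectMeta.Name }} (owner: {{ index .ObjectMeta.Annotations "astro/owner" }})`

	deployments := []Deployment{
		{ObjectMeta: ObjectMeta{Name: "web", Namespace: "prod",
			Annotations: map[string]string{"astro/owner": "astro"}}},
		{ObjectMeta: ObjectMeta{Name: "batch", Namespace: "prod"}}, // no annotation
	}

	for _, d := range deployments {
		if !matches(d, required) {
			continue // unannotated deployments get no monitor
		}
		name, err := render(nameTmpl, d)
		if err != nil {
			panic(err)
		}
		msg, err := render(msgTmpl, d)
		if err != nil {
			panic(err)
		}
		fmt.Println(name)
		fmt.Println(msg)
	}
}
```

Only the web deployment carries the annotation, so only it produces a rendered monitor name and message; batch is skipped. Values from cluster_variables are left out of this sketch because the way Astro exposes them to templates is documented in its README rather than in this post.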

Once your config file is in place, you’re ready to deploy with Helm! The Helm chart is available on GitHub. If you’re installing with Reckoner, a tool to declaratively install and manage multiple Helm chart releases, here’s a snippet to install Astro:

In case you haven’t tried Reckoner, it’s available on GitHub.

Enjoy Astro

Now, every time a deployment is created with the astro/owner annotation, Astro will automatically create a Datadog monitor for it, and we’ll get an alert if it goes down. Without Astro, we would have had to rely on the engineering team to create those monitors manually, which is time-consuming and error-prone.

Astro is a great open source project that can be used to keep your monitors in sync, prevent outages due to misconfigured monitors, and fight pager fatigue. We hope you enjoy automated management of your Datadog monitors!