Upgrade cert-manager - It’s Worth It!

The cert-manager project automatically provisions and renews TLS certificates in Kubernetes. It supports using your own certificate authority, self-signed certificates, certificates managed by the HashiCorp Vault PKI, and of course the free certificates issued by Let’s Encrypt. Read on if you would like to use Helm to upgrade cert-manager from a version before 0.5, without recreating or reissuing Let’s Encrypt certificates.

If you are not yet using cert-manager to manage Let’s Encrypt certificates with Ingresses in Kubernetes, take a look at the Getting Started and ACME Issuer Tutorial cert-manager documents.

Reasons to Upgrade cert-manager

Previous versions of cert-manager could overwhelm the Let’s Encrypt API. This is greatly improved in cert-manager 0.6.0:

  • There are new custom Kubernetes resources for certificate orders and validation challenges that simplify the process with Let’s Encrypt, and make additional detail available for debugging in Kubernetes. You can read more about the new order flow via the proposal for change. The upgrade walk-through below contains an example of the information available in the old certificate object vs. the new order and challenge objects.
  • A new validating webhook checks new certificate resources for misconfiguration when they are submitted to the Kubernetes API, before they are submitted to Let’s Encrypt. This catches errors sooner without wasting calls to the Let’s Encrypt API.
  • There are Prometheus metrics for the ACME client that communicates with Let’s Encrypt. You can instrument cert-manager’s usage of ACME APIs to detect issues and understand behavior before you reach Let’s Encrypt rate limits.

As stated in the cert-manager 0.6.0 release notes:

After extensive testing, we’ve found in the most extreme cases a 100x reduction in ACME API client calls. This is a massive difference, and helps reduce the load that instances of cert-manager put on services like Let’s Encrypt.
As a result, we strongly recommend all users upgrade to the v0.6 release as soon as possible!

Helm Upgrade Removes Certificate Objects

At Fairwinds we use the stable Helm chart to deploy cert-manager. If you haven’t yet used Helm, it helps you manage the desired state of multiple Kubernetes objects. Helm is even more powerful when used via Reckoner, which uses a YAML syntax to install multiple Helm releases at once, and allows installing charts from a git repository.

The cert-manager Helm chart stopped managing custom resource definitions (CRDs) in version 0.5.0, and CRDs should now be installed manually before installing version 0.5 or later of the chart. This means a typical helm upgrade ... of cert-manager will end up deleting your Certificate objects from Kubernetes. The Secrets populated by cert-manager do not get deleted; however, those Secrets will never be updated because there is no longer a cert-manager Certificate to renew them.
I thought that certificates created by annotated Ingress objects would eventually get recreated once cert-manager was reinstalled, but I have not been able to make this happen without an outage.
If you had to recreate and reissue your certificates, you could run into the Let’s Encrypt rate limit of 50 certificates per registered domain per week.

Performing an Upgrade

I’ll walk through an upgrade from cert-manager 0.4.1 to 0.6.2, demonstrating the new order and challenge resources along the way. My cert-manager 0.4.1 was installed in the kube-system namespace, but I’ll be installing cert-manager 0.6 in its own cert-manager namespace. Using a dedicated namespace is a good idea, and this decision is further motivated by a cert-manager requirement to disable validation for its namespace when using the new cert-manager webhook.

Looking at Existing Certificates

I have one valid certificate for “wonder app,” and another certificate that is failing to validate.

A basic listing of the Certificate objects with cert-manager 0.4.1 doesn’t show whether the certificate is valid:

$ kubectl get certificate --all-namespaces
NAMESPACE NAME AGE
default wonder-app-cert 9m
test ivan-test-cert 9m

Describing the ivan-test-cert certificate in the test namespace shows that its validation failed because I haven’t properly configured DNS validation for the Let’s Encrypt ClusterIssuer (noted in the Status.Conditions.Message field at the bottom of the output):

$ kubectl describe certificate -n test ivan-test-cert
Name: ivan-test-cert
Namespace: test
Labels: <none>
Annotations: <none>
API Version: certmanager.k8s.io/v1alpha1
Kind: Certificate
Metadata:
 Creation Timestamp: 2019-02-28T05:53:09Z
 Generation: 1
 Owner References:
 API Version: extensions/v1beta1
 Block Owner Deletion: true
 Controller: true
 Kind: Ingress
 Name: test-ingress
 UID: 202bc781-3b1d-11e9-9fb0-02096e167e86
 Resource Version: 17919
 Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/test/certificates/ivan-test-cert
 UID: 203105f3-3b1d-11e9-9fb0-02096e167e86
Spec:
 Acme:
 Config:
 Dns 01:
 Provider: clouddns
 Domains:
 ivan-test.example.com
 Common Name: 
 Dns Names:
 ivan-test.example.com
 Issuer Ref:
 Kind: ClusterIssuer
 Name: letsencrypt-prod
 Secret Name: ivan-test-cert
Status:
 Acme:
 Order:
 URL: https://acme-v02.api.letsencrypt.org/acme/order/xxxxxxxx/yyyyyyyyy
 Conditions:
 Last Transition Time: 2019-02-28T05:53:12Z
 Message: ACME server does not allow selected challenge type or no provider is configured for domain "ivan-test.example.com"
 Reason: InvalidConfig
 Status: False
 Type: Ready
Events:
 Type Reason Age From Message
 ---- ------ ---- ---- -------
 Normal CreateOrder 10m cert-manager Created new ACME order, attempting validation...
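
Describing certificates one at a time gets tedious when there are many of them. As a rough sketch, the Ready condition shown above can be pulled into a single listing with custom columns. The function name list_certificate_readiness is made up for this example, and it assumes every Certificate carries a condition of type Ready, as the one above does.

```shell
# Hypothetical helper: list each Certificate with the status and reason of
# its Ready condition, so failing certificates stand out in one listing.
list_certificate_readiness() {
  kubectl get certificates --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,REASON:.status.conditions[?(@.type=="Ready")].reason'
}
```

Running list_certificate_readiness before the upgrade gives a baseline to compare against once the certificates have been restored.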

Create a New Namespace for cert-manager

If you are using Kubernetes 1.12 or earlier, the new cert-manager webhook requires that validation be disabled on its namespace to avoid an error about the CABundle field of the ValidatingWebhookConfiguration resource. Rather than disable validation on a namespace that is shared with other things, cert-manager will be installed in its own namespace. If this step is not completed, cert-manager will not be able to provision certificates for its new webhook correctly, causing a chicken-and-egg situation. For more detail about the webhook requiring validation to be disabled, see this CABundle Kubernetes issue.

$ kubectl create namespace cert-manager
$ kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
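
If these two commands might be run more than once, for example from a script, an idempotent variant can be handy. The sketch below wraps the same checks that the Reckoner pre-install hooks later in this post use. The function name ensure_cert_manager_namespace is hypothetical.

```shell
# Hypothetical helper: create the namespace only if it is missing, then
# (re)apply the label that disables validation for it.
ensure_cert_manager_namespace() {
  ns=${1:-cert-manager}
  # "kubectl get" exits non-zero when the namespace does not exist yet.
  kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns"
  # --overwrite makes the label command safe to repeat.
  kubectl label namespace "$ns" certmanager.k8s.io/disable-validation=true --overwrite=true
}
```

Calling ensure_cert_manager_namespace with no arguments targets the cert-manager namespace used throughout this walk-through.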

Copy Secrets To The New Namespace

Any Kubernetes Secrets that cert-manager uses for its Issuers need to exist in the new cert-manager namespace. This does not relate to the Secrets that hold SSL certificates used by Ingresses!

Secrets that you may want to copy to the new namespace include:

  • Any Secrets referenced by DNS providers configured under the issuer.acme.dns01.providers field of ClusterIssuer objects. These Secrets provide access to a cloud service account which is used to register temporary DNS entries to validate Let’s Encrypt certificates. DNS may be used to validate certificates for private Ingresses that are not accessible by Let’s Encrypt servers.
  • Optionally the ACME account private key Secret referenced by issuer.acme.privateKeySecretRef in ClusterIssuer objects. For Let’s Encrypt ClusterIssuers, cert-manager will create a new private key and store it in a new Secret if one does not already exist, and so far this hasn’t impacted existing certificates with Let’s Encrypt. I can’t speak for the Secrets used by other Issuers - you should probably copy them to the new namespace.

For this example, I’ll copy the private key Secret referenced by the ClusterIssuer named letsencrypt-prod. First get the name of the Secret from the ClusterIssuer:

$ kubectl get clusterissuer letsencrypt-prod -o jsonpath='{.spec.acme.privateKeySecretRef.name}'

This returns the Secret name cert-manager-setup-production-private-key. My Secret includes “cert-manager-setup” in the name because we use a Helm chart to configure our ClusterIssuers.

With the help of the jq command, copy that Secret to the cert-manager namespace.

$ kubectl get secret -o json -n kube-system cert-manager-setup-production-private-key | jq 'del(.metadata.namespace)' | kubectl create -f - -n cert-manager

The cert-manager related Secrets in the kube-system namespace will be deleted later.

Remember to also copy any Secrets that ClusterIssuers may use for DNS validation!
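
When several Secrets need to move, the copy command above can be wrapped in a small loop. This is only a sketch: the function name copy_secrets is made up, and stripping the extra server-populated metadata fields (resourceVersion, uid, creationTimestamp, selfLink) is a precaution on my part rather than something the walk-through above required.

```shell
# Hypothetical helper: copy the named Secrets from one namespace to another.
# Usage: copy_secrets SOURCE_NAMESPACE TARGET_NAMESPACE SECRET_NAME...
copy_secrets() {
  src=$1
  dst=$2
  shift 2
  for name in "$@"; do
    # Drop the namespace and server-populated metadata so the object can be
    # created fresh in the target namespace.
    kubectl get secret "$name" -n "$src" -o json \
      | jq 'del(.metadata.namespace, .metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.selfLink)' \
      | kubectl create -n "$dst" -f - || return 1
  done
}
```

For example, copy_secrets kube-system cert-manager cert-manager-setup-production-private-key clouddns-service-account would copy the private key Secret from this post plus a DNS provider Secret, where clouddns-service-account is a placeholder name.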

Backup Related Objects

The Issuer, ClusterIssuer, and Certificate objects will go away when the custom resource definitions (CRDs) are removed as part of the upgrade using Helm. This backup will be restored after the new version of cert-manager and its accompanying CRDs are installed.

$ kubectl get -o yaml \
 --all-namespaces \
 issuer,clusterissuer,certificates > cert-manager-backup.yaml

Note that the Secrets that cert-manager populates, used by Kubernetes Ingresses, will not be touched and do not need to be backed up.
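
Since the next step deletes these objects, it may be worth confirming that the backup file holds what you expect first. The sketch below counts the backed-up objects by kind. It assumes the backup has the "kind: List" layout that kubectl get -o yaml writes, where each item’s kind is indented by exactly two spaces. The function name summarize_backup is hypothetical.

```shell
# Hypothetical helper: print "Kind=count" for each kind of object in a backup.
# Nested kinds (owner references, issuer references) are indented further
# than two spaces and the top-level "kind: List" is not indented, so neither
# is counted.
summarize_backup() {
  grep -E '^  kind: ' "$1" \
    | awk '{print $2}' \
    | sort \
    | uniq -c \
    | awk '{print $2 "=" $1}'
}
```

Running summarize_backup cert-manager-backup.yaml should report the same number of Certificates that kubectl get certificate --all-namespaces listed earlier.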

Remove the Old cert-manager

This step will also remove the CRDs and their objects. The Secrets that cert-manager populates, used by Kubernetes Ingresses, will not be touched. Ingresses will continue to respond to HTTPS.

$ helm delete --purge cert-manager

Listing certificates now returns this error, because the CRDs and their objects are gone:

$ kubectl get certificates
Error from server (NotFound): Unable to list "certmanager.k8s.io/v1alpha1, Resource=certificates": the server could not find the requested resource (get certificates.certmanager.k8s.io)

Install New Custom Resource Definitions

The custom resource definitions (CRDs) are no longer managed in the Helm chart, and should be installed before installing version 0.5 or later of the chart. The command below is for version 0.6.X of cert-manager.

$ kubectl apply \
 -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.6/deploy/manifests/00-crds.yaml
customresourcedefinition.apiextensions.k8s.io/certificates.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/issuers.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/orders.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/challenges.certmanager.k8s.io created

Re-installing the CRDs does not bring back the Certificate objects; they will be restored from backup in a later step.

$ kubectl get certificates --all-namespaces
No resources found.
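
Installing the chart before the CRDs are fully registered can cause trouble for the webhook later on, so it may help to wait for them explicitly. A minimal sketch, assuming a kubectl version that includes the wait subcommand. The function name wait_for_cert_manager_crds is hypothetical, and the CRD names are taken from the output above.

```shell
# Hypothetical helper: block until each cert-manager CRD reports the
# Established condition, or fail after 60 seconds per CRD.
wait_for_cert_manager_crds() {
  for crd in certificates issuers clusterissuers orders challenges; do
    kubectl wait --for condition=established --timeout=60s \
      "crd/${crd}.certmanager.k8s.io" || return 1
  done
}
```

Running wait_for_cert_manager_crds right after the kubectl apply above, and only then installing the chart, keeps the two steps from racing.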

Install cert-manager

Use Helm, or a helper tool like Reckoner, to install version 0.6 of cert-manager. This uses version 0.6.6 of the Helm chart which installs version 0.6.2 of cert-manager.

Install Using Helm

$ helm install stable/cert-manager -n cert-manager --namespace cert-manager --version 0.6.6
..... Helm output omitted .....

Install Using Reckoner

Here is a YAML snippet for Reckoner that installs cert-manager, including pre-install hooks that install CRDs and disable namespace validation (already done above) for new installations:

# Use this course.yml file with reckoner to deploy cert-manager,
# For example: reckoner plot course.yml --heading cert-manager
#
# Replace this context with your own.
context: your_actual_kubernetes_context # Get with kubectl config current-context
repositories:
  incubator:
    url: https://kubernetes-charts-incubator.storage.googleapis.com
  stable:
    url: https://kubernetes-charts.storage.googleapis.com
minimum_versions:
  helm: 2.12.3
  reckoner: 1.0.1
charts:
  cert-manager:
    version: "v0.6.6"
    namespace: cert-manager
    set-values:
      # You can place chart values here. . .
      resources:
        requests.cpu: 10m
        requests.memory: 50Mi
        limits.cpu: 15m
        limits.memory: 100Mi
    hooks:
      pre_install:
        - kubectl get namespace cert-manager >/dev/null 2>&1 || kubectl create namespace cert-manager
        # The next line is only required with Kubernetes 1.12 or earlier.
        - kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true --overwrite=true
        - kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.6/deploy/manifests/00-crds.yaml

Restore Object Backups

Restore the backups of Certificate, Issuer, and ClusterIssuer objects - these were deleted by Helm along with the previous version of the cert-manager release.

$ kubectl apply -f cert-manager-backup.yaml
clusterissuer.certmanager.k8s.io/letsencrypt-prod created
clusterissuer.certmanager.k8s.io/letsencrypt-staging created
certificate.certmanager.k8s.io/wonder-app-cert created
certificate.certmanager.k8s.io/ivan-test-cert created

Potential Errors Restoring Objects

If you receive errors like the ones below, the cert-manager webhook is not yet ready, either because it is still creating its own certificate, or because the CRDs were not installed above. If the Helm chart was installed before the CRDs were fully applied, the webhook may not have been able to create its certificate.
To resolve this, delete the cert-manager-webhook-xxxxxx pod so that it restarts and picks up its certificate. Restore the backups again, and there should be no errors.

For more background see the installation troubleshooting doc.

Error from server (InternalError): Internal error occurred: failed calling admission webhook "clusterissuers.admission.certmanager.k8s.io": the server is currently unable to handle the request

Error from server (InternalError): Internal error occurred: failed calling admission webhook "certificates.admission.certmanager.k8s.io": the server is currently unable to handle the request
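
When the webhook is merely slow to become ready, retrying the restore a few times may be enough. The sketch below is one way to do that. The function name restore_with_retry and its default values are made up for this example, and it will not help if the webhook never obtained its certificate, in which case the pod still needs to be deleted as described above.

```shell
# Hypothetical helper: retry "kubectl apply" until it succeeds.
# Usage: restore_with_retry FILE [MAX_ATTEMPTS] [DELAY_SECONDS]
restore_with_retry() {
  file=$1
  attempts=${2:-5}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if kubectl apply -f "$file"; then
      return 0
    fi
    # Only wait when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      echo "attempt $i of $attempts failed; retrying in ${delay}s" >&2
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, restore_with_retry cert-manager-backup.yaml 6 10 tries for roughly a minute before giving up with a non-zero exit status.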

Inspect cert-manager Objects and Verify Happiness

Certificate objects now have a field named “ready,” which is true for the wonder-app certificate, and false for the ivan-test certificate - reflecting their state before the cert-manager upgrade.

Looking at Issuers and Certificates, we see new certificates used by the cert-manager webhook, and our previous ClusterIssuer and Certificate objects:

$ kubectl get clusterissuer,issuer,certificates --all-namespaces
NAME AGE
clusterissuer.certmanager.k8s.io/letsencrypt-prod 3m
clusterissuer.certmanager.k8s.io/letsencrypt-staging 3m

NAMESPACE NAME AGE
cert-manager issuer.certmanager.k8s.io/cert-manager-webhook-ca 13m
cert-manager issuer.certmanager.k8s.io/cert-manager-webhook-selfsign 13m

NAMESPACE      NAME                                                              READY   SECRET                             AGE
cert-manager   certificate.certmanager.k8s.io/cert-manager-webhook-ca            True    cert-manager-webhook-ca            13m
cert-manager   certificate.certmanager.k8s.io/cert-manager-webhook-webhook-tls   True    cert-manager-webhook-webhook-tls   13m
default        certificate.certmanager.k8s.io/wonder-app-cert                    True    wonder-app-cert                    3m
test           certificate.certmanager.k8s.io/ivan-test-cert                     False   ivan-test-cert                     3m

View Certificate Expiration

Describing a certificate now shows the expiration toward the bottom of the details - here’s a snippet:

Status:
 Conditions:
 Last Transition Time: 2019-02-28T06:54:46Z
 Message: Certificate is up to date and has not expired
 Reason: Ready
 Status: True
 Type: Ready
 Not After: 2019-05-29T04:52:06Z
Events: <none>

Certificate Validation Failures

After the cert-manager upgrade, the cert-manager objects for the failing ivan-test-cert certificate do not provide enough information about the validation failure. I would have to look at the logs of the cert-manager pods to find out that I have not configured DNS validation, which is why validation was never attempted:

$ kubectl describe certificate -n test ivan-test-cert 
Name: ivan-test-cert
Namespace: test
Labels: <none>
Annotations: <none>
API Version: certmanager.k8s.io/v1alpha1
Kind: Certificate
Metadata:
 Creation Timestamp: 2019-02-28T07:20:43Z
 Generation: 1
 Owner References:
 API Version: extensions/v1beta1
 Block Owner Deletion: true
 Controller: true
 Kind: Ingress
 Name: test-ingress
 UID: 4ea85e58-3b29-11e9-9fb0-02096e167e86
 Resource Version: 26973
 Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/test/certificates/ivan-test-cert
 UID: 5c06841a-3b29-11e9-9fb0-02096e167e86
Spec:
 Acme:
 Config:
 Dns 01:
 Provider: clouddns
 Domains:
 ivan-test.example.com
 Dns Names:
 ivan-test.example.com
 Issuer Ref:
 Kind: ClusterIssuer
 Name: letsencrypt-prod
 Secret Name: ivan-test-cert
Status:
 Conditions:
 Last Transition Time: 2019-02-28T07:20:43Z
 Message: Certificate does not exist
 Reason: NotFound
 Status: False
 Type: Ready
Events:
 Type Reason Age From Message
 ---- ------ ---- ---- -------
 Normal OrderCreated 115s cert-manager Created Order resource "ivan-test-cert-4093320200"

I would expect the Order object to say something about why validation has not happened, but the only detail is that it’s pending:

$ kubectl describe order -n test ivan-test-cert-4093320200 
Name: ivan-test-cert-4093320200
Namespace: test
Labels: acme.cert-manager.io/certificate-name=ivan-test-cert
Annotations: <none>
API Version: certmanager.k8s.io/v1alpha1
Kind: Order
Metadata:
 Creation Timestamp: 2019-02-28T07:20:43Z
 Generation: 1
 Owner References:
 API Version: certmanager.k8s.io/v1alpha1
 Block Owner Deletion: true
 Controller: true
 Kind: Certificate
 Name: ivan-test-cert
 UID: 5c06841a-3b29-11e9-9fb0-02096e167e86
 Resource Version: 26975
 Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/test/orders/ivan-test-cert-4093320200
 UID: 5c084dcc-3b29-11e9-9fb0-02096e167e86
Spec:
 Config:
 Dns 01:
 Provider: clouddns
 Domains:
 ivan-test.example.com
 Csr: actual CSR omitted
 Dns Names:
 ivan-test.example.com
 Issuer Ref:
 Kind: ClusterIssuer
 Name: letsencrypt-prod
Status:
 Certificate: <nil>
 Finalize URL: https://acme-v02.api.letsencrypt.org/acme/finalize/xxxxxxxx/yyyyyyyyy
 Reason: 
 State: pending
 URL: https://acme-v02.api.letsencrypt.org/acme/order/xxxxxxxx/yyyyyyyyy
Events: <none>

There is no Challenge object for this certificate, since validation wasn’t attempted due to no DNS provider being configured.

$ kubectl get challenge -n test
No resources found.

In this case I don’t see the cause of the failure (DNS validation is not configured correctly) reflected in any of the above objects, as I did in the previous version of cert-manager. The cert-manager pod logs do reflect this though:

cert-manager-6874795dc8-b76kk cert-manager E0228 07:10:22.018889 1 controller.go:185] orders controller: Re-queuing item "test/ivan-test-cert-4093320200" due to error processing: Error constructing Challenge resource for Authorization: ACME server does not allow selected challenge type or no provider is configured for domain "ivan-test.example.com"
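
Since the cause only shows up in the logs, a small filter can make it easier to find. This sketch reads log lines on standard input and keeps the re-queue errors from the orders controller that mention a given certificate. The function name order_errors is hypothetical, and the match string is taken from the log line above, so it may need adjusting for other versions of cert-manager.

```shell
# Hypothetical helper: filter cert-manager logs (on stdin) down to orders
# controller re-queue errors that mention the given certificate name.
order_errors() {
  grep -F "orders controller: Re-queuing item" | grep -F "$1"
}
```

Assuming the controller runs as a Deployment named cert-manager in the cert-manager namespace, usage might look like kubectl logs -n cert-manager deployment/cert-manager | order_errors ivan-test-cert.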

Cleaning Up

Delete the original Secrets in the kube-system namespace that were copied to the cert-manager namespace earlier. For this example:

$ kubectl delete secret -n kube-system cert-manager-setup-production-private-key

Delete any other Secrets from the kube-system namespace that you may have copied to the new namespace, such as those used for DNS certificate validation.

Enjoy cert-manager

Enjoy your upgraded installation of cert-manager, in its own namespace, and future upgrades that are likely to be less hassle.

Thank you for reading! If you enjoyed this post, or have suggestions, feel free to reach out to @IvanFetch on Twitter.