Kubernetes version deprecation and cluster upgrades from the application side

Guide to Using Datree to Defend Against Kubernetes Upgrades

Application developers are focused on shipping code, and their teams don't have time to deal with vendor updates that break the CI/CD workflow. Kubernetes, with its active community and regular release cadence, can introduce exactly that kind of breaking change and cost you time.

Staying current with Kubernetes is important, especially for security and support, but staying current safely means proactively mitigating upgrade issues before they hit production. In this article, we'll look at an example of exactly that kind of situation, then see how Datree can save the day.

Example problem

The upgrade from Kubernetes 1.15 to 1.16 was especially problematic for development teams because several previously deprecated APIs were finally removed. Deployments that worked on 1.15 were broken in 1.16. Let's demonstrate this with a Kubernetes 1.15 cluster on AWS EKS. We'll run a simple demo application, using continuous deployment via GitHub Actions, and then see how a cluster upgrade to 1.16 breaks the deployment pipeline. Finally, we'll add Datree to the mix to show how it protects precisely against this problem and others.

Sample application

Our "Hello world" application is an nginx container serving some static HTML content, with CD via a GitHub Action. When we push to the main branch the following happens:

  • Check out the latest code
  • Build our docker image
  • Push it to our ECR, tagged with the SHA of the latest commit
  • Update our deployment manifest, supplying the updated docker image tag
  • Authenticate to our Kubernetes cluster
  • Apply the updated manifest
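
For reference, here's a minimal sketch of what such a workflow file might look like. This is not the project's actual workflow: the repository name (hello-world), cluster name (hello-world-cluster), region, and image-placeholder convention are all assumptions for illustration.

# .github/workflows/deploy.yml (hypothetical sketch of the steps above)
name: deploy

on:
  push:
    branches: [main]

env:
  AWS_REGION: us-east-1   # assumed region

jobs:
  main:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v2

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Log in to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and push the image, tagged with the commit SHA
        env:
          IMAGE: ${{ steps.login-ecr.outputs.registry }}/hello-world:${{ github.sha }}
        run: |
          docker build -t "$IMAGE" .
          docker push "$IMAGE"

      - name: Update image tag in the deployment manifest
        env:
          IMAGE: ${{ steps.login-ecr.outputs.registry }}/hello-world:${{ github.sha }}
        run: |
          # IMAGE_PLACEHOLDER is an assumed token in our manifest template
          sed "s|IMAGE_PLACEHOLDER|$IMAGE|" kubernetes-deploy.yamltpl > kubernetes-deploy.yaml

      - name: Authenticate to the cluster and apply the manifest
        run: |
          aws eks update-kubeconfig --name hello-world-cluster --region "$AWS_REGION"
          kubectl apply -f kubernetes-deploy.yaml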

Upgrading the cluster

Let's see what happens when we upgrade our cluster to Kubernetes version 1.16.

In our AWS console, we find the EKS cluster and click the "Update Now" link.

Image showing the 'Update now' link in AWS console

We ignore the helpful warning, and go ahead and upgrade the cluster to Kubernetes 1.16.

Image showing the warning about resources that need upgrading, and the "Update cluster version" confirmation
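
If you'd rather script the upgrade than click through the console, the AWS CLI can trigger the same update (the cluster name here is a placeholder):

$ aws eks update-cluster-version \
    --name hello-world-cluster \
    --kubernetes-version 1.16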

Re-deploy the application

After the upgrade to 1.16 has completed, we commit a change to the index.html file. This time, when the CD workflow runs, we see an error:

Image showing CD pipeline failure

Failure! In this case, the error is quite obvious:

no matches for kind "Deployment" in version "apps/v1beta1"

In our deployment manifest, kubernetes-deploy.yaml, we specified our deployment like this:

apiVersion: apps/v1beta1
kind: Deployment

The deprecation notice on the Kubernetes Blog reads:

Deployment in the extensions/v1beta1, apps/v1beta1, and apps/v1beta2 API versions is no longer served. Migrate to use the apps/v1 API version, available since v1.9.
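
One quick way to confirm what the upgraded cluster actually serves is kubectl api-versions; on a 1.16 cluster, the apps group should list only apps/v1:

$ kubectl api-versions | grep '^apps'
apps/v1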

So, we can fix our CD pipeline by changing our deployment specification to look like this:

apiVersion: apps/v1
kind: Deployment
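
One thing to watch for: apps/v1 also makes spec.selector mandatory for Deployments, so the migration can involve more than the apiVersion line. A minimal migrated manifest might look roughly like this (the name, labels, and image are illustrative, not taken from the original project):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:              # required in apps/v1; was optional in apps/v1beta1
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: hello-world
          image: nginx:1.19   # in our pipeline, the tag is substituted at deploy time
          ports:
            - containerPort: 80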

Of course, we're working with an extremely simple example, so a small issue like this is easy to detect and fix.

But in a real-world situation, things could be a lot more complicated.

For example, if our application were deployed using a complex Helm chart, tracking down and fixing the exact problem could be much harder. This is especially true for a team focused on development rather than DevOps and Kubernetes. Sure, there are workarounds like delaying cluster upgrades or constantly broadcasting a heads-up about forthcoming changes; but workarounds are shortsighted band-aids. What's needed is a robust, automated solution.

Automated protection

Datree prevents these kinds of problems from ever reaching production.

Running Datree locally

Before adding a Datree step to our GitHub Action, let's first see how it works locally. On a Linux/Mac machine, install the binary by following Datree's "Getting Started" documentation:

curl https://get.datree.io | /bin/bash
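
You can sanity-check the installation by printing the CLI version:

$ datree version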

Testing our Kubernetes manifest

Datree runs against a Kubernetes YAML manifest. We'll run it against our kubernetes-deploy.yamltpl file. (If you're working with a manifest template like this one, you may need to replace the image value with a valid image rather than the placeholder your pipeline populates at deploy time.)

Here's what the Datree test tells us:

$ datree test kubernetes-deploy.yamltpl
>>  File: kubernetes-deploy.yamltpl

❌  Ensure each container has a configured CPU limit  [1 occurrences]
💡  Missing property object `limits.cpu` - value should be within the accepted boundaries recommended by the organization

❌  Ensure each container has a configured memory limit  [1 occurrences]
💡  Missing property object `limits.memory` - value should be within the accepted boundaries recommended by the organization

❌  Ensure each container has a configured memory request  [1 occurrences]
💡  Missing property object `requests.memory` - value should be within the accepted boundaries recommended by the organization

❌  Ensure each container has a configured liveness probe  [1 occurrences]
💡  Missing property object `livenessProbe` - add a properly configured livenessProbe to catch possible deadlocks

❌  Ensure each container has a configured readiness probe  [1 occurrences]
💡  Missing property object `readinessProbe` - add a properly configured readinessProbe to notify kubelet your Pods are ready for traffic

❌  Ensure Deployment has more than one replica configured  [1 occurrences]
💡  Incorrect value for key `replicas` - don't rely on a single pod to do all of the work. Running 2 or more replicas will increase the availability of the service

❌  Prevent deprecated APIs in Kubernetes v1.16  [1 occurrences]
💡  Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.16)

❌  Ensure each container has a configured CPU request  [1 occurrences]
💡  Missing property object `requests.cpu` - value should be within the accepted boundaries recommended by the organization


+-----------------------------------+----------------------------------------------------------+
| Enabled rules in policy “default” | 21                                                       |
| Configs tested against policy     | 1                                                        |
| Total rules evaluated             | 21                                                       |
| Total rules failed                | 8                                                        |
| Total rules passed                | 13                                                       |
| See all rules in policy           | https://app.datree.io/login?cliId=UKaFJiBYBucAETAmGSA2pi |
+-----------------------------------+----------------------------------------------------------+

As you can see, Datree found several errors. If we were deploying a production-ready manifest, this output would be critical, cueing us to resolve these issues before moving forward.
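
For a production manifest, resolving those failures would mean adding resource boundaries and health probes to the container spec, along the lines of this sketch (the values are illustrative; tune them for your workload):

spec:
  containers:
    - name: hello-world
      image: nginx:1.19
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 250m
          memory: 256Mi
      livenessProbe:
        httpGet:
          path: /
          port: 80
      readinessProbe:
        httpGet:
          path: /
          port: 80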

For our simple "Hello, World!" example, we can disable certain policy rules that are irrelevant or unnecessary. Datree uses a centralized list of default policy rules which we can configure.

Customizing the Datree policy rules

In the output above, you'll see a URL next to "See all rules in policy." Datree provides centralized management of policy rules across all of your connected projects.

Visit your policy rules URL in a browser. After logging in with GitHub, you'll see a page with your Datree policy rules:

Datree policy rules image

We'll turn off the default rules we're violating only because we kept our "Hello, World!" deployment as simple as possible. Disable the following rules:

  • Ensure each container has a configured CPU request
  • Ensure each container has a configured CPU limit
  • Ensure each container has a configured memory limit
  • Ensure each container has a configured memory request
  • Ensure each container has a configured liveness probe
  • Ensure each container has a configured readiness probe
  • Ensure Deployment has more than one replica configured

When we re-run our Datree test, this is our result:

$ datree test kubernetes-deploy.yamltpl
>>  File: kubernetes-deploy.yamltpl

❌  Prevent deprecated APIs in Kubernetes v1.16  [1 occurrences]
💡  Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.16)


+-----------------------------------+----------------------------------------------------------+
| Enabled rules in policy “default” | 13                                                       |
| Configs tested against policy     | 1                                                        |
| Total rules evaluated             | 13                                                       |
| Total rules failed                | 1                                                        |
| Total rules passed                | 12                                                       |
| See all rules in policy           | https://app.datree.io/login?cliId=ryJEUYtx6m7GqS6JgUaoKm |
+-----------------------------------+----------------------------------------------------------+

Adding Datree to our CD pipeline

Finally, let's automate this process by adding Datree to our CD pipeline. We need to set a GitHub Actions Secret for Datree, containing the API token it uses to identify our account. You can get this value from the Datree UI by visiting:

https://app.datree.io/user

Take the API token and set it as the value of a DATREE_TOKEN GitHub Actions Secret in your repository.
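
If you prefer the GitHub CLI to the web UI, something like this should work from inside the repository (assuming gh is installed and authenticated):

$ gh secret set DATREE_TOKEN --body "<your-api-token>"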

Then, alter the GitHub Action to populate an environment variable with that value, and install and run Datree, like this:

env:
  ...
  DATREE_TOKEN: ${{ secrets.DATREE_TOKEN }}

jobs:
  main:
    runs-on: ubuntu-latest
    steps:
      ...
      - name: Update image tag
        run: ...
      - name: Install datree   # <-- add this step
        run: curl https://get.datree.io | /bin/bash
      - name: Run datree       # <-- add this step
        run: datree test kubernetes-deploy.yaml
      ...

You can see the full content of the GitHub Action here.

Now when we push this change to our GitHub repo, the action fails like this:

GitHub Action failing on datree test

Caught! The failure occurs before we apply the manifest to our Kubernetes cluster, so our original working deployment is not affected. Datree caught this problem before it started to cause issues.

As before, we can fix our kubernetes-deploy.yaml file by changing the apiVersion, and then push the change to our repository.

Once the manifest has been fixed, the test step will pass, and our workflow will continue and apply the manifest to our Kubernetes cluster.

This is a simple demonstration, so I've been pushing changes to my main branch to trigger the workflow. In a real project, main would be a protected branch, and the workflow would be triggered whenever a PR is raised, so the problem would be caught even earlier.
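
In that setup, the workflow's trigger section would look something like this:

on:
  pull_request:
    branches: [main]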

Conclusion

Kubernetes has transformed DevOps by automating deployment and scaling. But Kubernetes itself keeps evolving, and it can be difficult for development teams to keep up with its rapidly changing APIs. Falling behind can cause failures and lead to frustrating, costly service outages. Incorporating an automated policy-compliance tool like Datree into your CI/CD pipeline helps catch these errors before they impact production services.
