A quick guide to spot-readiness in Kubernetes

Applications are occasionally unable to serve traffic for a short time. For example, during startup, an application may need to load large data sets or configuration files, or it may depend on external services that are not yet reachable. In this situation you don't want to kill the application, but you also don't want to send it requests.

To detect and mitigate these situations, Kubernetes provides readiness probes. A pod whose containers report that they are not ready does not receive traffic through Kubernetes Services.

Kubernetes spot-readiness

Due to their lack of availability guarantees, using spot nodes in your Kubernetes cluster can be scary. 

The Kubecost Spot-Readiness Checklist evaluates your Kubernetes workloads in the public cloud to identify candidates for safe scheduling on spot instances, which can save you up to 90% on cloud resource costs.

Kubecost uses your workload configurations to run a series of checks on your AWS (EKS), Google Cloud (GKE), and Azure (AKS) clusters to evaluate readiness. It then calculates the cost reductions associated with switching to Spot.

What are spot instances, and why would you want to utilize them?

Spot instances are unused compute instances that public cloud providers such as AWS, Google, and Microsoft sell to customers at a steep discount, possibly up to 90% off.

On the other hand, the availability and pricing of spot nodes change with the supply and demand for computing resources at any given time, and they vary by instance size, family, and deployment location.

If demand for a specific instance type surges, spot instances may receive an interruption notification and spin down within a limited shutdown window (usually a few minutes).

As a result, spot resources are best suited for fault-tolerant and adaptable systems such as replicable microservices, Spark/Hadoop nodes, and so on.

Kubernetes supports dynamic workload replication and scalability in a way that allows many applications to be resilient to node loss, making it the ideal tool for embracing spot instances in many respects. 

If spot instances become unavailable, schedulers in AWS EKS will recognize pods or containers running on them and replace them with on-demand resources.

What is the current customer challenge?

Customers regularly tell us that they were unaware that public cloud Kubernetes services offer spot instances.

Spot instance support for GKE began in 2018; however, support for AKS (October 2020) and EKS (December 2020) is still in its early stages.

Once they learn about these capabilities, many clients struggle to determine which workloads are suited to running on spot instances.

It can be difficult, if not impossible, to identify spot-applicable workloads on your own, especially since cloud providers expressly state that spot instances come with "no availability guarantees" and are only suitable for fault-tolerant applications.

Spot compute resources that are not correctly implemented can result in costly downtime.

Kubecost's Spot-Readiness Checklist 

Kubecost's Spot-Readiness Checklist runs six checks on your Kubernetes workloads to determine their spot readiness: Replica Count, Controller Type, Local Storage, Pod Disruption Budget, Rolling Update Strategy, and Manual Annotation Overrides.

Each of these checks is described in detail below. Based on the outcomes of all checks, the Checklist labels each controller in the cluster as spot-ready, possibly spot-ready, or not spot-ready.

Replica Count: Workloads with a configured replica count of one are not considered spot-ready. If the single replica is removed from the cluster, the workload is down until it is rescheduled. Because workloads that can be replicated usually tolerate a variable number of replicas, a replica count greater than one indicates a level of spot-readiness.

Controller Type: Kubecost is set up to look into a specific set of controllers, which currently are Deployments and StatefulSets (new features are being added all the time!).

Deployments are classified as spot-ready because they are stateless and merely ensure that a given number of pods is running at any moment. StatefulSets, on the other hand, should not be regarded as spot-ready in general. Data loss can occur when StatefulSet pods are scheduled on spot nodes.

However, if a StatefulSet workload passes all other tests, it's worth considering whether deploying it as a StatefulSet is required.

Local Storage: Workloads are currently examined for the presence of an emptyDir volume. If one is present, the workload is assumed not to be spot-ready.

The presence of a writeable volume, in general, indicates a lack of spot readiness. Data integrity could be jeopardized if a pod is abruptly shut down while it is in the middle of a write. More comprehensive volume checks are being considered right now.
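The Replica Count, Controller Type, and Local Storage checks can be sketched against a controller manifest loaded as a Python dictionary. This is only an illustration of the rules as described above, not Kubecost's implementation; the function names and the sample manifest are hypothetical.

```python
def replica_count_check(manifest):
    """Pass when the controller is configured with more than one replica."""
    # Kubernetes defaults spec.replicas to 1 when the field is omitted.
    return manifest.get("spec", {}).get("replicas", 1) > 1


def controller_type_check(manifest):
    """Pass for Deployments; StatefulSets are not spot-ready in general."""
    return manifest.get("kind") == "Deployment"


def local_storage_check(manifest):
    """Pass when none of the pod's volumes is an emptyDir."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return not any("emptyDir" in volume for volume in pod_spec.get("volumes", []))


# Hypothetical Deployment with three replicas and an emptyDir cache volume.
deployment = {
    "kind": "Deployment",
    "spec": {
        "replicas": 3,
        "template": {"spec": {"volumes": [{"name": "cache", "emptyDir": {}}]}},
    },
}

print(replica_count_check(deployment))    # True
print(controller_type_check(deployment))  # True
print(local_storage_check(deployment))    # False
```

In this example the workload passes the first two checks but fails the Local Storage check because of its emptyDir volume.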

Pod Disruption Budget: A Pod Disruption Budget (PDB) can be configured for a controller, causing the scheduler to conform (where possible) to particular availability criteria for that controller. If a controller has a PDB, we detect it and identify the minimum number of available replicas. We then apply a reasonable threshold to the ratio of minimum available replicas to replica count to evaluate whether the PDB signals readiness. In the Rolling Update Strategy section, you can see a graphic representation of this ratio computation. We chose to interpret a ratio greater than 0.5 as indicating a lack of readiness, because it implies a relatively high availability requirement.

If you're using this check to determine whether your workloads are spot-ready, don't rule out a workload just because this check fails. Workloads should always be assessed individually, and a failing check may simply mean that the PDB was configured too strictly.

(Deployment only) Rolling Update Strategy: Deployments offer several update strategy options, and by default, they are configured with a Rolling Update Strategy (RUS) with a maximum unavailability of 25%. If a RUS is specified in a Deployment, we calculate min available in the same way as for PDBs (from max unavailable, rounded down to an integer, and the replica count), but we set the threshold at 0.9 instead of 0.5. With this threshold, default-configured Deployments with a replica count of more than three pass the check.
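The ratio computation for both checks can be worked through in a few lines of Python. This is a sketch based only on the description above (a ratio greater than the threshold fails the check); the function names are hypothetical and Kubecost's actual code may differ.

```python
def pdb_ratio(min_available, replicas):
    """Ratio of minimum available replicas to the replica count."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return min_available / replicas


def rolling_update_ratio(replicas, max_unavailable_percent=25):
    """Derive min available from max unavailable, rounded down to an integer."""
    max_unavailable = replicas * max_unavailable_percent // 100
    return pdb_ratio(replicas - max_unavailable, replicas)


def passes(ratio, threshold):
    """A ratio above the threshold indicates a lack of spot readiness."""
    return ratio <= threshold


# Default rolling update strategy (25% max unavailable), threshold 0.9.
for replicas in (3, 4, 8):
    ratio = rolling_update_ratio(replicas)
    print(replicas, ratio, passes(ratio, 0.9))
# 3 1.0 False
# 4 0.75 True
# 8 0.75 True

# Pod Disruption Budget requiring 3 of 4 replicas, threshold 0.5.
print(passes(pdb_ratio(3, 4), 0.5))  # False
```

With three replicas, 25% rounds down to zero unavailable pods, so the ratio is 1.0 and the check fails; from four replicas upward the ratio drops to 0.75 or lower and the check passes.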

Manual Annotation Overrides: We also support manually overriding a controller's spot readiness by adding the annotation spot.kubecost.com/spot-ready=true to the controller or to the namespace it runs in.
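As a rough illustration, an override lookup could inspect the annotations of the controller and of its namespace. Only the annotation key comes from the text above; the function name and the dictionary inputs are assumptions made for this sketch.

```python
SPOT_READY_ANNOTATION = "spot.kubecost.com/spot-ready"


def has_spot_ready_override(controller_annotations, namespace_annotations):
    """Return True when either the controller or its namespace is annotated."""
    return any(
        annotations.get(SPOT_READY_ANNOTATION) == "true"
        for annotations in (controller_annotations, namespace_annotations)
    )


# Annotation set on the namespace only.
print(has_spot_ready_override({}, {SPOT_READY_ANNOTATION: "true"}))  # True
# No annotation anywhere.
print(has_spot_ready_override({}, {}))  # False
```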

The Checklist for AWS and GCP clusters will also estimate the cost savings of transferring the workload to spot nodes based on your cloud provider's expected spot node pricing.

Freely implement spot nodes in your cluster with Kubecost

Kubecost is a free and open-source application that can help you save money on your Kubernetes workloads by taking advantage of spot node savings. The data Kubecost collects never leaves your cluster, so clients can adopt our technology immediately without the security and data governance problems of remote data sharing.

The designation of a workload as spot-ready by Kubecost is not a guarantee. Before permitting a workload to run on spot nodes, a domain expert should analyze it thoroughly. To schedule only spot-ready workloads on spot nodes, we recommend using taints and tolerations. We're working on functionality that will allow you to turn these recommendations and estimates into more actionable steps, such as expanding your cluster node groups/pools with Spot and shifting compatible workloads to the new nodes.

Configure Probes

You can utilize a variety of fields on probes to modify the behavior of liveness and readiness checks more precisely:

initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. The default value is 0. The minimum value is 0.

periodSeconds: How often (in seconds) to perform the probe. The default value is 10. The minimum value is 1.

timeoutSeconds: Number of seconds after which the probe times out. The default value is 1. The minimum value is 1.

successThreshold: Minimum number of consecutive successes for the probe to be considered successful after having failed. The default value is 1. For liveness and startup probes, the value must be 1. The minimum value is 1.

failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up on a liveness probe means the container is restarted. Giving up on a readiness probe means the Pod is marked Unready. The default value is 3. The minimum value is 1.
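To make the defaults and minimums concrete, here is a small Python sketch that fills in omitted probe fields and rejects out-of-range values. It mirrors the rules listed above rather than the kubelet's actual validation code; `PROBE_FIELDS` and `resolve_probe` are hypothetical names.

```python
# field name -> (default value, minimum value), as listed above
PROBE_FIELDS = {
    "initialDelaySeconds": (0, 0),
    "periodSeconds": (10, 1),
    "timeoutSeconds": (1, 1),
    "successThreshold": (1, 1),
    "failureThreshold": (3, 1),
}


def resolve_probe(probe, kind="readiness"):
    """Apply defaults to a probe definition and check the minimum values."""
    resolved = {}
    for field, (default, minimum) in PROBE_FIELDS.items():
        value = probe.get(field, default)
        if value < minimum:
            raise ValueError(f"{field} must be at least {minimum}, got {value}")
        resolved[field] = value
    if kind in ("liveness", "startup") and resolved["successThreshold"] != 1:
        raise ValueError("successThreshold must be 1 for liveness and startup probes")
    return resolved


readiness = resolve_probe({"initialDelaySeconds": 5, "periodSeconds": 2})
print(readiness)
# {'initialDelaySeconds': 5, 'periodSeconds': 2, 'timeoutSeconds': 1,
#  'successThreshold': 1, 'failureThreshold': 3}
```

Note that timeoutSeconds resolves to 1 when it is omitted, which is the default that the following note about exec probes refers to.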

Note: Before Kubernetes 1.20, the timeoutSeconds field was ignored for exec probes: probes kept running indefinitely, even past their configured deadline, until a result was returned.

In Kubernetes v1.20, this flaw was fixed. However, because the default timeout is 1 second, you may have relied on prior behavior without realizing it. You can restore the behavior of previous versions by disabling the feature gate ExecProbeTimeout (setting it to false) on each kubelet as a cluster administrator, then removing the override after all the exec probes in the cluster have a timeoutSeconds value set.

If your pods are affected by the default 1 second timeout, you should increase their probe timeout to prepare for the feature gate's future removal.

With the fix in place, on Kubernetes 1.20+ with the dockershim container runtime, the process inside the container may keep running even after an exec probe has returned failure because of the timeout.

Caution: If readiness probes are implemented incorrectly, they may result in an ever-growing number of processes in the container, and in resource starvation if this is left unchecked.

kubectl bulk operations

Resource creation isn't the only operation that kubectl can perform in bulk. It can also extract resource names from configuration files in order to perform other operations, such as deleting the same resources that you created:

kubectl delete -f https://k8s.io/examples/application/nginx-app.yaml


deployment.apps "my-nginx" deleted
service "my-nginx-svc" deleted


If there are only two resources, you can use the resource/name syntax to specify both on the command line:

kubectl delete deployments/my-nginx services/my-nginx-svc


deployment.apps "my-nginx" deleted
service "my-nginx-svc" deleted



For larger numbers of resources, you'll find it easier to filter resources by their labels using the selector (label query) supplied with -l or --selector:

kubectl delete deployment,services -l app=nginx


deployment.apps "my-nginx" deleted
service "my-nginx-svc" deleted


You may use $() or xargs to chain actions because kubectl outputs resource names in the same format it accepts:

kubectl get $(kubectl create -f docs/concepts/cluster-administration/nginx/ -o name | grep service)

kubectl create -f docs/concepts/cluster-administration/nginx/ -o name | grep service | xargs -i kubectl get {}


NAME           TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
my-nginx-svc   LoadBalancer   10.0.0.208   <pending>     80/TCP    0s


With the above commands, we first create the resources found in the given directory and print them using the -o name output format (which prints each resource as resource/name). Then we grep for "service" only and print the matching resource with kubectl get.

If your resources are organised across numerous subdirectories within a single directory, you can use the --recursive or -R flag in conjunction with the --filename,-f flag to perform the operation on the subdirectories recursively as well.

Consider the directory project/k8s/development, which contains all of the manifests required for the development environment, arranged by resource type:


project/k8s/development
├── configmap
│   └── my-configmap.yaml
├── deployment
│   └── my-deployment.yaml
└── pvc
    └── my-pvc.yaml


By default, a bulk operation on project/k8s/development stops at the first level of the directory and does not process any subdirectories. If we tried to create the resources in this directory using the following command, we would get an error:

kubectl apply -f project/k8s/development


error: you must provide one or more resources by argument or filename (.json|.yaml|.yml|stdin)

Instead, specify the --recursive or -R flag with the --filename,-f flag as such:

kubectl apply -f project/k8s/development --recursive


configmap/my-config created
deployment.apps/my-deployment created
persistentvolumeclaim/my-pvc created


Any action that accepts the --filename,-f flag, such as kubectl create, get, delete, describe, rollout, and so on, will work with the --recursive flag.
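The difference between the default and the recursive behaviour can be illustrated with a short Python sketch that searches a directory for manifest files. This is not how kubectl is implemented; `find_manifests` is a hypothetical helper, and the extensions are taken from the error message shown above.

```python
import tempfile
from pathlib import Path

MANIFEST_SUFFIXES = {".json", ".yaml", ".yml"}


def find_manifests(directory, recursive=False):
    """List manifest files at the top level only, or in all subdirectories."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        path
        for path in Path(directory).glob(pattern)
        if path.is_file() and path.suffix in MANIFEST_SUFFIXES
    )


# Recreate the example layout in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    base = Path(root) / "project" / "k8s" / "development"
    for relative in (
        "configmap/my-configmap.yaml",
        "deployment/my-deployment.yaml",
        "pvc/my-pvc.yaml",
    ):
        manifest = base / relative
        manifest.parent.mkdir(parents=True)
        manifest.touch()

    flat_count = len(find_manifests(base))
    recursive_count = len(find_manifests(base, recursive=True))

print(flat_count)       # 0
print(recursive_count)  # 3
```

The top-level search finds nothing because every manifest sits in a subdirectory, which corresponds to the error above; the recursive search finds all three files.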

When several -f arguments are provided, the --recursive flag also works:

kubectl apply -f project/k8s/namespaces -f project/k8s/development --recursive


namespace/development created
namespace/staging created
configmap/my-config created
deployment.apps/my-deployment created
persistentvolumeclaim/my-pvc created


Conclusion

Liveness and readiness probes for Kubernetes may significantly improve the robustness and resilience of your service while also providing a better end-user experience. However, if you don't think about how these probes are used, especially if you don't think about unusual system dynamics, you risk making the service's availability worse rather than better.

