Cluster API v1alpha3 Delivers New Features and an Improved User Experience
Author: Daniel Lipovetsky (D2IQ)
The Cluster API is a Kubernetes project to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management. It provides optional, additive functionality on top of core Kubernetes to manage the lifecycle of a Kubernetes cluster.
Following the v1alpha2 release in October 2019, many members of the Cluster API community met in San Francisco, California, to plan the next release. The project had just gone through a major transformation, delivering a new architecture that promised to make the project easier for users to adopt, and faster for the community to build. Over the course of those two days, we found our common goals: To implement the features critical to managing production clusters, to make its user experience more intuitive, and to make it a joy to develop.
The v1alpha3 release of Cluster API brings significant features for anyone running Kubernetes in production and at scale. Among the highlights:
- Declarative Control Plane Management
- Support for Distributing Control Plane Nodes Across Failure Domains To Reduce Risk
- Automated Replacement of Unhealthy Nodes
- Support for Infrastructure-Managed Node Groups
For anyone who wants to understand the API, or prizes a simple, but powerful, command-line interface, the new release brings:
- Redesigned clusterctl, a command-line tool (and go library) for installing and managing the lifecycle of Cluster API
- Extensive and up-to-date documentation in The Cluster API Book
Finally, for anyone extending the Cluster API for their custom infrastructure or software needs:
- New End-to-End (e2e) Test Framework
- Documentation for integrating Cluster API into your cluster lifecycle stack
All this was possible thanks to the hard work of many contributors.
Declarative Control Plane Management
The Kubeadm-based Control Plane (KCP) provides a declarative API to deploy and scale the Kubernetes control plane, including etcd. This is the feature many Cluster API users have been waiting for! Until now, to deploy and scale up the control plane, users had to create specially-crafted Machine resources. To scale down the control plane, they had to manually remove members from the etcd cluster. KCP automates deployment, scaling, and upgrades.
What is the Kubernetes Control Plane? The Kubernetes control plane is, at its core, kube-apiserver and etcd. If either of these are unavailable, no API requests can be handled. This impacts not only core Kubernetes APIs, but APIs implemented with CRDs. Other components, like kube-scheduler and kube-controller-manager, are also important, but do not have the same impact on availability.
The control plane was important in the beginning because it scheduled workloads. However, some workloads could continue to run during a control plane outage. Today, workloads depend on operators, service meshes, and API gateways, which all use the control plane as a platform. Therefore, the control plane's availability is more important than ever.
Managing the control plane is one of the most complex parts of cluster operation. Because the typical control plane includes etcd, it is stateful, and operations must be done in the correct sequence. Control plane replicas can and do fail, and maintaining control plane availability means being able to replace failed nodes.
The control plane can suffer a complete outage (e.g. permanent loss of quorum in etcd), and recovery (along with regular backups) is sometimes the only feasible option.
Here's an example of a 3-replica control plane for the Cluster API Docker Infrastructure, which the project maintains for testing and development. For brevity, other required resources, like Cluster, and Infrastructure Template, referenced by its name and namespace, are not shown.
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
name: example
spec:
infrastructureTemplate:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: DockerMachineTemplate
name: example
namespace: default
kubeadmConfigSpec:
clusterConfiguration:
replicas: 3
version: 1.16.3
Deploy this control plane with kubectl:
kubectl apply -f example-docker-control-plane.yaml
Scale the control plane the same way you scale other Kubernetes resources:
kubectl scale kubeadmcontrolplane example --replicas=5
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/example scaled
Upgrade the control plane to a newer patch of the Kubernetes release:
kubectl patch kubeadmcontrolplane example --type=json -p '[{"op": "replace", "path": "/spec/version", "value": "1.16.4"}]'
Number of Control Plane Replicas By default, KCP is configured to manage etcd, and requires an odd number of replicas. If KCP is configured to not manage etcd, an odd number is recommended, but not required. An odd number of replicas ensures optimal etcd configuration. To learn why your etcd cluster should have an odd number of members, see the .
Because it is a core Cluster API component, KCP can be used with any v1alpha3-compatible Infrastructure Provider that provides a fixed control plane endpoint, i.e., a load balancer or virtual IP. This endpoint enables requests to reach multiple control plane replicas.
What is an Infrastructure Provider? A source of computational resources (e.g. machines, networking, etc.). The community maintains providers for AWS, Azure, Google Cloud, and VMWare. For details, see the
Distributing Control Plane Nodes To Reduce Risk
Cluster API users can now deploy nodes in different failure domains, reducing the risk of a cluster failing due to a domain outage. This is especially important for the control plane: If nodes in one domain fail, the cluster can continue to operate as long as the control plane is available to nodes in other domains.
What is a Failure Domain? A failure domain is a way to group the resources that would be made unavailable by some failure. For example, in many public clouds, an "availability zone" is the default failure domain. A zone corresponds to a data center. So, if a specific data center is brought down by a power outage or natural disaster, all resources in that zone become unavailable. If you run Kubernetes on your own hardware, your failure domain might be a rack, a network switch, or power distribution unit.
The Kubeadm-based ControlPlane distributes nodes across failure domains. To minimize the chance of losing multiple nodes in the event of a domain outage, it tries to distribute them evenly: it deploys a new node in the failure domain with the fewest existing nodes, and it removes an existing node in the failure domain with the most existing nodes.
MachineDeployments and MachineSets do not distribute nodes across failure domains. To deploy your worker nodes across multiple failure domains, create a MachineDeployment or MachineSet for each failure domain.
The Failure Domain API works on any infrastructure. That's because every Infrastructure Provider maps failure domains in its own way. The API is optional, so if your infrastructure is not complex enough to need failure domains, you do not need to support it. This example is for the Cluster API Docker Infrastructure Provider. Note that two of the domains are marked as suitable for control plane nodes, while a third is not. The Kubeadm-based ControlPlane will only deploy nodes to domains marked suitable.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: DockerCluster
metadata:
name: example
spec:
controlPlaneEndpoint:
host: 172.17.0.4
port: 6443
failureDomains:
domain-one:
controlPlane: true
domain-two:
controlPlane: true
domain-three:
controlPlane: false
The
There are many reasons why a node might be unhealthy. The kubelet process may stop. The container runtime might have a bug. The kernel might have a memory leak. The disk may run out of space. CPU, disk, or memory hardware may fail. A power outage may happen. Failures like these are especially common in larger clusters. Kubernetes is designed to tolerate them, and to help your applications tolerate them as well. Nevertheless, only a finite number of nodes can be unhealthy before the cluster runs out of resources, and Pods are evicted or not scheduled in the first place. Unhealthy nodes should be repaired or replaced at the earliest opportunity. The Cluster API now includes a MachineHealthCheck resource, and a controller that monitors node health. When it detects an unhealthy node, it removes it. (Another Cluster API controller detects the node has been removed and replaces it.) You can configure the controller to suit your needs. You can configure how long to wait before removing the node. You can also set a threshold for the number of unhealthy nodes. When the threshold is reached, no more nodes are removed. The wait can be used to tolerate short-lived outages, and the threshold to prevent too many nodes from being replaced at the same time. The controller will remove only nodes managed by a Cluster API MachineSet. The controller does not remove control plane nodes, whether managed by the Kubeadm-based Control Plane, or by the user, as in v1alpha2. For more, see . Here is an example of a MachineHealthCheck. For more details, see
If you run large clusters, you need to create and destroy hundreds of nodes, sometimes in minutes. Although public clouds make it possible to work with large numbers of nodes, having to make a separate API request to create or delete every node may scale poorly. For example, API requests may have to be delayed to stay within rate limits. Some public clouds offer APIs to manage groups of nodes as one single entity. For example, AWS has AutoScaling Groups, Azure has Virtual Machine Scale Sets, and GCP has Managed Instance Groups. With this release of Cluster API, Infrastructure Providers can add support for these APIs, and users can deploy groups of Cluster API Machines by using the MachinePool Resource. For more information, see the
Experimental Feature
The MachinePool API is an experimental feature that is not enabled by default. Users are encouraged to try it and report on how well it meets their needs. If you are new to Cluster API, your first experience will probably be with the project's command-line tool, clusterctl. And with the new Cluster API release, it has been re-designed to be more pleasing to use than before. The tool is all you need to deploy your first workload cluster in just a few steps. First, use Clusterctl also helps with the "day 2" operations. Use One more thing! Clusterctl is not only a command-line tool. It is also a Go library! Think of the library as an integration point for projects that build on top of Cluster API. All of clusterctl's command-line functionality is available in the library, making it easy to integrate into your stack. To get started with the library, please read its documentation. Thanks to many contributors! The project's documentation is extensive. New users should get some background on the
Above and beyond the content itself, the project's documentation site is a pleasure to use. It is searchable, has an outline, and even supports different color themes. If you think the site a lot like the documentation for a different community project,
The Cluster API project is designed to be extensible. For example, anyone can develop their own Infrastructure and Bootstrap Providers. However, it's important that Providers work in a uniform way. And, because the project is still evolving, it takes work to make sure that Providers are up-to-date with new releases of the core. The End-to-End Test Framework provides a set of standard tests for developers to verify that their Providers integrate correctly with the current release of Cluster API, and help identify any regressions that happen after a new release of the Cluster API, or the Provider. For more details on the Framework, see Testing in the Cluster API Book, and the
Thanks to many contributors! The community maintains guide explains the entire process, from creating a git repository, to creating CustomResourceDefinitions for your Providers, to designing, implementing, and testing the controllers. Under Active Development
The Provider Implementer's Guide is actively under development, and may not yet reflect all of the changes in the v1alpha3 release. The Cluster API project is a very active project, and covers many areas of interest. If you are an infrastructure expert, you can contribute to one of the Infrastructure Providers. If you like building controllers, you will find opportunities to innovate. If you're curious about testing distributed systems, you can help develop the project's end-to-end test framework. Whatever your interests and background, you can make a real impact on the project. Come introduce yourself to the community at our weekly meeting, where we dedicate a block of time for a Q&A session. You can also find maintainers and users on the Kubernetes Slack, and in the Kubernetes forum. Please check out the links below. We look forward to seeing you!Automated Replacement of Unhealthy Nodes
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineHealthCheck
metadata:
name: example-node-unhealthy-5m
spec:
clusterName: example
maxUnhealthy: 33%
nodeStartupTimeout: 10m
selector:
matchLabels:
nodepool: nodepool-0
unhealthyConditions:
- type: Ready
status: Unknown
timeout: 300s
- type: Ready
status: "False"
timeout: 300s
Infrastructure-Managed Node Groups
The Cluster API User Experience, Reimagined
clusterctl
clusterctl init
to
clusterctl move
to
The Cluster API Book
Integrate & Customize
End-to-End Test Framework
Provider Implementer's Guide
Join Us!