Challenges of a Remotely Managed, On-Premises, Bare-Metal Kubernetes Cluster

Tuesday, August 02, 2016

Today's post is written by Bich Le, chief architect at Platform9, describing how their engineering team overcame challenges in remotely managing bare-metal Kubernetes clusters.

Introduction

The recently announced

Multi-OS bootstrap model

Like its predecessor,

Node management

The first challenge was configuring Kubernetes nodes in the absence of a bare-metal cloud API and SSH access into nodes. We solved it using the node pool concept and configuration management techniques. Every node running the agent automatically shows up in the SaaS portal, which allows the user to authorize the node for use with Kubernetes. A newly authorized node automatically enters a node pool, indicating that it is available but not used in any clusters. Independently, the administrator can create one or more Kubernetes clusters, which start out empty. At any later time, he or she can request one or more nodes to be attached to any cluster. PMK fulfills the request by transferring the specified number of nodes from the pool to the cluster. When a node is authorized, its agent becomes a configuration management agent, polling for instructions from a CM server running in the SaaS application and capable of downloading and configuring software.

Cluster creation and node attach/detach operations are exposed to administrators via a REST API, a CLI utility named qb, and the SaaS-based Web UI. The following screenshot shows the Web UI displaying one 3-node cluster named clus100, one empty cluster clus101, and the three nodes.

Cluster initialization

The first time one or more nodes are attached to a cluster, PMK configures the nodes to form a complete Kubernetes cluster. Currently, PMK automatically decides the number and placement of Master and Worker nodes. In the future, PMK will give administrators an “advanced mode” option allowing them to override and customize those decisions. Through the CM server, PMK then sends to each node a configuration and a set of scripts to initialize each node according to the configuration. This includes installing or upgrading Docker to the required version; starting 2 docker daemons (bootstrap and main), creating the etcd K/V store, establishing the flannel network layer, starting kubelet, and running the Kubernetes appropriate for the node’s role (master vs. worker). The following diagram shows the component layout of a fully formed cluster.

Containerized kubelet?

Another hurdle we encountered resulted from our original decision to run kubelet as recommended by the Multi-node Docker Deployment Guide. We discovered that this approach introduces complexities that led to many difficult-to-troubleshoot bugs that were sensitive to the combined versions of Kubernetes, Docker, and the node OS. Example: kubelet’s need to mount directories containing secrets into containers to support the Service Accounts mechanism. It turns out that

Overcoming networking hurdles

The Beta release of PMK currently uses

Monitoring

One of the values of a SaaS-managed private cloud is constant monitoring and early detection of problems by the SaaS team. Issues that can be addressed without intervention by the customer are handled automatically, while others trigger proactive communication with the customer via UI alerts, email, or real-time channels. Kubernetes monitoring is a huge topic worthy of its own blog post, so we’ll just briefly touch upon it. We broadly classify the problem into layers: (1) hardware & OS, (2) Kubernetes core (e.g. API server, controllers and kubelets), (3) add-ons (e.g.

Future topics

We faced complex challenges in other areas that deserve their own posts: (1) Authentication and authorization with

Conclusion

.

--Bich Le, Chief Architect, Platform9

←Previous
Next→