API Priority and Fairness Alpha
Authors: Min Kim (Ant Financial), Mike Spreitzer (IBM), Daniel Smith (Google)
This blog describes “API Priority And Fairness”, a new alpha feature in Kubernetes 1.18. API Priority And Fairness permits cluster administrators to divide the concurrency of the control plane into different weighted priority levels. Every request arriving at a kube-apiserver will be categorized into one of the priority levels and get its fair share of the control plane’s throughput.
What problem does this solve?
Today the apiserver has a simple mechanism for protecting itself against CPU and memory overloads: max-in-flight limits for mutating and for readonly requests. Apart from the distinction between mutating and readonly, no other distinctions are made among requests; consequently, there can be undesirable scenarios where one subset of the requests crowds out other requests.
In short, it is far too easy for Kubernetes workloads to accidentally DoS the apiservers, causing other important traffic--like system controllers or leader elections---to fail intermittently. In the worst cases, a few broken nodes or controllers can push a busy cluster over the edge, turning a local problem into a control plane outage.
How do we solve the problem?
The new feature “API Priority and Fairness” is about generalizing the existing max-in-flight request handler in each apiserver, to make the behavior more intelligent and configurable. The overall approach is as follows.
- Each request is matched by a Flow Schema. The Flow Schema states the Priority Level for requests that match it, and assigns a “flow identifier” to these requests. Flow identifiers are how the system determines whether requests are from the same source or not.
- Priority Levels may be configured to behave in several ways. Each Priority Level gets its own isolated concurrency pool. Priority levels also introduce the concept of queuing requests that cannot be serviced immediately.
- To prevent any one user or namespace from monopolizing a Priority Level, they may be configured to have multiple queues.
- Finally, when there is capacity to service a request, a
Early results have been very promising! Take a look at this
You are required to prepare the following things in order to try out the feature: After successfully starting your kube-apiservers, you will see a few default FlowSchema and PriorityLevelConfiguration resources in the cluster. These default configurations are designed for a general protection and traffic management for your cluster.
You can examine and customize the default configuration by running the usual tools, e.g.: Upon arrival at the handler, a request is assigned to exactly one priority level and exactly one flow within that priority level. Hence understanding how FlowSchema and PriorityLevelConfiguration works will be helping you manage the request traffic going through your kube-apiservers. FlowSchema: FlowSchema will identify a PriorityLevelConfiguration object and the way to compute the request’s “flow identifier”. Currently we support matching requests according to: the identity making the request, the verb, and the target object. The identity can match in terms of: a username, a user group name, or a ServiceAccount. And as for the target objects, we can match by apiGroup, resource[/subresource], and namespace. PriorityLevelConfiguration: Defines a priority level. We’re already planning a few enhancements based on alpha and there will be more as users send feedback to our community. Here’s a list of them: Possibly treat LIST requests differently depending on an estimate of how big their result will be. As always! Reach us on slack
Many thanks to the contributors that have gotten this feature this far: Aaron Prindle, Daniel Smith, Jonathan Tomer, Mike Spreitzer, Min Kim, Bruce Ma, Yu Liao, Mengyi Zhou!How do I try this out?
--runtime-config="flowcontrol.apiserver.k8s.io/v1alpha1=true"
on the kube-apiservers--feature-gates=APIPriorityAndFairness=true
on the kube-apiservers
kubectl get flowschemas
kubectl get prioritylevelconfigurations
How does this work under the hood?
What’s missing? When will there be a beta?
How can I get involved?