Defining Network Policy Conformance for Container Network Interface (CNI) providers
Authors: Matt Fenwick (Synopsys), Jay Vyas (VMware), Ricardo Katz, Amim Knabben (Loadsmart), Douglas Schilling Landgraf (Red Hat), Christopher Tomkins (Tigera)
Special thanks to Tim Hockin and Bowie Du (Google), Dan Winship and Antonio Ojea (Red Hat), Casey Davenport and Shaun Crampton (Tigera), and Abhishek Raut and Antonin Bas (VMware) for being supportive of this work, and working with us to resolve issues in different Container Network Interfaces (CNIs) over time.
A brief conversation around "node local" Network Policies in April of 2020 inspired the creation of a NetworkPolicy subproject from SIG Network. It became clear that as a community, we need a rock-solid story around how to do pod network security on Kubernetes, and this story needed a community around it, so as to grow the cultural adoption of enterprise security patterns in K8s.
In this post we'll discuss:
- Why we created a subproject for NetworkPolicies
- How we changed the Kubernetes e2e framework to visualize the NetworkPolicy implementation of your CNI provider
- The initial results of our comprehensive NetworkPolicy conformance validator, Cyclonus, built around these principles
- Improvements subproject contributors have made to the NetworkPolicy user experience
Why we created a subproject for NetworkPolicies
In April of 2020 it was becoming clear that many CNIs were emerging, and many vendors implement these CNIs in subtly different ways. Users were beginning to express a little bit of confusion around how to implement policies for different scenarios, and asking for new features. It was clear that we needed to begin unifying the way we think about Network Policies in Kubernetes, to avoid API fragmentation and unnecessary complexity.
For example:
- In order to be flexible to the user’s environment, Calico as a CNI provider can be run using IPIP or VXLAN mode, or without encapsulation overhead. CNIs such as Antrea and Cilium offer similar configuration options as well.
- Some CNI plugins implement NetworkPolicies using iptables (among other options), whereas other CNIs use a completely different technology stack (for example, the Antrea project uses Open vSwitch rules).
- Some CNI plugins only implement a subset of the Kubernetes NetworkPolicy API, and some a superset. For example, certain plugins don't support the ability to target a named port (see the sketch after this list); others don't work with certain IP address types, and there are diverging semantics for similar policy types.
- Some CNI plugins combine with OTHER CNI plugins in order to implement NetworkPolicies (Canal), some CNIs might mix implementations (Multus), and some clouds do routing separately from the NetworkPolicy implementation.
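To make the named-port point concrete, here is a minimal sketch of a policy that selects a port by name rather than by number; the policy name and port name are made up for illustration, and a CNI that doesn't support named ports can't enforce a rule like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-named-port   # illustrative name
spec:
  podSelector: {}                  # applies to every pod in the policy's namespace
  policyTypes:
  - Ingress
  ingress:
  - ports:
    - protocol: TCP
      port: http                   # a container port referenced by name, resolved per-pod by the CNI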
Although this complexity is to some extent necessary to support different environments, end-users find that they need to follow a multistep process to implement Network Policies to secure their applications:
- Confirm that their network plugin supports NetworkPolicies (some don't, such as Flannel); a quick smoke test for this is sketched after this list
- Confirm that their cluster's network plugin supports the specific NetworkPolicy features that they are interested in (again, the named port or port range examples come to mind here)
- Confirm that their application's Network Policy definitions are doing the right thing
- Find out the nuances of a vendor's policy implementation, and check whether or not a CNI-neutral implementation is available (which is sometimes adequate for users)
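One quick smoke test for the first two items is to apply a default deny-all ingress policy in a scratch namespace and then verify that traffic to pods in that namespace is actually blocked; if it isn't, the plugin isn't enforcing policies at all. A minimal sketch (the namespace name is illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: netpol-smoke-test     # illustrative scratch namespace
spec:
  podSelector: {}                  # selects every pod in the namespace
  policyTypes:
  - Ingress                        # no ingress rules are listed, so all ingress is denied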
The NetworkPolicy project in upstream Kubernetes aims at providing a community where people can learn about, and contribute to, the Kubernetes NetworkPolicy API and the surrounding ecosystem.
The First step: A validation framework for NetworkPolicies that was intuitive to use and understand
The Kubernetes end to end suite has always had NetworkPolicy tests, but these weren't run in CI, and the way they were implemented didn't provide holistic, easily consumable information about how a policy was working in a cluster. This is because the original tests didn't provide any kind of visual summary of connectivity across a cluster. We thus initially set out to make it easy to confirm CNI support for NetworkPolicies by making the end to end tests (which are often used by administrators or users to diagnose cluster conformance) easy to interpret.
To solve the problem of confirming that CNIs support the basic features most users care about
for a policy, we built a new NetworkPolicy validation tool into the Kubernetes e2e
framework which allows for visual inspection of policies and their effect on a standard set of pods in a cluster.
For example, take the test output reproduced later in this post, from a run in which we found a bug in OVN Kubernetes.

The network policy for the test in question is the allow-ingress-port-80 policy, also reproduced later in this post, and the expected connectivity results follow it.

The test setup is 9 pods (3 namespaces: x, y, and z, with 3 pods in each namespace: a, b, and c); each pod runs a server on the same port and protocol that can be reached through HTTP calls in the absence of network policies. Connectivity is verified by using the agnhost network utility to issue requests on the port and protocol that other pods are expected to be serving.
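For a concrete picture of that setup, here is a rough, hypothetical sketch of the kind of server pod the framework creates; the real pods are generated programmatically by the e2e framework, and the image tag and command below are assumptions rather than excerpts from the test code:

apiVersion: v1
kind: Pod
metadata:
  name: a                     # one pod each named a, b, c ...
  namespace: x                # ... in each of the namespaces x, y, z
  labels:
    pod: a
spec:
  containers:
  - name: server
    # agnhost is the general-purpose network test image used by the e2e framework;
    # the tag and command here are illustrative assumptions
    image: registry.k8s.io/e2e-test-images/agnhost:2.39
    command: ["/agnhost", "serve-hostname", "--http", "--port", "80"]
    ports:
    - name: serve-80-tcp      # the named port referenced by the allow-ingress-port-80 policy
      containerPort: 80
      protocol: TCP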
These results give a visual indication of connectivity in a simple matrix. Going down the leftmost column is the "source" pod, or the pod issuing the request; going across the topmost row is the "destination" pod, or the pod receiving the request.

The observed connectivity results in the case of the OVN Kubernetes bug are also reproduced later in this post. Notice how the top three rows indicate that
all requests from namespace x regardless of pod and destination were blocked. Since these
experimental results do not match the expected results, a failure will be reported. Note
how the specific pattern of failure provides clear insight into the nature of the problem --
since all requests from a specific namespace fail, we have a clear clue to start our
investigation.

This was one of our earliest wins in the Network Policy group, as we were able to identify and work with the OVN Kubernetes group to fix a bug in egress policy processing.

However, even though this tool has made it easy to validate roughly 30 common scenarios,
it doesn't validate all Network Policy scenarios - because there are an enormous number of possible
permutations that one might create (technically, we might say this number is
infinite, given that there's an infinite number of possible namespace/pod/port/protocol variations one can create).

Once these tests were in play, we worked with the Upstream SIG Network and SIG Testing communities
(thanks to Antonio Ojea and Ben Elder) to put a testgrid Network Policy job in place. This job
continuously runs the entire suite of Network Policy tests.

Part of our role as a subproject is to help make sure that, when these tests break, we can help triage them effectively.

Around the time that we were finishing the validation work, it became clear from the community that,
in general, we needed to solve the overall problem of testing ALL possible Network Policy implementations.
For example, a KEP was recently written which introduced the concept of micro versioning to
Network Policies to accommodate such differences across implementations.
In response to this increasingly obvious need to comprehensively evaluate Network
Policy implementations from all vendors, Matt Fenwick decided to evolve our approach to Network Policy validation again by creating Cyclonus. Cyclonus is a comprehensive Network Policy fuzzing tool which verifies a CNI provider
against hundreds of different Network Policy scenarios, by defining similar truth table/policy
combinations as demonstrated in the end to end tests, while also providing a hierarchical
representation of policy "categories". We've found some interesting nuances and issues
in almost every CNI we've tested so far, and have even contributed some fixes back.

To perform a Cyclonus validation run, you create a Job manifest like the one reproduced later in this post. Cyclonus then outputs a report of all the test cases it will run. Note that Cyclonus tags its tests based on the type of policy being created, because the policies themselves are auto-generated, and thus have no meaningful names to be recognized by.

For each test, Cyclonus outputs a truth table, which is again similar to that of the e2e tests, along with the policy being validated. Both Cyclonus and the e2e tests use the same strategy to validate a Network Policy - probing pods over TCP or UDP, with SCTP support available as well for CNIs that support it (such as Calico).

As examples of how we use Cyclonus to help make CNI implementations better from a Network Policy perspective, you can see the Cyclonus result issues that have been filed for individual CNI providers. The good news is that Antrea and Calico have already merged fixes for all the issues found and other CNI providers are working on it,
with the support of SIG Network and the Network Policy subproject.

Are you interested in verifying NetworkPolicy functionality on your cluster? (If you care about security or offer multi-tenant SaaS, you should be.) If so, you can run the upstream end to end tests, or Cyclonus, or both.

In addition to cleaning up the validation story for CNI plugins that implement NetworkPolicies,
subproject contributors have also spent some time improving the Kubernetes NetworkPolicy API for a few commonly requested features.
After months of deliberation, we eventually settled on a few core areas for improvement:

- Port Range policies: We now allow you to specify a range of ports for a policy. This allows users interested in scenarios like FTP or virtualization to enable advanced policies. The port range option for network policies will be available to use in Kubernetes 1.21. Read more in targeting a range of ports. (A sketch of such a policy follows this list.)

- Namespace as name policies: Allowing users in Kubernetes >= 1.21 to target namespaces using names when building Network Policy objects. This was done in collaboration with Jordan Liggitt and Tim Hockin on the API Machinery side.
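Returning to the first item, here is a sketch of what a port range policy looks like using the endPort field; the name, namespace, selector, and numbers are illustrative, and in Kubernetes 1.21 the field is gated behind the NetworkPolicyEndPort feature gate:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-port-range           # illustrative name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: ftp-server              # illustrative selector
  policyTypes:
  - Ingress
  ingress:
  - ports:
    - protocol: TCP
      port: 30000                  # first port in the range
      endPort: 32768               # last port in the range (requires endPort support in the CNI)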
The namespace-as-name change allowed us to improve the Network Policy user experience without actually
changing the API! For more details, you can read
Automatic labelling in the page about Namespaces.
The TL;DR is that for Kubernetes 1.21 and later, all namespaces have the following label added by default:

kubernetes.io/metadata.name: <name-of-namespace>

This means you can write a namespace policy against this namespace, even if you can't edit its labels. For example, a policy that selects namespaces by this label (like the test-network-policy example later in this post) will 'just work', without needing to run a command such as kubectl edit namespace. In fact, it will even work if you can't edit or view this namespace's data at all, because of the magic of API server defaulting.

Results

In our tests, we found that:

If you are a CNI provider and interested in helping us to do a better job curating large tests of network policies, please reach out! We are continuing to curate the Network Policy conformance results from Cyclonus.

The Future

We're also working on some improvements for the future of Network Policies, including:

The Network Policy subproject meets on Mondays at 4PM EST. For details, check out the SIG Network community pages.
A quick note on User Feedback

We've gotten a lot of ideas and feedback from users on Network Policies. A lot of people have interesting ideas about Network Policies, but we've found that as a subproject, very few people were deeply interested in implementing these ideas to the full extent.

Almost every change to the NetworkPolicy API includes weeks or months of discussion to cover different cases, and ensure no CVEs are being introduced. Thus, long term ownership is the biggest impediment in improving the NetworkPolicy user experience for us, over time.

We encourage anyone to provide us with feedback, but our most pressing issues right now involve finding long term owners to help us drive changes. This doesn't require a lot of technical knowledge, but rather, just a long term commitment to helping us stay organized, do paperwork, and iterate through the many stages of the K8s feature process. If you want to help us and get involved, please reach out on the SIG Network mailing list, or in the SIG Network room in the k8s.io slack channel! Anyone can put an oar in the water and help make NetworkPolicies better!

metadata:
  creationTimestamp: null
  name: allow-ingress-port-80
spec:
  ingress:
  - ports:
    - port: serve-80-tcp
  podSelector: {}
A . means that the connection was allowed; an X means the connection was blocked. For example:

Nov 4 16:58:43.449: INFO: expected:
- x/a x/b x/c y/a y/b y/c z/a z/b z/c
x/a . . . . . . . . .
x/b . . . . . . . . .
x/c . . . . . . . . .
y/a . . . . . . . . .
y/b . . . . . . . . .
y/c . . . . . . . . .
z/a . . . . . . . . .
z/b . . . . . . . . .
z/c . . . . . . . . .
Nov 4 16:58:43.449: INFO: observed:
- x/a x/b x/c y/a y/b y/c z/a z/b z/c
x/a X X X X X X X X X
x/b X X X X X X X X X
x/c X X X X X X X X X
y/a . . . . . . . . .
y/b . . . . . . . . .
y/c . . . . . . . . .
z/a . . . . . . . . .
z/b . . . . . . . . .
z/c . . . . . . . . .
Cyclonus: The next step towards Network Policy conformance
apiVersion: batch/v1
kind: Job
metadata:
  name: cyclonus
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - command:
            - ./cyclonus
            - generate
            - --perturbation-wait-seconds=15
            - --server-protocol=tcp,udp
          name: cyclonus
          imagePullPolicy: IfNotPresent
          image: mfenwick100/cyclonus:latest
      serviceAccount: cyclonus
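Note that the Job above runs as a cyclonus service account, which it does not create for you. Cyclonus creates and deletes namespaces, pods, and network policies as it runs, so that account needs broad permissions; the sketch below simply binds it to cluster-admin, which is a blunt but convenient assumption for a disposable test cluster (the namespace is also an assumption):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cyclonus
  namespace: default               # assumed to match the namespace the Job runs in
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cyclonus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin              # deliberately broad; scope this down outside of throwaway test clusters
subjects:
- kind: ServiceAccount
  name: cyclonus
  namespace: default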
test cases to run by tag:
- target: 6
- peer-ipblock: 4
- udp: 16
- delete-pod: 1
- conflict: 16
- multi-port/protocol: 14
- ingress: 51
- all-pods: 14
- egress: 51
- all-namespaces: 10
- sctp: 10
- port: 56
- miscellaneous: 22
- direction: 100
- multi-peer: 0
- any-port-protocol: 2
- set-namespace-labels: 1
- upstream-e2e: 0
- allow-all: 6
- namespaces-by-label: 6
- deny-all: 10
- pathological: 6
- action: 6
- rule: 30
- policy-namespace: 4
- example: 0
- tcp: 16
- target-namespace: 3
- named-port: 24
- update-policy: 1
- any-peer: 2
- target-pod-selector: 3
- IP-block-with-except: 2
- pods-by-label: 6
- numbered-port: 28
- protocol: 42
- peer-pods: 20
- create-policy: 2
- policy-stack: 0
- any-port: 14
- delete-namespace: 1
- delete-policy: 1
- create-pod: 1
- IP-block-no-except: 2
- create-namespace: 1
- set-pod-labels: 1
testing 112 cases
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  creationTimestamp: null
  name: base
  namespace: x
spec:
  egress:
  - ports:
    - port: 81
    to:
    - namespaceSelector:
        matchExpressions:
        - key: ns
          operator: In
          values:
          - "y"
          - z
      podSelector:
        matchExpressions:
        - key: pod
          operator: In
          values:
          - a
          - b
  - ports:
    - port: 53
      protocol: UDP
  ingress:
  - from:
    - namespaceSelector:
        matchExpressions:
        - key: ns
          operator: In
          values:
          - x
          - "y"
      podSelector:
        matchExpressions:
        - key: pod
          operator: In
          values:
          - b
          - c
    ports:
    - port: 80
      protocol: TCP
  podSelector:
    matchLabels:
      pod: a
  policyTypes:
  - Ingress
  - Egress
0 wrong, 0 ignored, 81 correct
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| TCP/80 | X/A | X/B | X/C | Y/A | Y/B | Y/C | Z/A | Z/B | Z/C |
| TCP/81 | | | | | | | | | |
| UDP/80 | | | | | | | | | |
| UDP/81 | | | | | | | | | |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| x/a | X | X | X | X | X | X | X | X | X |
| | X | X | X | . | . | X | . | . | X |
| | X | X | X | X | X | X | X | X | X |
| | X | X | X | X | X | X | X | X | X |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| x/b | . | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| x/c | . | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| y/a | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| y/b | . | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| y/c | . | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| z/a | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| z/b | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| z/c | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
| | X | . | . | . | . | . | . | . | . |
+--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
Where to start with NetworkPolicy testing?
To run only the Network Policy tests from the upstream e2e suite, use the --ginkgo.focus=NetworkPolicy flag.

Make sure that you use the K8s conformance image for K8s 1.21 or above (for example, by using the --kube-conformance-image-version v1.21.0 flag), as older images will not have the new Network Policy tests in them.

Improvements to the NetworkPolicy API and user experience
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  # Allow inbound traffic to Pods labelled role=db, in the namespace 'default'
  # provided that the source is a Pod in the namespace 'my-namespace'
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: my-namespace