Health checking gRPC servers on Kubernetes

Author:

Update (December 2021): Kubernetes now has built-in gRPC health probes starting in v1.23. To learn more, see Configure Liveness, Readiness and Startup Probes. This article was originally written about an external tool to achieve the same task.

If you're unfamiliar, Kubernetes health checks (liveness and readiness probes) is what's keeping your applications available while you're sleeping. They detect unresponsive pods, mark them unhealthy, and cause these pods to be restarted or rescheduled.

Kubernetes gRPC health checks natively. This leaves the gRPC developers with the following three approaches when they deploy to Kubernetes:

options for health checking grpc on kubernetes today

  1. httpGet probe: Cannot be natively used with gRPC. You need to refactor your app to serve both gRPC and HTTP/1.1 protocols (on different port numbers).
  2. tcpSocket probe: Opening a socket to gRPC server is not meaningful, since it cannot read the response body.
  3. exec probe: This invokes a program in a container's ecosystem periodically. In the case of gRPC, this means you implement a health RPC yourself, then write and ship a client tool with your container.

Can we do better? Absolutely.

Introducing “grpc-health-probe”

To standardize the "exec probe" approach mentioned above, we need:

  • a standard health check "protocol" that can be implemented in any gRPC server easily.
  • a standard health check "tool" that can query the health protocol easily.

Thankfully, gRPC has a . It can be used easily from any language. Generated code and the utilities for setting the health status are shipped in nearly all language implementations of gRPC.

If you this health check protocol in your gRPC apps, you can then use a standard/common tool to invoke this Check() method to determine server status.

The next thing you need is the "standard tool", and it's the grpc-health-probe.

With this tool, you can use the same health check configuration in all your gRPC applications. This approach requires you to:

  1. Find the gRPC "health" module in your favorite language and start using it (example ).
  2. Ship the grpc_health_probe binary in your container.
  3. Kubernetes "exec" probe to invoke the "grpc_health_probe" tool in the container.

In this case, executing "grpc_health_probe" will call your gRPC server over localhost, since they are in the same pod.

What's next

grpc-health-probe project is still in its early days and it needs your feedback. It supports a variety of features like communicating with TLS servers and configurable connection/RPC timeouts.

If you are running a gRPC server on Kubernetes today, try using the gRPC Health Protocol and try the grpc-health-probe in your deployments, and give feedback.

Further reading