Logging Architecture
Application logs can help you understand what is happening inside your application. The logs are particularly useful for debugging problems and monitoring cluster activity. Most modern applications have some kind of logging mechanism. Likewise, container engines are designed to support logging. The easiest and most adopted logging method for containerized applications is writing to standard output and standard error streams.
However, the native functionality provided by a container engine or runtime is usually not enough for a complete logging solution.
For example, you may want to access your application's logs if a container crashes, a pod gets evicted, or a node dies.
In a cluster, logs should have a separate storage and lifecycle independent of nodes, pods, or containers. This concept is called cluster-level logging.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many logging solutions that integrate with Kubernetes. The following sections describe how to handle and store logs on nodes.
Basic logging in Kubernetes
This example uses a Pod
specification with a container
to write text to the standard output stream once per second.
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args: [/bin/sh, -c,
            'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
To run this pod, use the following command:
kubectl apply -f https://k8s.io/examples/debug/counter-pod.yaml
The output is:
pod/counter created
To fetch the logs, use the kubectl logs
command, as follows:
kubectl logs counter
The output is:
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
...
You can use kubectl logs --previous
to retrieve logs from a previous instantiation of a container.
If your pod has multiple containers, specify which container's logs you want to access by
appending a container name to the command, with a -c
flag, like so:
kubectl logs counter -c count
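If the count container in this pod had crashed and restarted, you could combine the two flags to fetch the output of its previous run (this sketch assumes at least one restart has actually happened):

kubectl logs counter -c count --previous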
See the kubectl logs
documentation for more details.
Logging at the node level
A container engine handles and redirects any output generated to a containerized application's stdout and stderr streams. For example, the Docker container engine redirects those two streams to a logging driver, which is configured in Kubernetes to write to a file in JSON format.

By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.

An important consideration in node-level logging is implementing log rotation, so that logs don't consume all available storage on the node. Kubernetes is not responsible for rotating logs; rather, a deployment tool should set up a solution to address that. For example, in Kubernetes clusters deployed by the kube-up.sh script, there is a logrotate tool configured to run each hour. You can also set up a container runtime to rotate an application's logs automatically.

As an example, you can find detailed information about how kube-up.sh sets up logging for the COS image on GCP in the corresponding configure-helper script.

When using a CRI container runtime, the kubelet is responsible for rotating the logs and managing the logging directory structure. The kubelet sends this information to the CRI container runtime and the runtime writes the container logs to the given location. The two kubelet parameters containerLogMaxSize and containerLogMaxFiles in the kubelet config file can be used to configure the maximum size for each log file and the maximum number of files allowed for each container, respectively.
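For illustration, here is a minimal sketch of how those two settings might appear in a kubelet configuration file. The values shown are only examples (they happen to match the kubelet defaults), and the rest of the kubelet configuration is omitted:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate a container's log file once it reaches 10Mi ...
containerLogMaxSize: 10Mi
# ... and keep at most 5 log files per container.
containerLogMaxFiles: 5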
When you run kubectl logs as in the basic logging example, the kubelet on the node handles the request and reads directly from the log file. The kubelet returns the content of the log file.

Note: If an external system has performed the rotation, only the contents of the latest log file will be available through kubectl logs. For example, if there's a 10MB file, logrotate performs the rotation and there are two files: one file that is 10MB in size and a second file that is empty. kubectl logs returns the latest log file, which in this example is an empty response.

System component logs

There are two types of system components: those that run in a container and those that do not run in a container. For example:

- The Kubernetes scheduler and kube-proxy run in a container.
- The kubelet and container runtime do not run in containers.

On machines with systemd, the kubelet and container runtime write to journald. If systemd is not present, the kubelet and container runtime write to .log files in the /var/log directory. System components inside containers always write to the /var/log directory, bypassing the default logging mechanism. They use the klog logging library. You can find the conventions for logging severity for those components in the development docs on logging.
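For example, on a node where the kubelet runs as a systemd service (the unit is commonly, but not necessarily, named kubelet), you can usually read its logs with journalctl; the unit name here is an assumption about how your distribution sets things up:

journalctl -u kubelet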
Similar to the container logs, system component logs in the /var/log directory should be rotated. In Kubernetes clusters brought up by the kube-up.sh script, those logs are configured to be rotated by the logrotate tool daily or once the size exceeds 100MB.

Cluster-level logging architectures

While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches you can consider. Here are some options:

- Use a node-level logging agent that runs on every node.
- Include a dedicated sidecar container for logging in an application pod.
- Push logs directly to a backend from within an application.

Using a node logging agent

You can implement cluster-level logging by including a node-level logging agent on each node. The logging agent is a dedicated tool that exposes logs or pushes logs to a backend. Commonly, the logging agent is a container that has access to a directory with log files from all of the application containers on that node.

Because the logging agent must run on every node, it is recommended to run the agent as a DaemonSet.

Node-level logging creates only one agent per node and doesn't require any changes to the applications running on the node.

Containers write to stdout and stderr, but with no agreed format. A node-level agent collects these logs and forwards them for aggregation.
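The following is a minimal sketch of such a DaemonSet. The name, labels, and the fluent-bit image are illustrative choices rather than part of the example above, and a real deployment would also need the agent's own configuration (inputs, parsers, and an output backend):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-logging-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: node-logging-agent
  template:
    metadata:
      labels:
        name: node-logging-agent
    spec:
      containers:
      - name: logging-agent
        # Placeholder agent image; substitute the logging agent you actually use.
        image: fluent/fluent-bit:1.9
        volumeMounts:
        # Give the agent read access to the node's log directory, which contains
        # the container log files written by the kubelet and container runtime.
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log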
Using a sidecar container with the logging agent

You can use a sidecar container in one of the following ways:

- The sidecar container streams application logs to its own stdout.
- The sidecar container runs a logging agent, which is configured to pick up logs from an application container.

Streaming sidecar container

By having your sidecar containers write to their own stdout and stderr streams, you can take advantage of the kubelet and the logging agent that already run on each node. The sidecar containers read logs from a file, a socket, or journald. Each sidecar container prints a log to its own stdout or stderr stream.

This approach allows you to separate several log streams from different parts of your application, some of which can lack support for writing to stdout or stderr. The logic behind redirecting logs is minimal, so it's not a significant overhead. Additionally, because stdout and stderr are handled by the kubelet, you can use built-in tools like kubectl logs.

For example, a pod runs a single container, and the container writes to two different log files using two different formats. Here's a configuration file for the Pod:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}

It is not recommended to write log entries with different formats to the same log stream, even if you managed to redirect both components to the stdout stream of the container. Instead, you can create two sidecar containers. Each sidecar container could tail a particular log file from a shared volume and then redirect the logs to its own stdout stream.

Here's a configuration file for a pod that has two sidecar containers:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-1
    image: busybox:1.28
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-2
    image: busybox:1.28
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/2.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}

Now when you run this pod, you can access each log stream separately by running the following commands:

kubectl logs counter count-log-1

The output is:

0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
...

kubectl logs counter count-log-2

The output is:

Mon Jan 1 00:00:00 UTC 2001 INFO 0
Mon Jan 1 00:00:01 UTC 2001 INFO 1
Mon Jan 1 00:00:02 UTC 2001 INFO 2
...

The node-level agent installed in your cluster picks up those log streams automatically without any further configuration. If you like, you can configure the agent to parse log lines depending on the source container.

Note that despite low CPU and memory usage (on the order of a couple of millicores for CPU and several megabytes for memory), writing logs to a file and then streaming them to stdout can double disk usage. If you have an application that writes to a single file, it's recommended to set /dev/stdout as the destination rather than implement the streaming sidecar container approach.
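As a rough sketch of that alternative, here is a minimal Pod whose only container appends its single log stream to /dev/stdout instead of a file; the Pod and container names are made up for this illustration:

apiVersion: v1
kind: Pod
metadata:
  name: single-file-logger
spec:
  containers:
  - name: app
    image: busybox:1.28
    # Writing to /dev/stdout lets the kubelet capture the log stream directly,
    # so no streaming sidecar container is needed.
    args: [/bin/sh, -c, 'i=0; while true; do echo "$(date) INFO $i" >> /dev/stdout; i=$((i+1)); sleep 1; done']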
Sidecar containers can also be used to rotate log files that cannot be rotated by the application itself. An example of this approach is a small container running logrotate periodically. However, it's recommended to use stdout and stderr directly and leave rotation and retention policies to the kubelet.

Sidecar container with a logging agent

If the node-level logging agent is not flexible enough for your situation, you can create a sidecar container with a separate logging agent that you have configured specifically to run with your application.

Note: Using a logging agent in a sidecar container can lead to significant resource consumption. Moreover, you won't be able to access those logs using kubectl logs because they are not controlled by the kubelet.

Here are two configuration files that you can use to implement a sidecar container with a logging agent. The first file contains a ConfigMap to configure fluentd.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluentd.conf: |
    <source>
      type tail
      format none
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.format1
    </source>

    <source>
      type tail
      format none
      path /var/log/2.log
      pos_file /var/log/2.log.pos
      tag count.format2
    </source>

    <match **>
      type google_cloud
    </match>

The second file describes a pod that has a sidecar container running fluentd. The pod mounts a volume where fluentd can pick up its configuration data.

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-agent
    image: k8s.gcr.io/fluentd-gcp:1.30
    env:
    - name: FLUENTD_ARGS
      value: -c /etc/fluentd-config/fluentd.conf
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: config-volume
      mountPath: /etc/fluentd-config
  volumes:
  - name: varlog
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config

In the sample configurations, you can replace fluentd with any logging agent, reading from any source inside an application container.

Exposing logs directly from the application

Cluster-logging that exposes or pushes logs directly from every application is outside the scope of Kubernetes.