Che Logging #10290
Labels
kind/epic
lifecycle/stale
Summary
Logging provides system administrators with information useful for diagnostics and auditing. We propose a logging mechanism that does not require changes to existing Che code. However, we do recommend standardizing the format in which log events are written.
In addition, we propose an option to provide additional parameters in log entries in a standard way, to improve supportability.
Technically, the logging mechanism is decoupled from the code by reading standard output at the K8S Pod level. To support this, additional industry-accepted open source components must be deployed to the K8S cluster, with special focus on security aspects.
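To illustrate the standardized format mentioned above, here is a minimal sketch, assuming a one-JSON-object-per-line convention on stdout; the class and field names are illustrative only and not an existing Che API.

```java
// Minimal sketch: every component writes one JSON object per line to stdout,
// so a node-level agent can collect it without any changes to application code.
// Field names ("timestamp", "level", "logger", "message") are illustrative.
import java.time.Instant;

public class JsonStdoutLogger {

    // Escape backslashes, quotes and newlines so the event stays on one line.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void log(String level, String logger, String message) {
        String line = String.format(
                "{\"timestamp\":\"%s\",\"level\":\"%s\",\"logger\":\"%s\",\"message\":\"%s\"}",
                Instant.now(), level, escape(logger), escape(message));
        System.out.println(line);   // stdout is what K8S captures per container
    }

    public static void main(String[] args) {
        log("INFO", "org.eclipse.che.example", "workspace agent started");
    }
}
```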
Description
Che epics [Complementary]:
Tracing - #10298, #10288
Monitoring - #10329
Che epics [to be reevaluated]:
Logging - #5483
Logstash - #6537, #7566
Background
Access to the logs of Che agents and of the applications running within the workspace (aka WS) is required for supportability (analysis, application behavior, monitoring), including after the WS has been evicted.
Logs should have separate storage and a lifecycle independent of nodes and pods.
This concept is called cluster-level logging, and it has several common approaches:
Using a node-level logging agent is the most common and encouraged approach for a K8S cluster, because it requires only one agent per node and no installation in each pod (where the logged applications are running). It is based on the applications' standard output and standard error.
https://kubernetes.io/docs/concepts/cluster-administration/logging
Logging agents (not to be confused with Che agents)
The common K8S logging agent options described in the Kubernetes documentation above (Stackdriver Logging and Elasticsearch) both use fluentd as an agent on the node.
In the open source world, the two most popular data collectors are Logstash and Fluentd. Logstash is best known as part of the ELK Stack, while Fluentd has become increasingly popular in communities around software such as Docker, GCP, and Elasticsearch.
Both Logstash and Fluentd are data processing pipelines that ingest data from a multitude of sources simultaneously, transform it, and then send it on to a destination.
The main differences are covered in the links below; however, the similarities between Logstash and Fluentd are greater than their differences.
https://logz.io/blog/fluentd-logstash
https://www.elastic.co/guide/en/logstash/current/introduction.html
https://docs.fluentd.org/v0.12/articles/quickstart
http://larmog.github.io/2016/03/13/elk-cluster-on-kubernetes-on-arm---part-1
http://larmog.github.io/2016/05/02/efk-cluster-on-kubernetes-on-arm---part-2
Common node level agents available for Fluentd
Fluentd
Deployed as a DaemonSet, which spawns a pod on each node that reads the logs generated by the kubelet, the container runtime, and the containers, and sends them to Elasticsearch.
Fluentd is a log collector, processor, and aggregator.
https://logz.io/blog/kubernetes-log-analysis
Fluent-bit (plays a role comparable to the Logstash-forwarder)
A newer agent, fully based on the design of the Fluentd architecture, that uses fewer resources. It is a log collector and processor without the strong aggregation features of Fluentd.
https://gist.github.com/StevenACoffman/4e267f0f60c8e7fcb3f77b9e504f3bd7
https://akomljen.com/get-kubernetes-logs-with-efk-stack-in-5-minutes/
Common node level agents available for Logstash
Filebeat
A lightweight way to forward and centralize logs and files. It is more common outside K8S, but can be used inside K8S to ship logs to Elasticsearch.
https://www.elastic.co/guide/en/beats/filebeat/current/running-on-kubernetes.html
https://www.elastic.co/blog/shipping-kubernetes-logs-to-elasticsearch-with-filebeat
Container Log Collection
Cluster level logging collects the standard output and error of the applications running in the containers.
K8S logs the content of the stdout and stderr streams of a pod to a file. It creates one file for each container in a pod. The default location for these files is /var/log/containers. The filename contains the pod name, pod namespace, container name, and container id. The file contains one JSON object per line for the two streams stdout and stderr. K8S exposes the content of the log file to clients via its API.
The collection process, using Fluentd as an example, works as follows:
Fluentd parses the filename of the log file and uses this information to fetch additional metadata from the K8S API. Metadata such as labels and annotations is attached to the log event as additional fields so it can be used for search and filtering.
The Fluentd pod mounts the /var/lib/containers/ host volume to access the logs of all pods scheduled to that kubelet, as well as a host volume for a Fluentd position file. The position file records which log lines have already been shipped to the central log store.
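To make the filename-parsing step concrete, here is a hedged sketch of what a node-level collector does with a file from /var/log/containers; the sample file name, the regular expression, and the class are illustrative assumptions, not the actual Fluentd implementation (which is written in Ruby and also queries the K8S API for labels and annotations).

```java
// Illustrative sketch: derive pod metadata from a container log file name and
// note how the JSON-per-line content and position file fit in. Names are assumed.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContainerLogFileName {

    // /var/log/containers/<pod>_<namespace>_<container>-<container id>.log
    private static final Pattern NAME = Pattern.compile(
            "(?<pod>[^_]+)_(?<namespace>[^_]+)_(?<container>.+)-(?<id>[0-9a-f]{64})\\.log");

    public static void main(String[] args) {
        String fileName = "che-1234_eclipse-che_che-server-"
                + "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef.log";

        Matcher m = NAME.matcher(fileName);
        if (m.matches()) {
            // These fields are attached to every log event so it can be searched and filtered.
            System.out.println("pod=" + m.group("pod"));
            System.out.println("namespace=" + m.group("namespace"));
            System.out.println("container=" + m.group("container"));
            System.out.println("containerId=" + m.group("id"));
        }

        // Each line of the log file is a JSON object such as:
        // {"log":"workspace agent started\n","stream":"stdout","time":"2018-07-05T10:15:30.123Z"}
        // The collector remembers the last offset it shipped (the position file),
        // so restarts neither re-send nor lose lines.
    }
}
```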
Implementation recommendation
There are two kinds of custom environment params.
If the log format is CSV-like (delimiter-based) rather than an enriched JSON or XML format, then
these params need to be added to each log record (with an empty value when not relevant).
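Below is a minimal sketch of that recommendation, assuming a delimiter-based record layout; the delimiter, column order, and the workspace/user parameters are illustrative assumptions, not an existing Che format. Because every record carries the same columns in the same order, parsing by position stays stable even when a param is empty.

```java
// Hedged sketch of the recommendation above for a CSV-like (delimiter-based) log format:
// a custom parameter that is not relevant for a given event is written as an empty
// column, never omitted, so downstream positional parsing never breaks.
public class DelimitedLogRecord {

    private static final String DELIMITER = "|";

    public static String record(String timestamp, String level, String message,
                                 String workspaceId, String userId) {
        // Empty string, never a missing column, when a param is not relevant.
        return String.join(DELIMITER,
                timestamp,
                level,
                message,
                workspaceId == null ? "" : workspaceId,
                userId == null ? "" : userId);
    }

    public static void main(String[] args) {
        // Event with both custom params set:
        System.out.println(record("2018-07-05T10:15:30Z", "INFO",
                "workspace started", "workspace1a2b3c", "user42"));
        // Event where the user is not relevant; the column is still present:
        System.out.println(record("2018-07-05T10:15:31Z", "INFO",
                "cleanup job finished", "workspace1a2b3c", null));
    }
}
```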