Segregate System Logs From Datapath Logs (#1497)
* Add reference architecture for log segregation with Loki

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Add the 'LogKind' field to datapath logs

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Update docs

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Add workflow diagram

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Address Vivek's feedback

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* More feedback items

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Add datapath log field to kube.Exec()'s log output function

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Remove excalidraw image

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Address Pavan's feedback

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Fix broken href

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
ihcsim committed Jul 21, 2022
1 parent 4219d50 commit f815219
Showing 12 changed files with 282 additions and 8 deletions.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -23,6 +23,7 @@ and easy to install, operate and scale.
    install
    tutorial
    architecture
+   tasks
    tooling
    functions
    templates
8 changes: 8 additions & 0 deletions docs/spelling_wordlist.txt
@@ -1,4 +1,5 @@
 actionset
+actionsets
 args
 aws
 backend
@@ -7,9 +8,13 @@ backupID
 backupInfo
 backupTag
 boolean
+datapath
+datastore
 defaultProfile
+DaemonSet
 Dockerfile
 Elasticsearch
+grafana
 gcs
 gibibytes
 GiB
@@ -22,16 +27,19 @@ kanister
 kubectl
 Kubernetes
 lifecycle
+loki
 metadata
 namespace
 objectstore
+observability
 outputArtifact
 param
 params
 PersistentVolumeClaim
 pluggable
 pre
 prepopulated
+promtail
 Quickstart
 repo
 rollout
8 changes: 8 additions & 0 deletions docs/tasks.rst
@@ -0,0 +1,8 @@
.. _tasks:

Tasks
*****
.. toctree::
   :maxdepth: 1

   tasks/logs.rst
Binary file added docs/tasks/img/logs-grafana-data-source.png
Binary file added docs/tasks/img/logs-grafana-login.png
Binary file added docs/tasks/img/logs-grafana-loki-test.png
Binary file added docs/tasks/img/logs-kanister-all-logs.png
Binary file added docs/tasks/img/logs-kanister-datapath-logs.png
246 changes: 246 additions & 0 deletions docs/tasks/logs.rst
@@ -0,0 +1,246 @@
Segregate Controller And Datapath Logs
--------------------------------------

Kanister uses structured logging to ensure that its logs can be easily
categorized, indexed and searched by downstream log aggregation software.

By default, Kanister logs are output to the controller's ``stderr`` in JSON
format. Generally, these logs can be categorized into *system logs* and
*datapath logs*.
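To view these logs directly, stream the controller's output with ``kubectl``.
The deployment name below assumes a default Helm installation of Kanister in
the ``kanister`` namespace with release name ``kanister``; adjust it to match
your setup:

.. code-block:: bash

   kubectl -n kanister logs -f deploy/kanister-kanister-operator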

System logs are emitted by the Kanister controller to track important events
like interactions with the Kubernetes API, and CRUD operations on blueprints
and actionsets.

Datapath logs, on the other hand, are logs emitted by task pods created by
Kanister. These logs are streamed to the Kanister controller before the task
pods are terminated to ensure they are not lost inadvertently. Datapath log
lines usually include the ``LogKind`` field, with its value set to
``datapath``.
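For illustration, a datapath log line might look like the following. The field
names follow the keys used by Kanister's log packages; the values here are made
up:

.. code-block:: json

   {
     "ActionSet": "stream-apache-logs-task-abc12",
     "Container": "container",
     "LogKind": "datapath",
     "Out": "127.0.0.1 - - [21/Jul/2022:00:00:00 +0000] \"GET / HTTP/1.1\" 200 512",
     "Phase": "taskApacheLogs",
     "Pod": "kanister-job-xyz89"
   }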

The rest of this documentation provides instructions on how to segregate
Kanister's system logs from datapath logs using Loki_ and Grafana_.

To run the provided commands, you will need access to a Kubernetes cluster,
with the ``kubectl`` and ``helm`` command-line tools installed.

Follow the instructions in the installation_ page to deploy Kanister on the
cluster.

Deployments Setup
=================

The commands and screenshots in this documentation are tested with the following
software versions:

* Loki 2.5.0
* Grafana 8.5.3
* Promtail 2.5.0

Let's begin by installing Loki. Loki is a datastore optimized for holding log
data. It indexes log data as streams, where each stream is associated with a
unique set of labels.
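For example, with a typical Kubernetes scrape configuration, all log lines
emitted by a single Kanister controller container might form one stream,
identified by a label set like the following (the labels shown here are
illustrative; the exact set depends on the Promtail configuration):

.. code-block:: bash

   {namespace="kanister", pod="kanister-operator-12345abcde-67xyz", container="kanister"}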

.. code-block:: bash

   helm repo add grafana https://grafana.github.io/helm-charts
   helm repo update
   helm -n loki install --create-namespace loki grafana/loki \
     --set image.tag=2.5.0

Confirm that the Loki StatefulSet is successfully rolled out:

.. code-block:: bash

   kubectl -n loki rollout status sts/loki

.. note::
   The Loki configuration used in this installation is meant for demonstration
   purposes only. The Helm chart deploys a non-HA single instance of Loki,
   managed by a StatefulSet workload. See the `Loki installation documentation`_
   for other installation methods that may be more suitable for your
   requirements.

Use Helm to install Grafana with a pre-configured Loki data source:

.. code-block:: bash

   svc_url=$(kubectl -n loki get svc loki -ojsonpath='{.metadata.name}.{.metadata.namespace}:{.spec.ports[?(@.name=="http-metrics")].port}')
   cat <<EOF | helm -n grafana install --create-namespace grafana grafana/grafana -f -
   datasources:
     datasources.yaml:
       apiVersion: 1
       datasources:
       - name: Loki
         type: loki
         url: http://$svc_url
         access: proxy
         isDefault: true
   EOF
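As a quick sanity check, print the computed service URL. With the chart
defaults used above, it should resolve to the Loki service's HTTP port
(``3100``, unless overridden):

.. code-block:: bash

   echo $svc_url
   # loki.loki:3100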
Confirm that the Grafana Deployment is successfully rolled out:

.. code-block:: bash

   kubectl -n grafana rollout status deploy/grafana

Set up port-forwarding to access the Grafana UI:

.. code-block:: bash

   kubectl -n grafana port-forward svc/grafana 3000:80
Use a web browser to navigate to ``localhost:3000``:

.. image:: img/logs-grafana-login.png

The default login username is ``admin``. The login password can be retrieved
using the following command:

.. code-block:: bash

   kubectl -n grafana get secret grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Navigate to the data sources configuration under ``Configuration`` >
``Data Sources`` using the left-hand panel.

Confirm that the ``Loki`` data source has already been added as part of the
Grafana installation:

.. image:: img/logs-grafana-data-source.png

Access the ``Loki`` data source configuration page. Use the ``Test`` button
near the bottom of the page to test the connectivity between Grafana and Loki:

.. image:: img/logs-grafana-loki-test.png
The final step in the setup involves installing Promtail. Promtail is an agent
that discovers log targets and streams their logs to Loki:

.. code-block:: bash

   svc_url=$(kubectl -n loki get svc loki -ojsonpath='{.metadata.name}.{.metadata.namespace}:{.spec.ports[?(@.name=="http-metrics")].port}')
   helm -n loki upgrade --install --create-namespace promtail grafana/promtail \
     --set image.tag=2.5.0 \
     --set "config.clients[0].url=http://${svc_url}/loki/api/v1/push"

Confirm that the Promtail DaemonSet is successfully rolled out:

.. code-block:: bash

   kubectl -n loki rollout status ds/promtail

Logs Segregation
================

To simulate a steady stream of log lines, the next step defines a blueprint
that uses flog_ to generate Apache access logs in the combined format:
.. code-block:: bash

   cat <<EOF | kubectl apply -f -
   apiVersion: cr.kanister.io/v1alpha1
   kind: Blueprint
   metadata:
     name: stream-apache-logs
     namespace: kanister
   actions:
     flogTask:
       phases:
       - func: KubeTask
         name: taskApacheLogs
         args:
           namespace: "{{ .Namespace.Name }}"
           image: mingrammer/flog:0.4.3
           command:
           - flog
           - -f
           - apache_combined
           - -n
           - "120"
           - -s
           - 0.5s
   EOF
Create the following actionset to invoke the ``flogTask`` action in the
blueprint:

.. code-block:: bash

   cat <<EOF | kubectl create -f -
   apiVersion: cr.kanister.io/v1alpha1
   kind: ActionSet
   metadata:
     generateName: stream-apache-logs-task-
     namespace: kanister
   spec:
     actions:
     - name: flogTask
       blueprint: stream-apache-logs
       object:
         kind: Namespace
         name: default
   EOF
Head over to the *Explore* pane in the Grafana UI. Ensure that the ``Loki``
data source is selected.

Enter the following LogQL_ query in the *Log Browser* input box to retrieve
all Kanister logs:

.. code-block:: bash

   {namespace="kanister"}

The log output should look similar to this:

.. image:: img/logs-kanister-all-logs.png

Use the next query to select only the datapath logs, replacing ``${actionset}``
with the name of the recently created actionset:

.. code-block:: bash

   {namespace="kanister"} | json | LogKind="datapath",ActionSet="${actionset}"

The *Logs* pane should only display the Apache log lines generated by flog:

.. image:: img/logs-kanister-datapath-logs.png
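Conversely, to view only the system logs, filter the datapath lines out. This
assumes that system log lines, which do not carry the ``LogKind`` field,
produce an empty ``LogKind`` label after the ``json`` parser runs:

.. code-block:: bash

   {namespace="kanister"} | json | LogKind!="datapath"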
LogQL is an expressive query language inspired by PromQL, and there is much
more one can do with it. Be sure to check out its
`documentation <https://grafana.com/docs/loki/latest/logql/log_queries/>`_ for
other use cases that involve more advanced line and label filtering, formatting
and parsing, such as the example below.
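For instance, the following variation of the datapath query uses LogQL's
``line_format`` stage to drop the JSON wrapper and print only the raw task
output captured in the ``Out`` field:

.. code-block:: bash

   {namespace="kanister"} | json | LogKind="datapath" | line_format "{{.Out}}"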
Wrap Up
=======

As seen in this documentation, Kanister's consistent structured log lines make
it easy to integrate Kanister with more advanced log aggregation solutions,
ensuring better observability of data protection workflows.
To remove Loki, Grafana and Promtail, use the following ``helm`` commands:

.. code-block:: bash

   helm -n grafana uninstall grafana
   helm -n loki uninstall promtail
   helm -n loki uninstall loki
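To also clean up the demonstration resources created earlier, delete the
blueprint and the generated actionsets (list the actionsets first, since their
names are generated):

.. code-block:: bash

   kubectl -n kanister get actionsets
   kubectl -n kanister delete actionsets --all
   kubectl -n kanister delete blueprint stream-apache-logs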
.. _Loki: https://grafana.com/oss/loki/
.. _Grafana: https://grafana.com/oss/grafana
.. _flog: https://github.com/mingrammer/flog
.. _Loki installation documentation: https://grafana.com/docs/loki/latest/installation/
.. _LogQL: https://grafana.com/docs/loki/latest/logql/
.. _installation: ../install.html
11 changes: 7 additions & 4 deletions pkg/consts/consts.go
@@ -1,10 +1,13 @@
 package consts

 const (
-    ActionsetNameKey = "ActionSet"
-    PodNameKey       = "Pod"
-    ContainerNameKey = "Container"
-    PhaseNameKey     = "Phase"
+    ActionsetNameKey = "ActionSet"
+    PodNameKey       = "Pod"
+    ContainerNameKey = "Container"
+    PhaseNameKey     = "Phase"
+    LogKindKey       = "LogKind"
+    LogKindDatapath  = "datapath"
+
     GoogleCloudCredsFilePath = "/tmp/creds.txt"
     LabelKeyCreatedBy        = "createdBy"
     LabelValueKanister       = "kanister"
15 changes: 11 additions & 4 deletions pkg/format/format.go
@@ -21,6 +21,7 @@ import (
     "regexp"
     "strings"

+    "github.com/kanisterio/kanister/pkg/consts"
     "github.com/kanisterio/kanister/pkg/field"
     "github.com/kanisterio/kanister/pkg/log"
     pkgout "github.com/kanisterio/kanister/pkg/output"
@@ -48,9 +49,10 @@ func LogTo(w io.Writer, pod string, container string, output string) {

     if strings.TrimSpace(line) != "" {
         fields := field.M{
-            "Pod":       pod,
-            "Container": container,
-            "Out":       line,
+            "Pod":             pod,
+            "Container":       container,
+            "Out":             line,
+            consts.LogKindKey: consts.LogKindDatapath,
         }
         log.PrintTo(w, "action update", fields)
     }
@@ -92,6 +94,11 @@ func LogWithCtx(ctx context.Context, podName string, containerName string, outpu

 func infoWithCtx(ctx context.Context, podName string, containerName string, l string) {
     if strings.TrimSpace(l) != "" {
-        log.WithContext(ctx).Print("Pod Update", field.M{"Pod": podName, "Container": containerName, "Out": l})
+        log.WithContext(ctx).Print("Pod Update", field.M{
+            "Pod":             podName,
+            "Container":       containerName,
+            "Out":             l,
+            consts.LogKindKey: consts.LogKindDatapath,
+        })
     }
 }
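Outside of the diff above, here is a minimal, hedged sketch (assumed usage, not
part of this commit) of how format.LogTo can be used: each non-empty line of
the given output is emitted to the writer as a structured JSON record tagged
with LogKind=datapath.

// Minimal sketch (assumed usage, not from this commit): tag arbitrary
// command output as datapath logs using the format package.
package main

import (
    "os"

    "github.com/kanisterio/kanister/pkg/format"
)

func main() {
    output := "line one\nline two\n"
    // Emits one structured JSON record per non-empty line, carrying the
    // Pod, Container, Out and LogKind=datapath fields.
    format.LogTo(os.Stderr, "demo-pod", "demo-container", output)
}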
1 change: 1 addition & 0 deletions pkg/function/kube_task.go
@@ -72,6 +72,7 @@ func kubeTaskPodFunc(cli kubernetes.Interface) func(ctx context.Context, pod *v1
         return nil, errors.Wrapf(err, "Failed while waiting for Pod %s to be ready", pod.Name)
     }
     ctx = field.Context(ctx, consts.PodNameKey, pod.Name)
+    ctx = field.Context(ctx, consts.LogKindKey, consts.LogKindDatapath)
     // Fetch logs from the pod
     r, err := kube.StreamPodLogs(ctx, cli, pod.Namespace, pod.Name)
     if err != nil {
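For completeness, a hedged sketch (assumed usage, not from this commit) of the
context-based tagging shown above: a field attached with field.Context is
included in every subsequent log line emitted through that context.

package main

import (
    "context"

    "github.com/kanisterio/kanister/pkg/consts"
    "github.com/kanisterio/kanister/pkg/field"
    "github.com/kanisterio/kanister/pkg/log"
)

func main() {
    // Attach the LogKind field once; every context-aware log call below
    // now carries LogKind=datapath in its JSON output.
    ctx := field.Context(context.Background(), consts.LogKindKey, consts.LogKindDatapath)
    log.WithContext(ctx).Print("Pod Update", field.M{"Out": "hello from the task pod"})
}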
