Design questions #110

Closed
cocowalla opened this issue Dec 14, 2018 · 7 comments

Comments

@cocowalla

Hi, I have a few questions that would help me understand Loki better, and I didn't find the answers in the design document:

  1. Batching logs will allow better compression ratios and bigger blobs (which means lower per-operation costs), but must be balanced with the risk of data loss - what is the strategy here?

  2. Can Loki handle multi-line logs? Let's say a regex matches a line that is really part of a multi-line log - will the search then return all the related lines for that log?

  3. Labels are important for Loki, and I understand that the focus is on k8s first, with automatic labelling
    3.a. Exactly what labels are automatically assigned?
    3.b. Is there some mechanism to add your own labels, for example to group sources by operating system, or are you limited only to those labels assigned by Loki?
    3.c. If you are limited to labels assigned by Loki, how will you handle labelling as you expand out of k8s and accept logs from other sources (e.g. syslog)?

  4. Logs really have 2 timestamps associated with them: the time the log was generated, and the time the log was ingested by Loki - will Loki be able to parse the log generation time out of logs (where it's included, and it usually is, such as with syslog), or will it only use the log ingestion time for time range searches?

@tomwilkie
Contributor

Batching logs will allow better compression ratios and bigger blobs (which means lower per-operation costs), but must be balanced with the risk of data loss - what is the strategy here?

100% agree - Loki already batches entries for the same log stream into what we call a "chunk", and then flushes these chunks to S3/GCS etc. To protect against data loss, each entry is written to ~3 different replicas, and soon the replicas will maintain a write-ahead log.
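To make the trade-off concrete, here is a minimal sketch of per-stream batching. It is not Loki's actual code: the `ChunkBatcher` class, the `max_chunk_bytes` threshold and the `flush` callback are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class ChunkBatcher:
    """Buffers entries per stream and flushes a chunk once it is large enough."""

    def __init__(self, flush, max_chunk_bytes=1024):
        self.flush = flush                      # hypothetical callback: flush(stream_key, entries)
        self.max_chunk_bytes = max_chunk_bytes  # hypothetical size threshold
        self.buffers = defaultdict(list)
        self.sizes = defaultdict(int)

    def append(self, stream_key, timestamp, line):
        # Entries for the same stream accumulate in the same in-memory buffer.
        self.buffers[stream_key].append((timestamp, line))
        self.sizes[stream_key] += len(line.encode("utf-8"))
        if self.sizes[stream_key] >= self.max_chunk_bytes:
            self.flush_stream(stream_key)

    def flush_stream(self, stream_key):
        # Hand the buffered entries over as one chunk and reset the buffer.
        entries = self.buffers.pop(stream_key, [])
        self.sizes.pop(stream_key, None)
        if entries:
            self.flush(stream_key, entries)


flushed = []
batcher = ChunkBatcher(lambda key, entries: flushed.append((key, entries)), max_chunk_bytes=10)
batcher.append("app", 1, "hello")  # 5 bytes, stays buffered
batcher.append("app", 2, "world")  # 10 bytes in total, triggers a flush
```

A larger threshold means bigger blobs and better compression, but also more unflushed entries held in memory at any moment, which is why replication (and later a write-ahead log) matters.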

Can Loki handle multi-line logs? Let's say a regex matches a line that is really part of a multi-line log - will the search then return all the related lines for that log?

Not yet; this is something we're discussing in #74.

Exactly what labels are automatically assigned?

This is completely under the control of the end user, and is configured through our use of Prometheus service discovery and relabelling rules.

The only one that is automatically produced is the __filename__ label at the moment, although I'm sure we'll add more.

Most setups will also have a job label, which by default takes on the name from the scrape config. On Kubernetes (with our example configs), the job label consists of <namespace>/<pod name>, we add an independent namespace label, and we propagate any labels from the pods themselves, such as version, app etc.
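As an illustration only, the label set described above for a Kubernetes pod might end up looking like the result of this sketch. In practice the mapping is done by relabelling rules in the config, not by code like this; the function name `build_labels` and its parameters are hypothetical.

```python
def build_labels(namespace, pod_name, pod_metadata_labels, filename):
    """Assemble the label set for one tailed file of a pod (illustrative only)."""
    labels = dict(pod_metadata_labels)         # labels propagated from the pod, e.g. app, version
    labels["namespace"] = namespace            # independent namespace label
    labels["job"] = f"{namespace}/{pod_name}"  # job label as <namespace>/<pod name>
    labels["__filename__"] = filename          # the one label that is produced automatically
    return labels


labels = build_labels(
    "default",
    "web-1",
    {"app": "web", "version": "1.2"},
    "/var/log/pods/web-1.log",  # hypothetical path
)
```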

Is there some mechanism to add your own labels, for example to group sources by operating system, or are you limited only to those labels assigned by Loki?

Of course :-) You can configure promtail to extract arbitrary metadata from your chosen service discovery mechanism, plus you can configure promtail for "external labels" that are appended to every outgoing stream - useful for things like OS version, cluster name, hostname etc.
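Conceptually, external labels are just a fixed set of key/value pairs merged into every outgoing stream's labels. A small sketch follows; the function name is hypothetical, and the choice that a stream's own labels win on a name clash is an assumption of this sketch, not something stated in this thread.

```python
def with_external_labels(stream_labels, external_labels):
    """Return the stream's labels with the external labels appended."""
    merged = dict(external_labels)
    merged.update(stream_labels)  # assumption: the stream's own labels win on conflict
    return merged


external = {"os": "linux", "cluster": "eu-west"}  # hypothetical values
merged = with_external_labels({"job": "default/web-1"}, external)
```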

3.c. If you are limited to labels assigned by Loki, how will you handle labelling as you expand out of k8s and accept logs from other sources (e.g. syslog)?

Good question! And I don't know. Right now we're thinking of adding the ability to extract labels from the log entry using a regular expression, and things like journald already have a fair bit of metadata.
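The regular-expression idea could look roughly like the sketch below. This is a guess at the general shape, not a planned Loki or promtail API: the pattern, the syslog-like line format and the `extract_labels` helper are all made up for illustration.

```python
import re

# Hypothetical pattern for a syslog-like line: "<host> <app>[<pid>]: <message>"
PATTERN = re.compile(r"^(?P<host>\S+) (?P<app>[\w-]+)\[\d+\]: (?P<message>.*)$")


def extract_labels(line, label_names=("host", "app")):
    """Pull the named groups out of a log line and use them as labels."""
    match = PATTERN.match(line)
    if match is None:
        return {}  # no labels can be derived from a line that does not match
    return {name: match.group(name) for name in label_names}


labels = extract_labels("web01 sshd[4242]: Accepted publickey for root")
```

Only low-cardinality groups (host, app) are turned into labels here; the message itself stays in the log line.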

Logs really have 2 timestamps associated with them: the time the log was generated, and the time the log was ingested by Loki - will Loki be able to parse the log generation time out of logs (where it's included, and it usually is, such as with syslog), or will it only use the log ingestion time for time range searches?

Currently we have two options, but for both of them the timestamp comes from the promtail agent, and never from the Loki server. Option (1) is for docker-style JSON logs (as produced by k8s nodes), where the included timestamp is the time the log was written to the pipe by the container. Option (2) is to use the time at which promtail read the log, which, assuming there isn't a lot of catching up to do, will be pretty accurate.
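The two options can be sketched as a single function that prefers the timestamp embedded in a docker-style JSON line and falls back to the read time. This is an illustration, not promtail's code; the function names are hypothetical, and the parser assumes a UTC timestamp ending in `Z`.

```python
import json
from datetime import datetime, timezone


def parse_rfc3339_nano(stamp):
    """Parse e.g. 2018-12-14T10:15:30.123456789Z, truncating to microseconds."""
    seconds, _, fraction = stamp.rstrip("Z").partition(".")
    micros = int((fraction + "000000")[:6])
    base = datetime.strptime(seconds, "%Y-%m-%dT%H:%M:%S")
    return base.replace(microsecond=micros, tzinfo=timezone.utc)


def entry_timestamp(raw_line, read_time):
    """Return (timestamp, text), preferring the 'time' field of a docker JSON log line."""
    try:
        record = json.loads(raw_line)
        return parse_rfc3339_nano(record["time"]), record["log"]  # option (1)
    except (ValueError, KeyError, TypeError):
        return read_time, raw_line                                # option (2)


read_time = datetime(2018, 12, 14, 10, 15, 31, tzinfo=timezone.utc)
docker_line = '{"log":"hello\\n","stream":"stdout","time":"2018-12-14T10:15:30.123456789Z"}'
ts_json, text_json = entry_timestamp(docker_line, read_time)
ts_plain, text_plain = entry_timestamp("plain text line", read_time)
```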

We're already planning on adding the ability for promtail to extract timestamps from the log entries themselves.

@cocowalla
Author

Thanks for the quick and informative response!

One thing I'm still not clear on though, is the mechanics of the batching:

  1. By 'log stream', do you mean "all logs from a particular source"? If so, is the assumption that all logs from each source always have the same, constant set of labels associated with them? Perhaps another way of phrasing this: are labels associated with a log stream, or with individual logs?

  2. Does 1 chunk equate to one S3 blob, or just part of a larger blob?

  3. When you talk of entries being written to replicas, do you mean each log is written to 3 different S3 buckets, or do you mean to some local caching mechanism? (sorry if this is a silly question, but I'm not familiar with any details of S3!)

@tomwilkie
Contributor

By 'log stream', do you mean "all logs from a particular source"? If so, is the assumption that all logs from each source always have the same, constant set of labels associated with them? Perhaps another way of phrasing this: are labels associated with a log stream, or with individual logs?

A log stream is defined as all entries with the same labels - typically this would be all entries from a single source, but in the k8s/docker cases we split STDERR and STDOUT into two different streams for each container. When we tail local files, each file becomes a stream.
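In other words, the label set is the stream's identity. A minimal sketch of that idea: the `stream_key` and `push` helpers are hypothetical, and using a label called `stream` to distinguish STDOUT from STDERR is an assumption made for this example.

```python
def stream_key(labels):
    """Two entries belong to the same stream exactly when their label sets are equal."""
    return tuple(sorted(labels.items()))


streams = {}


def push(labels, line):
    # Group incoming lines by the identity of their label set.
    streams.setdefault(stream_key(labels), []).append(line)


push({"job": "default/web-1", "stream": "stdout"}, "started")
push({"stream": "stdout", "job": "default/web-1"}, "listening")  # same labels, different order
push({"job": "default/web-1", "stream": "stderr"}, "warning")    # different labels, new stream
```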

Does 1 chunk equate to one S3 blob, or just part of a larger blob?

Each chunk becomes a single blob; a stream is made of multiple chunks.

When you talk of entries being written to replicas, do you mean each log is written to 3 different S3 buckets, or do you mean to some local caching mechanism?

Each entry is replicated to 3 ingesters, and will appear in 3 chunks. These three chunks will be written to the same bucket.
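One way to picture the replication is sketched below. How Loki actually selects the ingesters is not described in this thread, so hashing the stream key and taking three consecutive ingesters from a list is purely an assumption of this sketch, as are the function and ingester names.

```python
import hashlib


def pick_ingesters(stream_key, ingesters, replication_factor=3):
    """Deterministically choose which ingesters receive a stream's entries."""
    if not ingesters:
        raise ValueError("at least one ingester is required")
    digest = hashlib.sha256(stream_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ingesters)
    count = min(replication_factor, len(ingesters))
    # Take consecutive ingesters, wrapping around the end of the list.
    return [ingesters[(start + i) % len(ingesters)] for i in range(count)]


ingesters = [f"ingester-{i}" for i in range(5)]  # hypothetical ingester names
replicas = pick_ingesters('{job="default/web-1"}', ingesters)
```

Each of the chosen ingesters builds its own chunk from the same entries, which is why the entry ends up in three chunks in the same bucket.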

@cocowalla
Author

That was really informative, thanks!

Each entry is replicated to 3 ingesters, and will appear in 3 chunks. These three chunks will be written to the same bucket.

Do the ingesters persist to local storage while buffering chunks, or are they stateless, buffering in memory?

I'm not familiar with S3, but I'm guessing one chunk is the 'master' and the other 2 are replicas/copies that are held in different availability zones (for fault tolerance within the same region)?

@daixiang0
Contributor

Can I understand it this way: Loki just creates something like an index for the log files, and if the log files are changed or updated, the index is updated automatically? Is that why Loki does not use too much disk space for storage, even with 3 replicas?

@cocowalla
Author

@icereed to avoid confusion, I think it would be better if you removed this comment and opened a separate issue with your feature request, leaving this one free and uncluttered for the original Q&A 😄

@daixiang0
Contributor

@cocowalla closing this, feel free to continue the discussion here.
