
Add WALSegmentSize as an option of tsdb creation #450

Merged

krasi-georgiev merged 2 commits into prometheus-junkyard:master from draft on Dec 18, 2018

Conversation

@glutamatt (Contributor) commented Nov 12, 2018

Problem to solve:

The Prometheus server is running on a Raspberry Pi. To preserve the SD card's lifetime, I mounted the WAL directory on a tmpfs (in-memory) partition.

The tmpfs partition fills up quickly.

Solution:

Provide the WAL segment file size as an option via the wal.Options struct.
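For context, a minimal sketch of how a caller might set the new option when opening a database; the tsdb.Open signature and DefaultOptions are taken from this repository at the time, the option placement follows the merged PR, and the 1 MiB value is only an example:

package main

import (
	stdlog "log"

	kitlog "github.com/go-kit/kit/log"
	"github.com/prometheus/tsdb"
)

func main() {
	// Start from the library defaults and override only the new option.
	opts := *tsdb.DefaultOptions
	opts.WALSegmentSize = 1 * 1024 * 1024 // 1 MiB segments instead of the 128 MiB default

	db, err := tsdb.Open(
		"/var/prometheus/data", // data directory; the WAL lives in <dir>/wal
		kitlog.NewNopLogger(),  // logger
		nil,                    // prometheus.Registerer; nil skips metric registration
		&opts,
	)
	if err != nil {
		stdlog.Fatal(err)
	}
	defer db.Close()
}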

Results

With this I can limit the WAL segment file size (I set 1048576, for example). Smaller files are written and the oldest ones are truncated, so the WAL directory keeps an acceptable size on the tmpfs partition.

I prepared an associated patch on the Prometheus server project.

It works perfectly on my Raspberry Pi now ✨

Output of the Prometheus server on the Raspberry Pi when it crashed:
level=warn ts=2018-11-10T03:17:09.556470872Z caller=scrape.go:804 component="scrape manager" scrape_pool=teleinfo target=http://localhost:2112/metrics msg="append failed" err="write to WAL: log samples: write /var/prometheus/data/wal/00000000: no space left on device"
panic: runtime error: slice bounds out of range

goroutine 180 [running]:
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.(*WAL).log(0x33eaa20, 0x3d40000, 0x39, 0x400, 0x1, 0xfb9f8d30, 0x166)
        /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal/wal.go:480 +0x3ac
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.(*WAL).Log(0x33eaa20, 0x39ebd54, 0x1, 0x1, 0x0, 0x0)
        /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal/wal.go:450 +0xc8
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*headAppender).log(0x344f9c0, 0x0, 0x0)
        /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:678 +0x158
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*headAppender).Commit(0x344f9c0, 0x0, 0x0)
        /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:689 +0xac
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.dbAppender.Commit(0x14f6228, 0x344f9c0, 0x30a2f50, 0x40080000, 0x0)
        /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:349 +0x24
github.com/prometheus/prometheus/storage/tsdb.appender.Commit(0x14f6468, 0x3c9db20, 0x0, 0xfb9f8d30)
        /go/src/github.com/prometheus/prometheus/storage/tsdb/tsdb.go:258 +0x24
github.com/prometheus/prometheus/storage.(*fanoutAppender).Commit(0x3bdfc20, 0x3c9db30, 0x0)
        /go/src/github.com/prometheus/prometheus/storage/fanout.go:161 +0x28
github.com/prometheus/prometheus/scrape.(*scrapeLoop).report(0x3f82640, 0x60f163df, 0xbef1b14d, 0x68485c3d, 0x5d7c, 0x233d9b0, 0x4006ce, 0x0, 0x3, 0x3, ...)
        /go/src/github.com/prometheus/prometheus/scrape/scrape.go:1127 +0x3b8
github.com/prometheus/prometheus/scrape.(*scrapeLoop).run(0x3f82640, 0x77359400, 0x0, 0x77359400, 0x0, 0x0)
        /go/src/github.com/prometheus/prometheus/scrape/scrape.go:818 +0x650
created by github.com/prometheus/prometheus/scrape.(*scrapePool).sync
        /go/src/github.com/prometheus/prometheus/scrape/scrape.go:329 +0x384

@krasi-georgiev (Contributor)

That is strange; the defaultSegmentSize should not affect the overall WAL folder size, only when a new segment is started.
The size of the WAL folder is determined by the amount of metrics you are scraping and the block time range.

So in your case, changing the defaultSegmentSize to a smaller value should result in smaller file sizes, but more segments overall.

For example, if before the change you had 10 x 128 MB = 1280 MB, then after changing the default size to 12 MB you should end up with something like 107 x 12 MB ≈ 1280 MB.

@glutamatt (Contributor, Author)

The size of the WAL folder is determined by the amount of metrics you are scraping and the block time range.

Before my patch, there was only one file in the WAL folder, /var/prometheus/data/wal/00000000.
The file kept growing (I guess up to the default segment size, but it crashed at 20 MB, the tmpfs partition size).

After the patch (and the config change), the new block folders (NOT the WAL) with chunks are written as expected every 2 hours, with just one file, 000001, in the chunks subdirectory. Perfectly fine; I guess the tsdb block duration settings do their job.

The patch + config only affect the segment size of the WAL, and how many segments exist at the same time.
So in the WAL directory, the smaller the segments are, the more numerous they are and the more often they rotate (thanks to the db.reload -> head.Truncate -> Checkpoint calls).

@glutamatt (Contributor, Author)

How I test:

I check out my patched branch.

/tmp/prometheus.yml:

global:
  scrape_interval:     5s
  evaluation_interval: 5s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: prometheus2
    static_configs:
      - targets: ['localhost:9090']
  - job_name: prometheus3
.... (I copy and paste more jobs to scrape more metrics per second)
go run cmd/prometheus/main.go  --storage.tsdb.wal-segment-size=32768 --storage.tsdb.max-block-duration=35s --storage.tsdb.min-block-duration=30s  --config.file=/tmp/prometheus.yml
cd data
watch --interval 1 'tree -suh | tail -n 50'

@krasi-georgiev (Contributor) commented Nov 12, 2018

Yeah, that is true: the checkpointing will be triggered more often, which would clear obsolete samples more often.

@gouthamve, @fabxc can you think of any negative side effects of making this configurable?

@glutamatt btw I don't see any reason why you should expose the defaultSegmentSize constant.
Also don't forget that using tmpfs for the WAL means a restart of the RPi would cause all data from the last block to be lost.

@glutamatt (Contributor, Author)

why you should expose the defaultSegmentSize constant

https://github.com/prometheus/tsdb/pull/450/files#diff-7a117cbfd5485ed35efcbb2995c37aa0R48

Also don't forget that using tmpfs for the WAL means a restart of the RPi would cause all data from the last block to be lost.

Sure, but I prefer to lose some data on a reboot rather than burn my SD card (replacing/buying another one would lead to even more "unpersisted" data 🤡).

@glutamatt (Contributor, Author)

I don't see any reason why you should expose the defaultSegmentSize constant

I preferred to stay fully backward compatible: in db.go we are in the tsdb package, so I expose the constant to set the default option, and it is used whenever the option is left at zero.

Maybe it will also be useful to someone else with a use case like mine.
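For illustration, a minimal sketch of the zero-value fallback described above, written as a small helper; the names DefaultSegmentSize and NewSize are assumptions based on the wal package around the time of this PR, and the real wiring in db.go may differ:

package tsdbexample

import (
	"path/filepath"

	kitlog "github.com/go-kit/kit/log"
	"github.com/prometheus/tsdb/wal"
)

// openWAL opens the WAL with the configured segment size, falling back to the
// exported default when the caller leaves the option at zero, so existing
// callers keep the old behaviour.
func openWAL(dir string, walSegmentSize int) (*wal.WAL, error) {
	segmentSize := wal.DefaultSegmentSize
	if walSegmentSize > 0 {
		segmentSize = walSegmentSize
	}
	// nil Registerer: skip metric registration in this sketch.
	return wal.NewSize(kitlog.NewNopLogger(), nil, filepath.Join(dir, "wal"), segmentSize)
}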

@krasi-georgiev (Contributor)

Making the WAL segment size configurable will be very useful for tests.

@glutamatt (Contributor, Author)

Hi @krasi-georgiev,
is there anything I can do to help get this update merged?
Thank you in advance.

@krasi-georgiev (Contributor)

@glutamatt sorry for the delay. We have started maintaining a changelog, so do you mind adding an entry for the API change and the newly exposed WALSegmentSize option?

https://github.com/prometheus/tsdb/blob/master/CHANGELOG.md
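For illustration, an entry along these lines would fit the changelog's existing bracket-tag style; the section heading and exact wording below are hypothetical, not the text that was eventually committed:

## master / unreleased
 - [FEATURE] New `WALSegmentSize` option to override the default WAL segment size.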

@glutamatt (Contributor, Author)

adding an entry for the API change and the newly exposed WALSegmentSize option

Done! 👌

If you can think of better wording or syntax, I'd be happy to update the changelog.

(Two review comments on CHANGELOG.md, both resolved.)
@glutamatt (Contributor, Author) commented Dec 3, 2018

The suggestions are committed and rebased with sign-off ✅
Thanks!

@krasi-georgiev (Contributor)

Sorry to be a pain, but I just remembered that we need to add a test for this new behaviour.

Something like:
Open a new DB with the smallest WALSegmentSize that makes sense (to keep the total test time short).
Start writing to the WAL and ensure that the WAL files get split at the WALSegmentSize.

If you get stuck with the test, ping me and I will try to help.
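Here is a sketch of the kind of test being asked for, assuming the testutil and labels helper packages already used by this test suite; the test actually added by this PR (TestWALSegmentSizeOption) may differ in its details:

package tsdb

import (
	"io/ioutil"
	"math/rand"
	"os"
	"path/filepath"
	"testing"

	"github.com/prometheus/tsdb/labels"
	"github.com/prometheus/tsdb/testutil"
)

func TestWALSegmentSizeOption(t *testing.T) {
	dir, err := ioutil.TempDir("", "wal_segment_size")
	testutil.Ok(t, err)
	defer os.RemoveAll(dir)

	opts := *DefaultOptions
	opts.WALSegmentSize = 2 * 32 * 1024 // two 32 KiB pages per segment keeps the test short

	db, err := Open(dir, nil, nil, &opts)
	testutil.Ok(t, err)

	// Write enough samples, in small commits, to roll the WAL over several segments.
	for i := 0; i < 500; i++ {
		app := db.Appender()
		for j := 0; j < 100; j++ {
			ts := int64(i*100+j) * 10
			_, err := app.Add(labels.FromStrings("wal", "size"), ts, rand.Float64())
			testutil.Ok(t, err)
		}
		testutil.Ok(t, app.Commit())
	}
	testutil.Ok(t, db.Close())

	// Every segment except the last one should be exactly the configured size;
	// the last (active) segment may be shorter.
	files, err := ioutil.ReadDir(filepath.Join(dir, "wal"))
	testutil.Ok(t, err)
	testutil.Assert(t, len(files) > 1, "expected the WAL to be split into multiple segments")
	for i, f := range files {
		if i < len(files)-1 {
			testutil.Equals(t, int64(opts.WALSegmentSize), f.Size())
		} else {
			testutil.Assert(t, f.Size() <= int64(opts.WALSegmentSize), "last segment is larger than the configured size")
		}
	}
}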

@glutamatt (Contributor, Author)

Sorry to be a pain, but I just remembered that we need to add a test for this new behaviour.

👍

Something like:
Open a new DB with the smallest WALSegmentSize that makes sense (to keep the total test time short).
Start writing to the WAL and ensure that the WAL files get split at the WALSegmentSize.

If you get stuck with the test, ping me and I will try to help.

Here we are:
https://github.com/prometheus/tsdb/pull/450/files#diff-65348100fd3ccd8cb8fb4f6bad6bc2d9R698

@krasi-georgiev (Contributor) left a review comment

Thanks a lot, just a few minor nits.

(Two review comments on db_test.go, both resolved.)
@glutamatt (Contributor, Author)

minor nits

Minor nits done, you're welcome.

@krasi-georgiev (Contributor)

Yep, all good, thanks for your patience.

I just did the 0.3.0 release. I will wait for this to get merged in Prometheus and after that will merge your PR as well.

@codesome (Contributor) left a review comment

Small change

(One review comment on db.go, resolved.)
Signed-off-by: Glutamatt <glutamatt@live.fr>

Add unit test `TestWALSegmentSizeOption`

Signed-off-by: Glutamatt <glutamatt@live.fr>

code review feedback

Signed-off-by: Glutamatt <glutamatt@live.fr>
@krasi-georgiev krasi-georgiev merged commit 22e3aeb into prometheus-junkyard:master Dec 18, 2018
@krasi-georgiev (Contributor)

@glutamatt thanks,
Ping me when ready with the Prometheus PR.

@glutamatt deleted the draft branch on December 19, 2018, 11:23
@glutamatt (Contributor, Author)

Ping me when ready with the Prometheus PR.

Do I wait for you to tag a new minor release of tsdb before creating a PR in the Prometheus repo?
Or do I start a PR with go.mod requiring the latest master commit hash of tsdb?

If neither of those, could you tell me how to proceed, please, @krasi-georgiev? Thanks!

@krasi-georgiev (Contributor)

I think using a hash would be best. No need to block until the next release.
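For reference, a sketch of how the Prometheus-side PR could pin tsdb to a specific commit with Go modules (assuming the Go 1.11 module mode Prometheus used at the time; <commit-hash> is a placeholder, and go get records it as a pseudo-version in go.mod):

GO111MODULE=on go get github.com/prometheus/tsdb@<commit-hash>
GO111MODULE=on go mod tidy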

@glutamatt (Contributor, Author)

Let's have a first go at the brand new Go dependency manager. Thanks for the fast answer!

@krasi-georgiev (Contributor)

Thanks for your patience and great work!

@krasi-georgiev (Contributor)

@glutamatt btw Simon wrote a very good step-by-step section on how to update a module:
https://github.com/prometheus/prometheus/blob/master/CONTRIBUTING.md#dependency-management

@glutamatt (Contributor, Author)

Ping me when ready with the Prometheus PR.

Here it is:

🔔🎄🍪 prometheus/prometheus#5029

krasi-georgiev pushed a commit to prometheus/prometheus that referenced this pull request on Jan 3, 2019:
Add WALSegmentSize as an option, and the corresponding flag "storage.tsdb.wal-segment-size", to tune the max size of WAL segment files.

The underlying problem is to reduce the disk space used by WAL segment files: on a Raspberry Pi, for instance, we often want to reduce the write load on the SD card, so the WAL directory is mounted on an in-memory (space-limited) partition.

With the default max segment file size, each segment grows to 128 MB, which is too much RAM consumption on a Raspberry Pi.

The initial discussion is at prometheus-junkyard/tsdb#450.