
snapshot restore-from-archive streaming and filtering #13658

Merged · 4 commits · Jul 11, 2022

Conversation

@tgross (Member) commented Jul 8, 2022

This changeset implements two improvements to restoring FSM snapshots from archives:

  • The existing implementation decompresses the archive to a temporary file before reading it into the FSM. For large snapshots this performs a lot of disk IO. This change streams decompression as the snapshot is read, without first writing to a temporary file, which also moves some of the work to a second core (see the sketch after this list).
  • Add bexpr filters to the RestoreFromArchive helper. The operator can pass these as -filter arguments to nomad operator snapshot state (and other commands in the future) to include only the desired data when reading the snapshot (a filtering sketch follows below).
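
For illustration, here is a minimal sketch of the streaming approach, assuming a gzip-compressed payload and using an `io.Pipe` so decompression runs in its own goroutine; the function name and archive layout are simplifications, not Nomad's actual `RestoreFromArchive`:

```go
package main

import (
	"compress/gzip"
	"io"
	"os"
)

// streamDecompress returns a reader that yields decompressed bytes as the
// archive is read, instead of inflating to a temporary file first. Because
// decompression runs in its own goroutine, it can occupy a second core
// while the consumer (e.g. the FSM restore) parses concurrently.
func streamDecompress(archive io.Reader) io.ReadCloser {
	pr, pw := io.Pipe()
	go func() {
		gz, err := gzip.NewReader(archive)
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		defer gz.Close()
		// Copy decompressed bytes into the pipe; a nil error here
		// closes the pipe cleanly so the reader sees io.EOF.
		_, err = io.Copy(pw, gz)
		pw.CloseWithError(err)
	}()
	return pr
}

func main() {
	f, err := os.Open("snapshot.snap") // hypothetical archive path
	if err != nil {
		panic(err)
	}
	defer f.Close()

	r := streamDecompress(f)
	defer r.Close()
	// Hand r to the snapshot decoder / FSM restore; here we just drain it.
	if _, err := io.Copy(io.Discard, r); err != nil {
		panic(err)
	}
}
```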

Deferred for this PR: the nomad operator snapshot state command still has to load all the filtered data into the FSM before writing it out as one large JSON blob. We should provide a tool that streams the decoded objects directly to an encoder without loading them into the FSM, so that we can emit NDJSON, write out to a sqlite DB, etc. (a combined filter-and-stream sketch follows below).
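
As a rough sketch of how the `-filter` expressions can be applied per object, and of the deferred stream-to-encoder idea, here is an example using `github.com/hashicorp/go-bexpr` with `encoding/json` emitting NDJSON; the `Alloc` struct and the pre-decoded slice are hypothetical stand-ins for the snapshot's real object types and decode loop:

```go
package main

import (
	"encoding/json"
	"os"

	"github.com/hashicorp/go-bexpr"
)

// Alloc is a hypothetical stand-in for one of the object types decoded
// from the snapshot stream.
type Alloc struct {
	ID     string
	JobID  string
	NodeID string
}

func main() {
	// The same boolean-expression syntax the -filter flag accepts.
	eval, err := bexpr.CreateEvaluator(
		`JobID == "job1" or NodeID == "3b3471d7-c519-8e3c-d7fd-dc692ca44744"`)
	if err != nil {
		panic(err)
	}

	decoded := []Alloc{ // stand-in for objects decoded from the stream
		{ID: "a1", JobID: "job1", NodeID: "n1"},
		{ID: "a2", JobID: "other", NodeID: "n2"},
	}

	// json.Encoder writes one object per line, i.e. NDJSON. In the
	// deferred design, matching objects would stream here directly
	// instead of being loaded into the FSM first.
	enc := json.NewEncoder(os.Stdout)
	for _, a := range decoded {
		match, err := eval.Evaluate(a)
		if err != nil {
			panic(err)
		}
		if match {
			if err := enc.Encode(a); err != nil {
				panic(err)
			}
		}
	}
}
```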


Example:

Starting with a 439MB snapshot (~13GiB uncompressed), I want to filter for all objects associated with 3 different jobs and 3 different nodes:

$ time nomad operator snapshot state -filter '
    JobID == "job1" or
    JobID == "job2" or
    JobID == "job3" or
    NodeID == "3b3471d7-c519-8e3c-d7fd-dc692ca44744" or
    NodeID == "455775de-b4b4-0cb6-75eb-6c534618a005" or
    NodeID == "0d8e2a62-2712-cb4c-fb15-9831fdac57fe" or
    ID == "job1" or
    ID == "job2" or
    ID == "job3" or
    ID == "3b3471d7-c519-8e3c-d7fd-dc692ca44744" or
    ID == "455775de-b4b4-0cb6-75eb-6c534618a005" or
    ID == "0d8e2a62-2712-cb4c-fb15-9831fdac57fe"
' \
      ./nomad_operator_snapshot_save_2022_05_12_1543-0700.snap \
      > filtered-state.json

real    24m15.805s
user    29m55.036s
sys     7m50.618s

$ cat filtered-state.json | jq '.Allocs | length'
5490
$ cat filtered-state.json | jq '.Evals | length'
667

Previously this would write ~13GiB to disk, read 14GiB from disk, and saturate 1 core for over an hour before running out of memory on my machine (16GiB) and crashing.

With this change, the command reads ~450MiB from disk, only writes the 197MiB JSON blob to disk, and uses about 150% CPU, maxing out memory usage around 330MB.

Commit messages:

The `RestoreFromArchive` helper decompresses the snapshot archive to a temporary file before reading it into the FSM. For large snapshots this performs a lot of disk IO. Stream decompress the snapshot as we read it, without first writing to a temporary file.

The operator can pass bexpr filters as `-filter` arguments to `nomad operator snapshot state` (and other commands in the future) to include only desired data when reading the snapshot.
@shoenig (Member) left a comment

LGTM!

Review threads on .changelog/13658.txt, helper/raftutil/snapshot.go, and nomad/fsm.go were marked outdated and resolved.
@github-actions (bot) commented

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

The github-actions bot locked this pull request as resolved and limited conversation to collaborators on Dec 24, 2022.