-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
operator debug: write NDJSON for large collections #14610
Conversation
The `operator debug` command writes JSON files from API responses as a single line containing an array of JSON objects. But some of these files can be extremely large (GB's) for large production clusters, which makes it difficult to parse them using typical line-oriented Unix command line tools that can stream their inputs without consuming a lot of memory. For collections that are typically large, instead emit newline-delimited JSON. This changeset includes some first-pass refactoring of this command. It breaks up monolithic methods that validate a path, create a file, serialize objects, and write them to disk into smaller functions, some of which can now be standalone to take advantage of generics.
e5e9cf6
to
cc08926
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
command/operator_debug.go
Outdated
filePath := filepath.Join(dirPath, filename) | ||
|
||
// Ensure parent directories exist | ||
err := os.MkdirAll(dirPath, os.ModePerm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is https://github.com/hashicorp/nomad/blob/main/helper/escapingfs/escapes.go#L116
Which also sets permissions to a more reasonable 0755
; os.ModePerm
is just a mask of permission bits, resulting in 0777
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 5423f92, and I've gone thru the rest of the file and fixed the other uses cases of the same thing.
(I also fixed all the places we were doing \"%s\"
in error messages while I was in there 😁 )
if err != nil { | ||
return fmt.Errorf("failed to write to file: %w", err) | ||
} | ||
_, err = writer.Write([]byte{'\n'}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does NDJSON say anything about newlines on Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the \r
is a "MAY" according to the spec, but my thinking here is that while it's entirely possible an end-user is creating the debug bundle from a Window OS, the bundle is most likely to be debugged on a non-Windows OS and having that extra \r
is just going to add pain to the debugging workflow.
ref https://github.com/ndjson/ndjson-spec#31-serialization
Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A). The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
The
operator debug
command writes JSON files from API responses as a single line containing an array of JSON objects. But some of these files can be extremely large (GB's) for large production clusters, which makes it difficult to parse them using typical line-oriented Unix command line tools that can stream their inputs without consuming a lot of memory.For collections that are typically large, instead emit newline-delimited JSON.
This changeset includes some first-pass refactoring of this command. It breaks up monolithic methods that validate a path, create a file, serialize objects, and write them to disk into smaller functions, some of which can now be standalone to take advantage of generics.
There are mostly likely a bunch more improvements we can make here to memory usage, including using pagination so we're not beating on the servers so much (although that makes the CLI more sensitive to server version). But also I didn't want to try to rework the whole darn thing at once 😀