Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator debug: write NDJSON for large collections #14610

Merged
merged 2 commits into from
Sep 22, 2022

Conversation

tgross
Copy link
Member

@tgross tgross commented Sep 16, 2022

The operator debug command writes JSON files from API responses as a single line containing an array of JSON objects. But some of these files can be extremely large (GB's) for large production clusters, which makes it difficult to parse them using typical line-oriented Unix command line tools that can stream their inputs without consuming a lot of memory.

For collections that are typically large, instead emit newline-delimited JSON.

This changeset includes some first-pass refactoring of this command. It breaks up monolithic methods that validate a path, create a file, serialize objects, and write them to disk into smaller functions, some of which can now be standalone to take advantage of generics.


There are mostly likely a bunch more improvements we can make here to memory usage, including using pagination so we're not beating on the servers so much (although that makes the CLI more sensitive to server version). But also I didn't want to try to rework the whole darn thing at once 😀

The `operator debug` command writes JSON files from API responses as a single
line containing an array of JSON objects. But some of these files can be
extremely large (GB's) for large production clusters, which makes it difficult
to parse them using typical line-oriented Unix command line tools that can
stream their inputs without consuming a lot of memory.

For collections that are typically large, instead emit newline-delimited JSON.

This changeset includes some first-pass refactoring of this command. It breaks
up monolithic methods that validate a path, create a file, serialize objects,
and write them to disk into smaller functions, some of which can now be
standalone to take advantage of generics.
@tgross tgross added this to the 1.4.0 milestone Sep 16, 2022
@tgross tgross marked this pull request as ready for review September 16, 2022 19:36
@tgross tgross changed the title operator debug: write NJSON for large collections operator debug: write NDJSON for large collections Sep 16, 2022
Copy link
Contributor

@DerekStrickland DerekStrickland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

filePath := filepath.Join(dirPath, filename)

// Ensure parent directories exist
err := os.MkdirAll(dirPath, os.ModePerm)
Copy link
Member

@shoenig shoenig Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is https://github.com/hashicorp/nomad/blob/main/helper/escapingfs/escapes.go#L116

Which also sets permissions to a more reasonable 0755; os.ModePerm is just a mask of permission bits, resulting in 0777.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5423f92, and I've gone thru the rest of the file and fixed the other uses cases of the same thing.

(I also fixed all the places we were doing \"%s\" in error messages while I was in there 😁 )

if err != nil {
return fmt.Errorf("failed to write to file: %w", err)
}
_, err = writer.Write([]byte{'\n'})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does NDJSON say anything about newlines on Windows?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the \r is a "MAY" according to the spec, but my thinking here is that while it's entirely possible an end-user is creating the debug bundle from a Window OS, the bundle is most likely to be debugged on a non-Windows OS and having that extra \r is just going to add pain to the debugging workflow.

ref https://github.com/ndjson/ndjson-spec#31-serialization

Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A). The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.

Copy link
Member

@shoenig shoenig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@tgross tgross merged commit 349501f into main Sep 22, 2022
@tgross tgross deleted the cli-operator-debug-ndjson branch September 22, 2022 14:02
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants