Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clp-s: Add option to record metadata about decompressed archive chunks. #485

Merged
merged 7 commits into from
Jul 23, 2024

Conversation

gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Jul 18, 2024

Description

This PR introduces the option to record metadata about decompressed archive chunks to mongodb for the purpose of log viewing. The changes are localized to CommandLineArguments and JsonConstructor.

We record the start and end timestamp of each chunk in the start and end index fields of the metadata since that is the information that will be used to look up and order chunks for viewing in the context of clp-s.

Validation performed

  • Validated that all decompression modes still work correctly when metadata not recorded
  • Validated that errors are reported as expected with invalid mongodb uri and collection arguments
  • Validated that expected metadata gets written to mongodb when valid uri and collection are passed

@@ -349,6 +364,18 @@ CommandLineArguments::parse_arguments(int argc, char const** argv) {
throw std::invalid_argument("ordered-chunk-size must be used with ordered argument"
);
}

if (m_mongodb_uri.empty() ^ m_mongodb_collection.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a brief comment on why you use "xor"? like "either be both unspecified, or must be both specified"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add a comment

@@ -15,5 +15,14 @@ constexpr char cArchiveArrayDictFile[] = "/array.dict";
constexpr char cArchiveLogDictFile[] = "/log.dict";
constexpr char cArchiveTimestampDictFile[] = "/timestamp.dict";
constexpr char cArchiveVarDictFile[] = "/var.dict";

namespace results_cache { namespace decompression {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, and the way that would probably make sense if we did want to share this stuff between clo/clp-s is to just have a utility class that creates the required BSON objects and hides these field names. Right now we do take some classes from clp, but it's in a kind of nasty way where we have to manually add all of the required files to our build target. Really what we should do is have a build target that builds a library of utilities that all of our different binaries can use.

Since that's a bit of a bigger refactor I wanted to avoid including it in this PR. Plus the namespace styles used by clo are inconsistent with what we do here, and I want to avoid mixing styles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@gibber9809 gibber9809 requested a review from haiqi96 July 21, 2024 19:04
@haiqi96
Copy link
Contributor

haiqi96 commented Jul 22, 2024

can you please also test and make sure that when a valid mongodb uri is given, clp-s can write the fields properly to the collection?

haiqi96
haiqi96 previously approved these changes Jul 22, 2024
Copy link
Contributor

@haiqi96 haiqi96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

components/core/src/clp_s/clp-s.cpp Show resolved Hide resolved
components/core/src/clp_s/archive_constants.hpp Outdated Show resolved Hide resolved
components/core/src/clp_s/CommandLineArguments.cpp Outdated Show resolved Hide resolved
components/core/src/clp_s/CommandLineArguments.cpp Outdated Show resolved Hide resolved
components/core/src/clp_s/CommandLineArguments.cpp Outdated Show resolved Hide resolved
components/core/src/clp_s/CommandLineArguments.cpp Outdated Show resolved Hide resolved
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing

components/core/src/clp_s/JsonConstructor.hpp Outdated Show resolved Hide resolved
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
@gibber9809 gibber9809 requested a review from kirkrodrigues July 23, 2024 18:55
@haiqi96
Copy link
Contributor

haiqi96 commented Jul 23, 2024

looks good to me

@kirkrodrigues kirkrodrigues changed the title clp-s: Add option to record metadata about decompressed archive chunks clp-s: Add option to record metadata about decompressed archive chunks. Jul 23, 2024
@gibber9809 gibber9809 merged commit 2a6218e into y-scope:main Jul 23, 2024
13 checks passed
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
…s. (y-scope#485)

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants