Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clp-s: Add support for projecting of a subset of columns during search. #510

Merged
merged 16 commits into from
Oct 7, 2024

Conversation

gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Aug 9, 2024

Description

This PR adds support for basic projection during search. We allow the user to specify which columns they want to be marshalled for matching search results using the --projection command line option.

For example
clp-s s archives '*' --archive-id XYZ --projection @timestamp level msg.error
Would return all records in the archive XYZ, showing only the @timestamp, level, and msg.error fields.

Note that if a record doesn't contain all of columns we are asked to project we simply output the record without doing anything about the missing columns. In our example above if a record were missing all three of the fields we asked for it would simply be marshalled as {}.

Other things to note are that we allow projected columns to be polymorphic, we do not allow the set of projected columns to contain wildcards, and we do not allow projection to travel inside of arrays. Specifically if you have an array a containing objects with the field b then asking to project "a" is legal (leading to the whole array being marshalled), but asking for "a.b" is illegal.

Also note that unlike languages like SQL the names of columns being projected are case sensitive.

We can add support for things like case-insensitive column matching, advanced handling for records that don't contain a column we're asking for, polymorphism, and more in future PRs.

The implementation is quite straightforward. We take in a set of columns, expand them against the MPT to find all matching nodes, and then check if nodes in the schema of an ERT belong to that set when deciding whether to marshal the corresponding column.

Validation performed

  • validated that requested set of columns gets marshalled
  • validated that empty JSON object is produced when matching record contains none of the requested columns
  • manually validated that projected data matches expected data for small test cases
  • measured significant speedup when asking for columns representing a small fraction of the data

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a command-line option --projection for specifying projected columns.
    • Added functionality to handle projections in search operations, allowing users to define which columns to return.
    • New Projection class for managing and resolving column projections.
  • Bug Fixes

    • Improved logic in JSON document handling to ensure proper closure.
  • Enhancements

    • Updated ArchiveReader and SchemaReader to utilize projection information for schema processing.
    • Enhanced methods for accessing and managing projection columns in relevant classes.

@gibber9809 gibber9809 requested a review from wraymo August 11, 2024 23:32
@gibber9809
Copy link
Contributor Author

Seems like lint-check (ubuntu-latest) is failing for reasons unrelated to this PR. It appears to be complaining about formatting in a bunch of clp and glt files.

@kirkrodrigues
Copy link
Member

Seems like lint-check (ubuntu-latest) is failing for reasons unrelated to this PR. It appears to be complaining about formatting in a bunch of clp and glt files.

Issue being tracked in #545. You can merge main to temporarily resolve the (new) violations.

Copy link
Contributor

@wraymo wraymo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Do you want to point out --projection can only be used after all positional options or do you want to support something like ./clp-s s --projection a b c -- archive query? Maybe we should also add a projection case in the doc later.

* rewrite a simpler version. Probably best to write the simpler version for now since types are
* leaf-only. The type-per-token idea solves this problem (in the absence of wildcards).
*/
void resolve_columns(std::shared_ptr<SchemaTree> tree);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a description for this method?

}

private:
void resolve_column(std::shared_ptr<SchemaTree> tree, std::shared_ptr<ColumnDescriptor> column);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for this method.

Comment on lines 190 to 192
std::shared_ptr<search::Projection> m_projection{
new search::Projection{search::ProjectionMode::ReturnAllColumns}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using make_shared?

components/core/src/clp_s/search/Projection.cpp Outdated Show resolved Hide resolved
/**
* Add a column to the set of columns that should be included in the projected results
* @param column
* @throws OperationFailed if column contains a wildcard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more throw cases?

Comment on lines 428 to 433
)(
"projection",
po::value<std::vector<std::string>>(&m_projection_columns)
->multitoken()
->value_name("COLUMN_A COLUMN_B ..."),
"The set of projected columns that will be marshalled for matching results"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reduce the indent of this code block?

Co-authored-by: wraymo <37269683+wraymo@users.noreply.github.com>
Copy link
Contributor

coderabbitai bot commented Sep 25, 2024

Walkthrough

The changes introduce enhancements to the ArchiveReader, SchemaReader, and CommandLineArguments classes, primarily focusing on the handling of projections. A new Projection class is added to manage column projections, allowing for selective data retrieval based on user-defined criteria. The command-line interface is updated to support a new --projection option, enabling users to specify projected columns. Additionally, modifications are made to various methods to integrate projection handling into the schema reading and serialization processes.

Changes

Files Change Summary
components/core/src/clp_s/ArchiveReader.cpp, ArchiveReader.hpp Added m_projection parameter to initialize_schema_reader and introduced set_projection method and m_projection variable.
components/core/src/clp_s/CommandLineArguments.cpp, CommandLineArguments.hpp Added --projection command-line option and get_projection_columns method, along with m_projection_columns variable.
components/core/src/clp_s/JsonSerializer.hpp Updated end_document method to improve JSON closing logic.
components/core/src/clp_s/SchemaReader.cpp, SchemaReader.hpp Modified initialize_serializer and reset methods to incorporate projection handling with a new m_projection variable.
components/core/src/clp_s/clp-s.cpp Introduced functionality for handling projections in search_archive function based on command-line arguments.
components/core/src/clp_s/search/Projection.cpp, Projection.hpp Implemented Projection class with methods for managing and resolving column projections, including error handling.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@gibber9809 gibber9809 requested a review from wraymo September 25, 2024 16:23
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (13)
components/core/src/clp_s/search/Projection.hpp (2)

13-16: LGTM: ProjectionMode enum is well-defined.

The enum definition is clear and concise. Using uint8_t as the underlying type is appropriate for this small enum.

Consider adding a brief comment explaining the purpose of each enum value for improved clarity:

enum ProjectionMode : uint8_t {
    ReturnAllColumns,     // Return all columns regardless of selection
    ReturnSelectedColumns // Return only the columns specified in the projection
};

62-64: LGTM: Private members are well-chosen and initialized.

The use of std::vector for selected columns and absl::flat_hash_set for matching nodes is appropriate and efficient. The default projection mode of ReturnAllColumns aligns with the expected behavior.

Consider making the m_projection_mode member const since it's initialized in the constructor and doesn't appear to change:

const ProjectionMode m_projection_mode;
components/core/src/clp_s/JsonSerializer.hpp (1)

78-84: Improved robustness for closing JSON documents

The changes to the end_document() method enhance its ability to handle various scenarios, including empty documents or improperly started ones. This is a good improvement in terms of robustness.

However, we could optimize this slightly by using a ternary operator:

void end_document() {
    m_json_string.back() != '{' ? m_json_string.back() = '}' : m_json_string += '}';
}

This achieves the same result with a more concise syntax, potentially improving readability and performance.

components/core/src/clp_s/ArchiveReader.hpp (1)

190-192: Consider using std::make_shared for initialization.

While the current initialization is correct, using std::make_shared could potentially be more efficient:

std::shared_ptr<search::Projection> m_projection{
    std::make_shared<search::Projection>(search::ProjectionMode::ReturnAllColumns)
};

This approach performs a single allocation for both the Projection object and the control block, which can be more efficient than separate allocations.

components/core/src/clp_s/CommandLineArguments.hpp (1)

197-197: Consider initializing m_projection_columns in the constructor.

The new m_projection_columns member variable is appropriately typed and named, following the class's conventions. However, it's not initialized in the constructor. While this might be intentional, consider adding an empty initialization in the constructor to ensure consistent behaviour across all instances of the class.

Example:

CommandLineArguments(std::string const& program_name)
    : m_program_name(program_name), m_projection_columns() {}

This would guarantee that m_projection_columns starts as an empty vector, preventing any potential issues with uninitialized data.

components/core/src/clp_s/SchemaReader.hpp (1)

75-83: LGTM: Updated reset method signature

The addition of the projection parameter to the reset method signature is appropriate and aligns with the PR objectives to support column projection.

Suggestion: Consider updating the method documentation to include a brief description of the projection parameter.

components/core/src/clp_s/SchemaReader.cpp (2)

560-562: Good extension of projection to unordered objects

This change correctly extends the projection feature to unordered objects in the schema, maintaining consistency in applying user-specified column projections. Well done!

For improved code consistency, consider extracting the matches_node check into a separate method, as it's used in multiple places.

Here's a suggested refactor:

+bool should_include_in_local_tree(int32_t node_id) {
+    return m_projection->matches_node(node_id);
+}

 void SchemaReader::initialize_serializer() {
     // ...
     for (int32_t global_column_id : m_ordered_schema) {
-        if (m_projection->matches_node(global_column_id)) {
+        if (should_include_in_local_tree(global_column_id)) {
             generate_local_tree(global_column_id);
         }
     }

     for (auto it = m_global_id_to_unordered_object.begin();
          it != m_global_id_to_unordered_object.end();
          ++it)
     {
-        if (m_projection->matches_node(it->first)) {
+        if (should_include_in_local_tree(it->first)) {
             generate_local_tree(it->first);
         }
     }
     // ...
 }

This refactoring improves readability and makes it easier to modify the inclusion logic in the future if needed.


567-569: Good defensive programming practice

This change prevents attempting to generate a JSON template for an empty schema tree, which is an excellent defensive programming practice. It avoids potential errors and improves the robustness of the code.

To improve clarity, consider adding a brief comment explaining why this check is necessary:

+    // Only generate JSON template if we have nodes in the local schema tree
+    // This can happen if no columns match the projection
     if (false == m_local_schema_tree.get_nodes().empty()) {
         generate_json_template(m_local_schema_tree.get_root_node_id());
     }

This comment would help future maintainers understand the purpose of this check quickly.

components/core/src/clp_s/CommandLineArguments.cpp (1)

428-433: LGTM! Consider clarifying the description.

The implementation of the new --projection option looks good. It correctly allows for multiple column names and stores them appropriately.

Consider slightly modifying the description to be more explicit about its purpose:

-                    "The set of projected columns that will be marshalled for matching results"
+                    "Specify columns to include in the output for matching results"

This change makes it clearer that the option affects the output content rather than the matching process itself.

components/core/src/clp_s/search/Projection.cpp (2)

38-53: Consider moving extensive comments to documentation

The block comment from lines 38 to 53 provides valuable context but is quite lengthy. Consider moving this explanation to external documentation or a detailed comment in the header file to keep the codebase clean and maintainable.


66-66: Simplify boolean expression for enhanced readability

Using !last_token instead of false == last_token improves clarity.

Apply this diff:

-       if (false == last_token && child_node.get_type() != NodeType::Object) {
+       if (!last_token && child_node.get_type() != NodeType::Object) {
components/core/src/clp_s/clp-s.cpp (2)

198-198: Enhance Error Logging for Clarity

The current error message may lack context, making it harder to diagnose issues.

Consider adding more descriptive information to the error log to assist with debugging:

-SPDLOG_ERROR("{}", e.what());
+SPDLOG_ERROR("Failed to add projection columns: {}", e.what());

This change provides clearer insight into what operation was being performed when the error occurred.


185-189: Simplify the Initialization of projection Object

The ternary operator used for determining the ProjectionMode can be simplified for better readability.

Consider simplifying the initialization as follows:

 auto projection = std::make_shared<Projection>(
-        command_line_arguments.get_projection_columns().empty()
-                ? ProjectionMode::ReturnAllColumns
-                : ProjectionMode::ReturnSelectedColumns
+    command_line_arguments.get_projection_columns().empty() ?
+        ProjectionMode::ReturnAllColumns :
+        ProjectionMode::ReturnSelectedColumns
 );

Alternatively, assign the mode to a variable for clarity:

ProjectionMode mode = command_line_arguments.get_projection_columns().empty() ?
    ProjectionMode::ReturnAllColumns :
    ProjectionMode::ReturnSelectedColumns;
auto projection = std::make_shared<Projection>(mode);

This enhances readability by breaking down the logic into manageable parts.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3ceb17e and 8b3dbfa.

📒 Files selected for processing (11)
  • components/core/src/clp_s/ArchiveReader.cpp (1 hunks)
  • components/core/src/clp_s/ArchiveReader.hpp (3 hunks)
  • components/core/src/clp_s/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/CommandLineArguments.cpp (1 hunks)
  • components/core/src/clp_s/CommandLineArguments.hpp (2 hunks)
  • components/core/src/clp_s/JsonSerializer.hpp (1 hunks)
  • components/core/src/clp_s/SchemaReader.cpp (1 hunks)
  • components/core/src/clp_s/SchemaReader.hpp (4 hunks)
  • components/core/src/clp_s/clp-s.cpp (3 hunks)
  • components/core/src/clp_s/search/Projection.cpp (1 hunks)
  • components/core/src/clp_s/search/Projection.hpp (1 hunks)
🧰 Additional context used
cppcheck
components/core/src/clp_s/clp-s.cpp

[error] 191-191: Exception thrown in function declared not to throw exceptions.

(throwInNoexceptFunction)

🔇 Additional comments not posted (20)
components/core/src/clp_s/search/Projection.hpp (5)

1-12: LGTM: File structure and includes are well-organized.

The file header, include guards, and necessary includes are properly set up. The namespace is correctly defined, following best practices for C++ code organization.


24-39: Enhance exception documentation and consider additional throw cases.

The OperationFailed exception is well-defined, but consider expanding on potential throw cases in the add_column method documentation.

Update the method documentation to include more specific throw cases:

/**
 * Add a column to the set of columns that should be included in the projected results
 * @param column The column descriptor to add
 * @throws OperationFailed if:
 *   - column contains a wildcard
 *   - column is null
 *   - column name is empty
 *   - column name is duplicated
 */
void add_column(std::shared_ptr<ColumnDescriptor> column);

40-47: Add method description for resolve_columns.

As per the previous review comment, please add a description for this method to improve code documentation.

Add a method description:

/**
 * Resolve the added columns against the provided schema tree.
 * This method should be called after adding all desired columns and before calling matches_node.
 * @param tree The schema tree to resolve columns against
 * @throws OperationFailed if resolution fails for any column
 */
void resolve_columns(std::shared_ptr<SchemaTree> tree);

59-60: Add method description for resolve_column.

As per the previous review comment, please add a description for this private method to improve code maintainability.

Add a method description:

/**
 * Resolve a single column against the provided schema tree.
 * @param tree The schema tree to resolve the column against
 * @param column The column descriptor to resolve
 * @throws OperationFailed if resolution fails for the column
 */
void resolve_column(std::shared_ptr<SchemaTree> tree, std::shared_ptr<ColumnDescriptor> column);

49-57: LGTM: matches_node method is well-implemented.

The matches_node method is efficiently implemented as an inline function. It correctly checks the projection mode and uses the hash set for fast matching, which aligns well with the PR objectives of allowing users to specify which columns to include in the search results.

components/core/src/clp_s/ArchiveReader.hpp (2)

15-15: LGTM: Appropriate include added for new functionality.

The inclusion of "search/Projection.hpp" is necessary and correctly placed for the new projection functionality being introduced.


137-139: LGTM: Well-designed setter for projection.

The set_projection method is appropriately implemented:

  • It uses std::shared_ptr for managing shared ownership of the projection object.
  • The method is concise and follows the Single Responsibility Principle.
components/core/src/clp_s/CMakeLists.txt (1)

148-149: LGTM! The new Projection files are correctly added.

The addition of Projection.cpp and Projection.hpp to the CLP_S_SEARCH_SOURCES variable is correct and aligns with the PR objectives of adding support for projecting a subset of columns during search.

A few considerations:

  1. Ensure that any necessary include directives for these new files are added in other source files that will use the Projection functionality.
  2. If the Projection feature requires any new dependencies, make sure they are added to the target_link_libraries command for the clp-s target.
  3. Consider updating any relevant documentation or README files to reflect the new Projection feature.
components/core/src/clp_s/CommandLineArguments.hpp (2)

111-112: LGTM: Well-designed getter method for projection columns.

The new get_projection_columns() method is a well-implemented getter. It provides efficient, read-only access to the projection columns using a const reference return type. The method is appropriately marked as const, ensuring it doesn't modify the object's state.


111-112: Verify implementation of parsing and setting projection columns.

The changes to CommandLineArguments class provide a solid foundation for handling projection columns. However, the actual parsing of the --projection command-line option and setting of the m_projection_columns vector is not visible in this file.

To ensure complete implementation:

  1. Verify that the --projection option is properly parsed in the command-line parsing logic (likely in a separate file or method).
  2. Confirm that m_projection_columns is correctly populated with the parsed values.
  3. Check for any validation of the projection column names (e.g., disallowing wildcards, handling case sensitivity) as mentioned in the PR objectives.

Also applies to: 197-197

components/core/src/clp_s/SchemaReader.hpp (3)

14-14: LGTM: Include statement for Projection header

The addition of the include statement for "search/Projection.hpp" is appropriate and necessary for using the search::Projection type in the SchemaReader class.


298-298: LGTM: New member variable for projection

The addition of the m_projection member variable is consistent with the PR objectives and matches the type used in the reset method signature.


Line range hint 1-303: Overall assessment: Changes align with PR objectives

The modifications to the SchemaReader class, including the new include statement, updated reset method signature, and new m_projection member variable, are well-implemented and consistent with the PR objectives to support column projection.

These changes lay the groundwork for implementing the projection feature in the SchemaReader class, allowing for more efficient data retrieval based on user-specified columns.

Ensure that the implementation of the projection functionality in other parts of the codebase (e.g., in the reset method body and other methods that will use m_projection) is consistent with the design introduced here.

components/core/src/clp_s/ArchiveReader.cpp (1)

238-238: LGTM! Verify SchemaReader class update.

The addition of the m_projection parameter to reader.reset() aligns well with the PR objective of supporting column projection during search. This change appears to be a key component in implementing the new --projection command line option.

To ensure consistency across the codebase, please run the following script to verify the SchemaReader class update:

✅ Verification successful

Verified SchemaReader class updates. No issues found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify SchemaReader class update to handle projections

# Test 1: Check if SchemaReader class has been updated to handle projections
rg --type cpp -g 'SchemaReader.*' 'class SchemaReader' -A 10

# Test 2: Verify if reader.reset() method signature has been updated
rg --type cpp -g 'SchemaReader.*' 'void reset\(' -A 5

# Test 3: Check for any new methods related to projection handling
rg --type cpp -g 'SchemaReader.*' 'projection'

Length of output: 2715

components/core/src/clp_s/SchemaReader.cpp (2)

551-553: Excellent implementation of column projection!

This change effectively implements the projection feature, ensuring that only the columns specified by the user are included in the local tree. This aligns perfectly with the PR objective of allowing users to specify which columns to include in the search results using the --projection option.


551-569: Overall excellent implementation of the projection feature

The changes in this file successfully implement the projection feature as described in the PR objectives. The modifications allow users to specify which columns to include in the search results, and they are well-integrated into the existing code structure.

Key points:

  1. The projection is correctly applied to both ordered and unordered schema elements.
  2. Defensive programming practices have been added to handle potential edge cases.
  3. The changes are consistent with the existing codebase style and structure.

Minor suggestions for improvement have been provided in the individual comments, but these are not critical for the functionality of the feature. Great job on this implementation!

components/core/src/clp_s/CommandLineArguments.cpp (1)

Line range hint 1-1: Verify related changes in other files.

While the addition of the --projection option is implemented correctly here, there are a couple of points to consider:

  1. Ensure that the m_projection_columns member variable is properly declared in the corresponding header file (likely CommandLineArguments.hpp).
  2. Verify that the actual implementation of the projection functionality is present in the appropriate file(s) where the search results are processed and output.

To check for the member variable declaration and related implementations, you can run the following script:

This script will help identify where the member variable is declared and where the projection functionality is implemented.

✅ Verification successful

Verification Successful

The m_projection_columns member variable is properly declared in CommandLineArguments.hpp, and the projection functionality is correctly implemented across the relevant files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash

# Check for m_projection_columns declaration
echo "Checking for m_projection_columns declaration:"
rg "std::vector<std::string>\s+m_projection_columns;" --type cpp

echo -e "\nChecking for projection-related implementations:"
rg "projection|m_projection_columns" --type cpp -g '!CommandLineArguments.cpp'

Length of output: 3175

components/core/src/clp_s/clp-s.cpp (3)

32-32: Include Projection.hpp to Use Projection Functionality

The inclusion of "search/Projection.hpp" is necessary to utilize the Projection class for handling column projections.


43-43: Add StringUtils to Namespace Scope

The using directive using clp_s::StringUtils; correctly brings StringUtils into scope, allowing for cleaner code when calling its methods.


194-194: Check for Exceptions from ColumnDescriptor::create

Ensure that ColumnDescriptor::create(descriptor_tokens) handles invalid input gracefully and does not throw unexpected exceptions.

Run the following script to verify how ColumnDescriptor::create handles invalid descriptors:

Ensure that any exceptions are properly documented and managed.

✅ Verification successful

Verification Complete: Exceptions Handling in ColumnDescriptor::create Confirmed

The ColumnDescriptor::create(descriptor_tokens) implementation does not contain any throw statements, indicating that it handles invalid input without raising exceptions directly. Additionally, the surrounding code in clp-s.cpp appropriately manages potential exceptions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Inspect ColumnDescriptor::create for exception safety.

# Find the implementation of ColumnDescriptor::create
rg --type cpp 'ColumnDescriptor::create\(' -A 10

# Look for any throw statements or exception handling mechanisms

Length of output: 7727


Script:

#!/bin/bash
# Description: Inspect ColumnDescriptor::create for exception safety.

# Find the implementation files of ColumnDescriptor::create
rg --type cpp 'ColumnDescriptor::create\s*\(' > create_implementations.txt

# Extract file paths from the previous command
FILES=$(rg --type cpp --files-with-matches 'ColumnDescriptor::create\s*\(')

# Search for throw statements within these implementations
for file in $FILES; do
    echo "Checking $file for throw statements..."
    rg 'throw\s+' "$file" -C 2
done

Length of output: 1688

components/core/src/clp_s/search/Projection.hpp Outdated Show resolved Hide resolved
components/core/src/clp_s/search/Projection.hpp Outdated Show resolved Hide resolved
components/core/src/clp_s/search/Projection.hpp Outdated Show resolved Hide resolved
components/core/src/clp_s/search/Projection.hpp Outdated Show resolved Hide resolved
gibber9809 and others added 2 commits September 26, 2024 10:30
Co-authored-by: wraymo <37269683+wraymo@users.noreply.github.com>
@gibber9809 gibber9809 requested a review from wraymo September 26, 2024 14:33
wraymo
wraymo previously approved these changes Oct 4, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
components/core/src/clp_s/CommandLineArguments.cpp (1)

428-435: LGTM! Consider clarifying the projection documentation.

The new options "archive-id" and "projection" are well-implemented and properly integrated into the existing command-line argument structure. The code follows the established patterns and is consistent with the rest of the file.

A minor suggestion for improvement:

Consider adding a brief example of how to use the "projection" option in the documentation string. This could help users understand the feature more quickly. For instance:

 "Project only the given set of columns for matching results. This option must be"
 " specified after all positional options. Values that are objects or structured"
-" arrays are currently unsupported."
+" arrays are currently unsupported. Example: --projection column1 column2 column3"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between c83ebcc and b291d77.

📒 Files selected for processing (2)
  • components/core/src/clp_s/CommandLineArguments.cpp (1 hunks)
  • components/core/src/clp_s/search/Projection.hpp (1 hunks)
🔇 Additional comments (1)
components/core/src/clp_s/search/Projection.hpp (1)

63-66: Function matches_node implementation is correct.

The function correctly determines whether a node should be included based on the projection mode and matching nodes.


std::vector<std::shared_ptr<ColumnDescriptor>> m_selected_columns;
absl::flat_hash_set<int32_t> m_matching_nodes;
ProjectionMode m_projection_mode{ProjectionMode::ReturnAllColumns};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove redundant in-class initialization of m_projection_mode.

Since m_projection_mode is always initialized by the constructor, the in-class initializer is redundant and can be removed to avoid potential confusion.

Apply this diff to remove the redundant initialization:

-    ProjectionMode m_projection_mode{ProjectionMode::ReturnAllColumns};
+    ProjectionMode m_projection_mode;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ProjectionMode m_projection_mode{ProjectionMode::ReturnAllColumns};
ProjectionMode m_projection_mode;

Comment on lines +26 to +31
class OperationFailed : public TraceableException {
public:
// Constructors
OperationFailed(ErrorCode error_code, char const* const filename, int line_number)
: TraceableException(error_code, filename, line_number) {}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider renaming OperationFailed to ProjectionOperationFailed for clarity.

The exception class OperationFailed has a generic name that may lead to confusion if similar exceptions exist elsewhere in the codebase. Renaming it to ProjectionOperationFailed will make its purpose more explicit and improve readability.

Apply this diff to rename the exception class:

-    class OperationFailed : public TraceableException {
+    class ProjectionOperationFailed : public TraceableException {

Remember to update all references to OperationFailed accordingly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class OperationFailed : public TraceableException {
public:
// Constructors
OperationFailed(ErrorCode error_code, char const* const filename, int line_number)
: TraceableException(error_code, filename, line_number) {}
};
class ProjectionOperationFailed : public TraceableException {
public:
// Constructors
ProjectionOperationFailed(ErrorCode error_code, char const* const filename, int line_number)
: TraceableException(error_code, filename, line_number) {}
};

@kirkrodrigues kirkrodrigues requested a review from wraymo October 4, 2024 22:55
@wraymo wraymo merged commit 7826d73 into y-scope:main Oct 7, 2024
18 checks passed
gibber9809 added a commit to gibber9809/clp that referenced this pull request Oct 25, 2024
…h. (y-scope#510)

Co-authored-by: wraymo <37269683+wraymo@users.noreply.github.com>
Co-authored-by: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com>
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
…h. (y-scope#510)

Co-authored-by: wraymo <37269683+wraymo@users.noreply.github.com>
Co-authored-by: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants