Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17567: Improve Stream CLI implementation #2872

Merged
merged 16 commits into from
Nov 22, 2024
Merged

Conversation

epugh
Copy link
Contributor

@epugh epugh commented Nov 19, 2024

https://issues.apache.org/jira/browse/SOLR-17567

This is to improve StreamTool based on some assessments by @malliaridis!

  • Use of raw strings in Option#getOptionValue()
  • Overlapping options (-f and -e)
  • Multiple type suppressions that can be avoided
  • A bug that causes specific bats tests to fail. done in seperate PR by @malliaridis !
  • StreamTool.LocalCatStream overrides CatStream btu does not properly set commaDelimitedFilepaths (this may be a bug)
  • Obsolete LetStream#getLetParams() was introduced
  • Function StreamTool.constructStream seems obsolete and occurrences can be replaced with implementation
  • Implementation of StreamTool.readExpression does not take leading spaces infront of comment lines into account
  • Implementation of StreamTool.readExpression does not support */ at the end of line

@github-actions github-actions bot added documentation Improvements or additions to documentation client:solrj tests labels Nov 19, 2024
@epugh
Copy link
Contributor Author

epugh commented Nov 19, 2024

@malliaridis I've marked off the items I took care of. The remaining ones I need more detail to more forward on.

Regardless, it's nice to get a bit more polish!

@epugh
Copy link
Contributor Author

epugh commented Nov 19, 2024

I've also polished up the docs a bit.

@epugh
Copy link
Contributor Author

epugh commented Nov 19, 2024

Finally! All the junit tests related to StreamTool are passing!

Copy link
Contributor

@malliaridis malliaridis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, did not have SolrOnHdfsSnapshotsTool under my radar. Did you find it throught the forbidden-api rules?

Additionally to the forbidden-api rules I have commented below, I would probably add one small "feature" to complete the comments "parsing", that would be trailing line comments like in

search(
collection,
q=*:* # this is a trailing comment
)

// or like in 

/*search(
collection,
q=*:*
)*/

It is hard to know all edge-cases in advance when it comes to comment stripping.

And thanks for responding so quick to the update requests. :D

org.apache.commons.cli.CommandLine#hasOption(java.lang.String)
org.apache.commons.cli.CommandLine#getOptionValue(java.lang.String)
org.apache.commons.cli.CommandLine#getOptionValue(java.lang.String, java.lang.String)
org.apache.commons.cli.CommandLine#getParsedOptionValue(java.lang.String, java.lang.Object)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would extend the following as well (from CommandLine):

org.apache.commons.cli.CommandLine#hasOption(char)
org.apache.commons.cli.CommandLine#getOptionValue(char)
org.apache.commons.cli.CommandLine#getOptionValue(char, java.lang.String)
org.apache.commons.cli.CommandLine#getOptionValue(char, Supplier<String>)
org.apache.commons.cli.CommandLine#getOptionValues(char)
org.apache.commons.cli.CommandLine#getOptionValues(java.lang.String)
org.apache.commons.cli.CommandLine#getParsedOptionValue(char)
org.apache.commons.cli.CommandLine#getParsedOptionValue(char, Supplier<T>)
org.apache.commons.cli.CommandLine#getParsedOptionValue(char, java.lang.Object)
org.apache.commons.cli.CommandLine#getParsedOptionValue(java.lang.String)
org.apache.commons.cli.CommandLine#getParsedOptionValue(String, Supplier<T>)

I haven't written any rules before, so I am not sure if the above have the correct format (especially Supplier<T> is likely not correct).

@epugh
Copy link
Contributor Author

epugh commented Nov 19, 2024

Wow, did not have SolrOnHdfsSnapshotsTool under my radar. Did you find it throught the forbidden-api rules?

I did!!!

@malliaridis
Copy link
Contributor

About the remaining TODOs:

Multiple type suppressions that can be avoided

Suppressions like @SuppressWarnings({"rawtypes"}) for classes like List can usually be fixed by simply using List<?> or using the known generic type if any present.

StreamTool.LocalCatStream overrides CatStream btu does not properly set commaDelimitedFilepaths (this may be a bug)

In CatStream commaDelimitedFilepaths is a class property that is also used in CatStream#toExpression, which is called by other classes. By passing passing the property as parameter in validateAndSetFilepathsInSandbox and overriding the method does not update the class property, resulting to potential inconsistent behavior / values in other classes.

Instead, it is possible to simply remove the property from the parameters of validateAndSetFilepathsInSandbox (like it was before), and make the property commaDelimitedFilepaths visible (protected modifier). That way, overriding validateAndSetFilepathsInSandbox will be possible without changing the method signature.

If the description is a bit irritating, I can create a separate PR for that. :)

@epugh
Copy link
Contributor Author

epugh commented Nov 20, 2024

Okay, this I think is almost ready to merge!

@epugh
Copy link
Contributor Author

epugh commented Nov 22, 2024

Since this is a straight up follow on from the introduction of the StreamTool in main I am not adding CHANGES.txt entry.

@epugh epugh merged commit 621165a into apache:main Nov 22, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants