Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend tools selection syntax #3394

Merged
merged 7 commits into from
Dec 6, 2023

Conversation

AhdraMeraliQB
Copy link
Contributor

@AhdraMeraliQB AhdraMeraliQB commented Dec 5, 2023

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Description

Fixes #3340

TODO: If merged after #3331 add docs changes as part of this PR

Simplified (or, well, shortened, at least) the validation regex

Development notes

Here's a breakdown of the regex to make reading this easier:

"^(all|none|(( )*(\\d*|\\d( )?-( )?\\d)(( )*,( )*(\\d*|\\d( )?-( )?\\d))*( )*))$"

Space repetition: ( )* -> any number of space characters

Component A: (\\d*|\\d( )?-( )?\\d) -> several consecutive digits or two digits separated by a hyphen with zero or one spaces on each side

Component B: (( )*,( )*(\\d*|\\d( )?-( )?\\d)) -> (( )*,( )*A) -> a comma surrounded by any number of spaces, followed by component A

Altogether: "^(all|none|(( )*AB*( )*))$"

The input all or none or an instance of A followed by any number of instances of B, with any number of space characters preceding or following them

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Ahdra Merali added 2 commits December 5, 2023 15:55
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AhdraMeraliQB
Copy link
Contributor Author

Here's how ChatGPT explains the new regex 🤖 :

Here's a breakdown of the key components:

  • ^ and $: Anchors the regex to the start and end of the string, respectively.

  • all|none: Matches either the string "all" or "none".

  • (( )*(\\d*|\\d( )?-( )?\\d)(( )*,( )*(\\d*|\\d( )?-( )?\\d))*( )*): This part appears to allow for a combination of digits and ranges separated by commas. Let's break it down further:

  • (( )*(\\d*|\\d( )?-( )?\\d): This part allows for either:

    • \d*: Zero or more digits.
    • \d( )?-( )?\d: A digit, optional spaces, a hyphen with optional spaces, and another digit.
  • (( )*,( )*(\\d*|\\d( )?-( )?\\d))*( )*): This allows for zero or more occurrences of a comma followed by the same pattern as described above.

In summary, the regex allows for a combination of "all" or "none" or a sequence of digits and ranges separated by commas. The ranges can be specified as "X-Y", with optional spaces around digits and hyphens.

@DimedS
Copy link
Contributor

DimedS commented Dec 5, 2023

Here's how ChatGPT explains the new regex 🤖 :

I'm also asking it to simplify that new regex without loss in functionality, and it provide that:
"^(all|none|( *\\d+)?( *- *\\d+)?( *, *(\\d+)?( *- *\\d+)?)* *)$"

looks like it make sense

@@ -755,7 +755,7 @@ def _validate_config_file_inputs(config: dict[str, str]):

input_tools = config.get("tools", "none")
tools_validation_config = {
"regex_validator": r"^(all|none|(( )*\d*(,\d*)*(,( )*\d*)*( )*|( )*((\d+-\d+)|(\d+ - \d+))( )*))$",
"regex_validator": r"^(all|none|(( )*(\d*|\d( )?-( )?\d)(( )*,( )*(\d*|\d( )?-( )?\d))*( )*))$",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You wrote a good explanation of the regex in the PR - can you actually include it here? Using re.VERBOSE:

https://docs.python.org/3/library/re.html#re.VERBOSE

@AhdraMeraliQB
Copy link
Contributor Author

AhdraMeraliQB commented Dec 6, 2023

@DimedS

"^(all|none|( *\d+)?( *- *\d+)?( *, *(\d+)?( *- \d+)?) *)$"

I think this would allow an input of -4 or 1,2,-4 though, which we'd want to prohibit,

I'll test "^(all|none|( *\\d+)+( *- *\\d+)?( *, *(\\d+)+( *- *\\d+)?)* *)$"

AhdraMeraliQB and others added 2 commits December 6, 2023 10:31
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AhdraMeraliQB AhdraMeraliQB force-pushed the fix/extend-tools-selection-syntax branch from 2f4f4de to cfee3db Compare December 6, 2023 11:26
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Copy link
Contributor

@DimedS DimedS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @AhdraMeraliQB , very clear description, works well!

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 Great explanation.

@AhdraMeraliQB AhdraMeraliQB merged commit 073d6b8 into develop Dec 6, 2023
30 checks passed
@AhdraMeraliQB AhdraMeraliQB deleted the fix/extend-tools-selection-syntax branch December 6, 2023 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow ranges and individual values in the add-ons selection
5 participants