Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the kedro new validation logic #3284

Closed
noklam opened this issue Nov 7, 2023 · 3 comments
Closed

Simplify the kedro new validation logic #3284

noklam opened this issue Nov 7, 2023 · 3 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@noklam
Copy link
Contributor

noklam commented Nov 7, 2023

Description

Currently, there are multiple ways to validate the inputs

  1. Cookiecutter regex
  2. _validate_config_file_inputs (also regex based)
  3. _validate_selection inside _parse_add_ons_input

They are only used for path of the flow. There are 4 different way to create a project
a. kedro new - interactive prompt
b. kedro new --addon
c. kedro new --config - take a YAML file
(d). kedro new --starter, for example prompts.yml share the regex_validator for addon and starters.

You can find more details of the flow below.

The problem here is:

  • _validate_config_file_inputs is almost exactly the same as prompts.yml, which is only used for (c).
  • regex is not flexible, because in some case it is much easier to validate after parsing. i.e. how do you write a regex for 1,2,3,4,5 | none | all | 1-5? It is much easier if you parse it first - then we can check if the selection exist or not
  • Hard to maintain, everytime there are changes you need to keep all these logic in sync. We have run into a lot of edge case and https://github.com/kedro-org/kedro/pull/3263/files is a good example. We just want to validate a single input but it requires

In addition, we should also move the logic out to a addon.py, it will be more obvious which logic are shared between "starter" and "addon", right now they are blurred in a single starter.py.

Extra explanation

@AhdraMeraliQB
Hope this has more clarity:
Path 1:
kedro new -> starters.py -> add ons flow with prompts -> cookiecutter validation (regex) [1] -> starters.py -> parse add-ons input[1.5] -> validate_range[1.5] -> validate_selection [2] -> add-ons magic happens
Path 2:
kedro new --config=new_project_config.yml -> starter.py -> check config has necessary prompt info: validate_config_against_prompts [3] -> check inputs in the config file are valid: validate_config_file_inputs [4] -> add ons magic happens
[1] - pre-existing flow, used with new kedro project name
[1.5] - new addition to validate add-ons as regex alone is insufficient
[2] - new addition to validate add-ons as regex alone is insufficient
[3] - preexisting flow
[4] - added recently but necessary with or without add-ons, as all config inputs are subject to some restrictions that were not previously checked and would cause unexpected behaviour (e.g. having special characters in project names)
[4] applies the same regex constraints as [1] as it's a different input path, with a call to [1.5] and [2]

Note: this isn't really pressing at the moment, if we end up expanding the scope of this work we may prioritise this.

Context

Possible Implementation

Possible Alternatives

@noklam noklam added the Issue: Feature Request New feature or improvement to existing feature label Nov 7, 2023
@AhdraMeraliQB
Copy link
Contributor

@DimedS can we close this after #3429 ?

@DimedS
Copy link
Contributor

DimedS commented Dec 20, 2023

@DimedS can we close this after #3429 ?

@AhdraMeraliQB thank you for mention that, looks like we still have some of described problems, regex validation still duplicates in Promps.yml and inside of our codebase. It could be done better I believe, but in my opinion right now it's acceptable. Let's ask @noklam also, it was his ticket.

@noklam
Copy link
Contributor Author

noklam commented Jan 11, 2024

Closing this as it is already outdated and mostly addressed.

@noklam noklam closed this as completed Jan 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue: Feature Request New feature or improvement to existing feature
Projects
Archived in project
Development

No branches or pull requests

3 participants