Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] - Add KedroPipelineError and provide better error message for invalid node/pipeline definition #2734

Closed
wants to merge 3 commits into from

Conversation

noklam
Copy link
Contributor

@noklam noklam commented Jun 27, 2023

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Description

fix #2733

It's easy if the value is int type, because we can do some simple type checking, but if user pass X_train:'50' it's harder to differentiate whether it's a valid dataset.

    for dataset_name in all_inputs_outputs:
        # Check if `dataset_name` is not `str`
        name = _strip_transcoding(dataset_name)
        if name != dataset_name and name in all_inputs_outputs:
            invalid.add(name)

Originally posted by @noklam in #2733 (comment)

Development notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Jun 27, 2023

In case user pass a string as value instead of a dataset, they will just get a ValueError because the dataset does not exist. It would be nice to provide the same error consistently, but I don't think it's easy to differentiate here.

I consider this is still a win because in this case the error message is clear enough

ValueError: Pipeline input(s) {'2', '3'} not found in the DataCatalog
DataCatalog

@noklam noklam changed the title Add KedroPipelineError and provide better error message for invalid node/pipeline definition [Proposal] - Add KedroPipelineError and provide better error message for invalid node/pipeline definition Jun 28, 2023
@merelcht merelcht closed this May 20, 2024
@merelcht merelcht deleted the feat/better-error-message-pipeline branch May 20, 2024 14:54
@merelcht merelcht restored the feat/better-error-message-pipeline branch May 20, 2024 14:56
@merelcht
Copy link
Member

Closed by accident, sorry!

@astrojuanlu
Copy link
Member

#2733 was already closed in #3715, anything else to do here?

@noklam
Copy link
Contributor Author

noklam commented Jul 22, 2024

Closing in favor of #3715. I think the initial idea was mainly about the str problem though it's more generic than that.

@noklam noklam closed this Jul 22, 2024
@astrojuanlu astrojuanlu deleted the feat/better-error-message-pipeline branch July 28, 2024 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants