Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assess how intrusive the dataset deprecation warnings are #3107

Closed
Tracked by #3123
merelcht opened this issue Oct 2, 2023 · 10 comments · Fixed by #3186
Closed
Tracked by #3123

Assess how intrusive the dataset deprecation warnings are #3107

merelcht opened this issue Oct 2, 2023 · 10 comments · Fixed by #3186
Assignees

Comments

@merelcht
Copy link
Member

merelcht commented Oct 2, 2023

Description

We've added deprecation warning for the dataset renames e.g. CSVDataSet to CSVDataset. It seems that these deprecation warnings appear even if you don't explicitly import/use datasets in your project. These come from kedro.io. This might be annoying to users.

Task

  • Double check when the deprecation warnings show.
  • Suggest a solution
  • Discuss whether it's worth implementing considering these warnings will be removed when 0.19.0 is released.
@merelcht
Copy link
Member Author

When using the latest main and kedro-datasets 1.7.1 users will see the following when running any Kedro commands:

WARNING  /Users/merel_theisen/anaconda3/envs/kedro/lib/python3.9/site-packages/ked warnings.py:109
                             ro/io/__init__.py:6: KedroDeprecationWarning: 'CachedDataSet' has been
                             renamed to 'CachedDataset', and the alias will be removed in Kedro 0.19.0
                               from .cached_dataset import CachedDataSet, CachedDataset

                    WARNING  /Users/merel_theisen/anaconda3/envs/kedro/lib/python3.9/site-packages/ked warnings.py:109
                             ro/io/__init__.py:16: KedroDeprecationWarning: 'LambdaDataSet' has been
                             renamed to 'LambdaDataset', and the alias will be removed in Kedro 0.19.0
                               from .lambda_dataset import LambdaDataSet, LambdaDataset

                    WARNING  /Users/merel_theisen/anaconda3/envs/kedro/lib/python3.9/site-packages/ked warnings.py:109
                             ro/io/__init__.py:17: KedroDeprecationWarning: 'MemoryDataSet' has been
                             renamed to 'MemoryDataset', and the alias will be removed in Kedro 0.19.0
                               from .memory_dataset import MemoryDataSet, MemoryDataset

                    WARNING  /Users/merel_theisen/anaconda3/envs/kedro/lib/python3.9/site-packages/ked warnings.py:109
                             ro/io/__init__.py:18: KedroDeprecationWarning: 'IncrementalDataSet' has
                             been renamed to 'IncrementalDataset', and the alias will be removed in
                             Kedro 0.19.0
                               from .partitioned_dataset import (

                    WARNING  /Users/merel_theisen/anaconda3/envs/kedro/lib/python3.9/site-packages/ked warnings.py:109
                             ro/io/__init__.py:18: KedroDeprecationWarning: 'PartitionedDataSet' has
                             been renamed to 'PartitionedDataset', and the alias will be removed in
                             Kedro 0.19.0
                               from .partitioned_dataset import (

Since these warnings come from core Kedro, they can't change anything in their code to make these warnings go away.
They can set export PYTHONWARNINGS="ignore::DeprecationWarning:kedro" in their run environment to silence the warnings. That would silence any Kedro deprecation warnings, so I would only advice doing this to users if they complain about the deprecation warnings.

I personally don't think it's the effort to change these deprecation warnings, because they'll be gone in 0.19.0 and there is a work around in case users find it super annoying.

@merelcht merelcht self-assigned this Oct 16, 2023
@merelcht
Copy link
Member Author

Let me know what you think of keeping it this way and just suggest export PYTHONWARNINGS="ignore::DeprecationWarning:kedro" if users complain.
@SajidAlamQB @lrcouto @ankatiyar @noklam @astrojuanlu @deepyaman

@noklam
Copy link
Contributor

noklam commented Oct 16, 2023

@merelcht I cannot reproduce this behavior.

Steps that I have taken:

  1. Started GitPod
  2. checkout main and install kedro-datasets==1.7.1 as suggested
  3. Try kedro info, kedro starter, I don't see any warning.

@lrcouto
Copy link
Contributor

lrcouto commented Oct 16, 2023

I got the warnings on kedro info and kedro starter, on GitPod as well. Personally, they don't bother me very much. If there was more of them I think the amount of terminal scrolling that they cause could be a hassle, but as they are, it might be ok to keep them until 0.19.0.

@merelcht
Copy link
Member Author

@merelcht I cannot reproduce this behavior.

Steps that I have taken:

  1. Started GitPod
  2. checkout main and install kedro-datasets==1.7.1 as suggested
  3. Try kedro info, kedro starter, I don't see any warning.

That's odd! I just tried on GitPod as well and I can see them 😅 Are you sure it installed kedro-datasets 1.7.1 properly?

@ankatiyar
Copy link
Contributor

ankatiyar commented Oct 17, 2023

I am okay with keeping it but I am of the opinion that we should silence the warnings coming from these imports within Kedro if we can before the release. It shows up with most of the CLI commands, even kedro info for example.

While doing the tutorial during the bootcamp, I realised that it must look very strange to a beginner user. There's at least five even after you fix everything you can with your project. And someone who is not super familiar with the Kedro code base will be confused about what LambdaDataset or CachedDataset mean.

There will also be people who will stick with the 0.18.x series for a while, and 0.18.14 being the last release of the series, it's not a great experience to see warnings that you can't do anything about, unless you silence all warnings or upgrade to 0.19.0.

I also think if you're using starters created with older versions of 0.18.x series, there'll be even more warnings - deprecation of config loaders, other DataSet->Dataset mentioned in your project's catalog. People might end up ignoring all of them instead of fixing the ones they should to make their transition to 0.19.0 smoother.

@merelcht
Copy link
Member Author

I am okay with keeping it but I am of the opinion that we should silence the warnings coming from these imports within Kedro if we can before the release. It shows up with most of the CLI commands, even kedro info for example.

I think the difficulty here is that there might be people who are using the core datasets with the old name directly, so if we don't show these warnings at all we are essentially not flagging the deprecations to users.

While doing the tutorial during the bootcamp, I realised that it must look very strange to a beginner user. There's at least five even after you fix everything you can with your project. And someone who is not super familiar with the Kedro code base will be confused about what LambdaDataset or CachedDataset mean.

There will also be people who will stick with the 0.18.x series for a while, and 0.18.14 being the last release of the series, it's not a great experience to see warnings that you can't do anything about, unless you silence all warnings or upgrade to 0.19.0.

I agree with the experience not being super nice for beginner users, but there is definitely something they can do which is export PYTHONWARNINGS="ignore::DeprecationWarning:kedro" That is also the case for anyone using 0.18.14.

I also think if you're using starters created with older versions of 0.18.x series, there'll be even more warnings - deprecation of config loaders, other DataSet->Dataset mentioned in your project's catalog. People might end up ignoring all of them instead of fixing the ones they should to make their transition to 0.19.0 smoother.

True, but those deprecation warnings can be fixed by just using the new OmegaConfigLoader and changing the datasets names. At the end of the day we can't force users to update their code and if they decide to just ignore everything which then makes it harder to migrate to 0.19.0 that isn't our fault. IMO it's worse to not show deprecations because then users don't even know things will be changing.

@SajidAlamQB
Copy link
Contributor

SajidAlamQB commented Oct 17, 2023

I'm okay with keeping the deprecation warnings as they are for now. It's going to be a non-issue once we roll out 0.19.0, so I don't think its a big issue for new users. I think for the users sticking with the 0.18.x series, while it might not be the best experience, we just need to make sure we communicate the workaround clearly to them.

@ankatiyar
Copy link
Contributor

Fair point, I don't really mind keeping things as they are.
I was specifically talking about the imports in kedro/io/__init__.py which are emanating these warnings. If someone is using the core datasets directly, they definitely should be visible.

A straightforward solution is in kedro/io/__init__.py

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    from .cached_dataset import CachedDataSet
    from .lambda_dataset import LambdaDataSet
    from .partitioned_dataset import IncrementalDataSet, PartitionedDataSet

@merelcht
Copy link
Member Author

Thanks @ankatiyar, I think that's a good suggestion! I'll create a PR for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants