Replace and deprecate `DataSet` use in class names #2500
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
@deepyaman I'm not sure it actually makes sense to do the rename on the Kedro framework side. It's a breaking change and we're removing all datasets from here when releasing |
@merelcht Won't the datasets in |
Ignore me, I should've had my morning coffee before commenting 😂 We should rename all of them, but it will be a breaking change so maybe it's better to wait until we start work on |
If we do a deprecation warning, what will users see and how often? It might be annoying if there's nothing they can do about it.
If this functionality were to be released, they would get a warning if they used |
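On the "how often" question, a rough sketch of what Python's standard `warnings` filters do with a `DeprecationWarning`, independent of Kedro. The `use_old_name` function is a hypothetical stand-in for any code path that touches a deprecated alias, and its message only mirrors the wording reviewed in this PR; it is not Kedro's code.

```python
import warnings


def use_old_name():
    # Hypothetical stand-in for code that references a deprecated alias.
    warnings.warn(
        "MemoryDataSet has been renamed to MemoryDataset, and the "
        "alias will be removed in Kedro 0.19.0",
        DeprecationWarning,
        stacklevel=2,
    )


def count_warnings(action):
    """Call use_old_name() three times under one filter action and count what is reported."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DeprecationWarning)
        for _ in range(3):
            use_old_name()
    return len(caught)


always_count = count_warnings("always")   # every call is reported
ignored_count = count_warnings("ignore")  # nothing is reported
```

So users are not stuck with the noise: the same filters can be set from the command line (`python -W ignore::DeprecationWarning`) or through the `PYTHONWARNINGS` environment variable.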
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
…into rename-data-set
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
c0ebba2 to 2d41aec
You are truly the patron saint of consistency @deepyaman and I'm happy to approve this change because it seems wholesale and straightforward. However, I'm not able to speak for @merelcht and @idanov in terms of whether it's the right point to merge this change or the implications thereof. But if they're happy with the intent, I'm happy with the execution.
kedro/utils.py
Outdated
        if alias is not None:
            warn(
                f"{mcs.__name__} has been renamed to {alias.__name__}, and the "
                f"alias will be removed Kedro 0.19.0",
-                f"alias will be removed Kedro 0.19.0",
+                f"alias will be removed in Kedro 0.19.0",
Ah, thanks! I fixed it in a separate place but forgot to add it back in here. :)
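For readers who have not seen the pattern under review, here is a minimal sketch of a class-alias deprecation done with a metaclass. It is not the `DeprecatedClassMeta` from `kedro/utils.py`; `DeprecatedAliasMeta`, the `_alias` attribute, and the toy `MemoryDataset` class are all hypothetical names chosen for illustration, and only the warning text follows the snippet above.

```python
import warnings


class DeprecatedAliasMeta(type):
    """Hypothetical metaclass: a class that sets `_alias` acts as a deprecated name."""

    def __call__(cls, *args, **kwargs):
        target = cls.__dict__.get("_alias")
        if target is None:
            # Not an alias: construct the class normally.
            return super().__call__(*args, **kwargs)
        warnings.warn(
            f"{cls.__name__} has been renamed to {target.__name__}, and the "
            f"alias will be removed in Kedro 0.19.0",
            DeprecationWarning,
            stacklevel=2,
        )
        # Instantiating the old name returns an instance of the new class.
        return target(*args, **kwargs)

    def __instancecheck__(cls, instance):
        target = cls.__dict__.get("_alias")
        if target is None:
            return super().__instancecheck__(instance)
        # isinstance(obj, OldName) keeps working for instances of the new class.
        return isinstance(instance, target)


class MemoryDataset:
    """Toy stand-in for the renamed class."""

    def __init__(self, data=None):
        self.data = data


class MemoryDataSet(metaclass=DeprecatedAliasMeta):
    """Deprecated spelling, kept only as an alias."""

    _alias = MemoryDataset


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dataset = MemoryDataSet(data=[1, 2, 3])
```

In this sketch the warning fires on each instantiation through the old name, and existing `isinstance` checks against the old name continue to pass. Subclassing the old name is not handled here.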
Thanks for persisting with this @deepyaman ! I'm happy for this to get merged ⭐
If it's OK with @merelcht it's a 👍 from me too.
RTD failing with:
* Replace and deprecate `DataSet` use in class names
* Replace another format string with an f-string
* Perform deprecations for cached, lambda, and partitioned datasets
* Deprecated `MemoryDataSet` in favor of `MemoryDataset`
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* Fix keyword argument to specify metaclass on `CachedDataSet`
* Fix reference to `PartitionedDataset`
* Keep `AbstractDataSet` subscriptable
* Update __init__.py files, __all__ definitions, etc
* Warn of impending Kedro 0.19 (not abstract future)
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
* Update `VideoDataSet` to `VideoDataset` (and refs)
* Add missing `kedro.utils.DeprecatedClassMeta` imps
* Change deprecated references to `AbstractDataSet`
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* Warn of impending Kedro 0.19 (not abstract future)
* Rename `pandas.CSVDataSet` to `pandas.CSVDataset`
* Fix some pylint errors and blacken code
* Update `dask.ParquetDataSet`
* Undo changes for `VideoDataSet`, inherit from new base
* Undo changes to `APIDataSet`, inherit from new base
* Fix some imports and missed references
* Undo changes to `BioSequenceDataSet`, inherit from new base
* Undo changes to Dask and Pandas datasets, inherit from new bases
* Remove the `AbstractDataset` and `AbstractVersionedDataset` alias, update `kedro.io.core`
* Undo changes in `kedro/extras/datasets`
* Update branch
* Change `DataSetError` to `DatasetError`
* Remove deprecated aliases for Abstract*DataSet
* Change `DataSetError` to `DatasetError` in tests/
* Change DataSet*Error to Dataset*Error in tests/
* Fix references to DataSet in a lot of tests
* Change `CSVDataset` back to `CSBVDataSet`
* Rename core datasets used across `tests` directory
* Fix "Saving 'None' to a 'DataSet' is not allowed." messages
* Fix `test_http_filesystem_no_versioning` everywhere
* Fix removal of "data"
* Deprecate `_SharedMemoryDataSet` in favor of `_SharedMemoryDataset`
* Fix tests/pipeline/test_pipeline_from_missing.py
* Fix list datasets test
* Change patched IncrementalDataSet to IncrementalDataset
* Fix default checkpoint dataset
* Fix data catalog tests
* Fix error message
* Use `MemoryDataset`, not `MemoryDataSet`, by default
* Use `MemoryDataset`, not `MemoryDataSet`, for missing datasets in data catalog
* Rename DefaultDataSet key to DefaultDataset
* Change `LambdaDataSet` to `LambdaDataset` in `test_node_run.py`
* Update error message--but should I?
* Update error message--but should I?
* Update error message in kedro/io/core.py--but should I?
* Update RELEASE.md
* Fix remaining tests
* Fix lint issues
* Align capitalization
* Add `DeprecatedClassMeta` tests from StackOverflow
* Blacken kedro/utils.py
* Ignore "No value for argument 'subclass' in unbound method call"
* Rename 'foo' to 'value' to satisfy linter
* Disable pylint messages on deprecated class definitions
* Blacken kedro/utils.py
* Wrap `kedro.io.core` to fix error deprecation
* Simplify deprecation of error names to try to fix docs
* Undo attempt to make docs pass
Revert "Simplify deprecation of error names to try to fix docs" This reverts commit e9294be.
Revert "Wrap `kedro.io.core` to fix error deprecation" This reverts commit db9b7ac.
* Replace `DataSetError` with `DatasetError` in test
* Add missing "in" to a `DeprecationWarning` message
* Add "Dataset" versions of errors to `kedro.io` doc
* Add updated "Dataset" names to `kedro.io.rst` and sort the entries
* Add `_SharedMemoryDataset` to type targets in conf
---------
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
The same commit message as above, with an additional sign-off:
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
* Replace and deprecate `DataSet` use in class names (#2500), with the same change list as the commit message above
* Update `DataSetError` to `DatasetError` in test_api_dataset.py
---------
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Description
Deprecate core "DataSet" names and warn about the impending rename to "Dataset".
Once 0.19.0 is released, the deprecated aliases (and even the utils, if so desired) can be removed.
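Because the old names go away once the aliases are removed, downstream code that has to run against releases on both sides of the rename may want to resolve the class by name. A small sketch under stated assumptions: `pick_class` is a hypothetical helper, and the two `types.ModuleType` objects are stand-ins for an older and a newer release of a package, so the example runs without Kedro installed.

```python
import types


def pick_class(module, new_name, old_name):
    """Return module.<new_name> if it exists, otherwise fall back to module.<old_name>."""
    try:
        return getattr(module, new_name)
    except AttributeError:
        return getattr(module, old_name)


# Stand-in for a release that only knows the old spelling.
old_release = types.ModuleType("fake_io_old")
old_release.MemoryDataSet = type("MemoryDataSet", (), {})

# Stand-in for a release that has the new name plus a deprecated alias.
new_release = types.ModuleType("fake_io_new")
new_release.MemoryDataset = type("MemoryDataset", (), {})
new_release.MemoryDataSet = new_release.MemoryDataset

from_old = pick_class(old_release, "MemoryDataset", "MemoryDataSet")
from_new = pick_class(new_release, "MemoryDataset", "MemoryDataSet")
```

Trying the new name first means the deprecated alias is never touched on releases that already ship the rename, so no deprecation warning is triggered there.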