feat(datasets): Added the ExternalTableDataset for Databricks #827

Open
wants to merge 40 commits into
base: main
fd0de72
added a base table implementation
MinuraPunchihewa Aug 26, 2024
0392aba
added a base table dataset implementation
MinuraPunchihewa Aug 26, 2024
38c470f
renamed the module with the base classes
MinuraPunchihewa Aug 26, 2024
d424829
refactored ManagedTable using BaseTable
MinuraPunchihewa Aug 26, 2024
7091986
refactored ManagedTableDataset using BaseTableDataset
MinuraPunchihewa Aug 26, 2024
081d5d1
removed primary_key attr from BaseTable
MinuraPunchihewa Aug 26, 2024
e2cd580
updated the format attrs of ManagedTable
MinuraPunchihewa Aug 26, 2024
33814be
implemented the _load() method of ManagedTableDataset
MinuraPunchihewa Aug 26, 2024
6f91dc8
removed unnecessary imports
MinuraPunchihewa Aug 26, 2024
410cf4a
reorganized the attrs of BaseTable
MinuraPunchihewa Aug 27, 2024
1bf0b8b
implemented create_table() in ManagedTableDataset
MinuraPunchihewa Aug 27, 2024
ee7e073
added the version attr to BaseTableDataset and updated load()
MinuraPunchihewa Aug 27, 2024
d87f4ed
updated the base and managed datasets with all attrs
MinuraPunchihewa Aug 29, 2024
a4feeff
updated the supported formats
MinuraPunchihewa Aug 29, 2024
0db5639
added external table and external table dataset implementations
MinuraPunchihewa Aug 29, 2024
e19eba4
added a val func to check for format when using upsert mode
MinuraPunchihewa Aug 29, 2024
2c46978
imported the ExternalTableDataset into the main pkg
MinuraPunchihewa Sep 3, 2024
2777ea1
improved the docstrings in the code
MinuraPunchihewa Sep 3, 2024
6e1b17e
added format to the _describe()
MinuraPunchihewa Sep 3, 2024
039f9e0
updated the save methods to incorporate partition columns
MinuraPunchihewa Sep 4, 2024
fb87133
reverted the default write_mode back to None
MinuraPunchihewa Sep 4, 2024
b7a5c33
extended the _validate_write_mode() func to include formats
MinuraPunchihewa Sep 4, 2024
577dd91
updated the save() logic to work with single or multiple partition cols
MinuraPunchihewa Sep 8, 2024
6e729e3
updated the docstrings for the datasets with missing attrs
MinuraPunchihewa Sep 8, 2024
cbb8514
introduced a location attr for creating ext tables
MinuraPunchihewa Sep 8, 2024
4a2dc6e
updated the save funcs to incorporate the locations attr
MinuraPunchihewa Sep 8, 2024
1734f44
moved the func to check if table exists to BaseTable
MinuraPunchihewa Sep 8, 2024
b03bee8
added a val func to check if location is provided if table does not e…
MinuraPunchihewa Sep 8, 2024
dc3550e
moved the val func for checking if write_mode supported to ExternalTable
MinuraPunchihewa Sep 8, 2024
78eee4d
removed the func for adding options to writer for better readability
MinuraPunchihewa Sep 21, 2024
09a8cb2
added a validation check for overwrites on ext tables
MinuraPunchihewa Sep 21, 2024
7e493c2
implemented the _save_overwrite() func for ext tables
MinuraPunchihewa Sep 21, 2024
6343226
removed mentions of a default write mode
MinuraPunchihewa Sep 21, 2024
ccc60c4
improved the docstrings
MinuraPunchihewa Sep 21, 2024
99202ee
fixed lint issues
MinuraPunchihewa Sep 21, 2024
31f385a
fixed a couple of bugs in the Table classes
MinuraPunchihewa Sep 22, 2024
5016dee
updated the _save_overwrite() logic for ext tables
MinuraPunchihewa Sep 22, 2024
77afc50
renamed the val funcs of the ext tables
MinuraPunchihewa Sep 24, 2024
e7ba0e3
updated _save_overwrite() of ext tables to handle no existing tables
MinuraPunchihewa Sep 24, 2024
849fc31
fixed bug in supporting string partition cols
MinuraPunchihewa Sep 25, 2024
6 changes: 5 additions & 1 deletion kedro-datasets/kedro_datasets/databricks/__init__.py
@@ -6,8 +6,12 @@

 # https://github.com/pylint-dev/pylint/issues/4300#issuecomment-1043601901
 ManagedTableDataset: Any
+ExternalTableDataset: Any

 __getattr__, __dir__, __all__ = lazy.attach(
     __name__,
-    submod_attrs={"managed_table_dataset": ["ManagedTableDataset"]},
+    submod_attrs={
+        "managed_table_dataset": ["ManagedTableDataset"],
+        "external_table_dataset": ["ExternalTableDataset"],
+    },
 )
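
For reviewers unfamiliar with the lazy-import pattern in this hunk: the sketch below shows roughly what registering a name in `submod_attrs` achieves, using only the standard library. It is a simplified stand-in for illustration, not the `lazy_loader` implementation. The `attach` helper, the `demo_pkg` package, and the stub class are all hypothetical names introduced for this example.

```python
import importlib
import sys
import types


def attach(package_name, submod_attrs):
    """Simplified, hypothetical stand-in for a lazy attach helper.

    Returns (__getattr__, __dir__, __all__) so that each listed attribute
    is looked up in its submodule only when it is first accessed.
    """
    # Invert {"submodule": ["Attr", ...]} into {"Attr": "submodule"}.
    attr_to_module = {
        attr: submod for submod, attrs in submod_attrs.items() for attr in attrs
    }
    __all__ = sorted(attr_to_module)

    def __getattr__(name):
        if name not in attr_to_module:
            raise AttributeError(
                f"module {package_name!r} has no attribute {name!r}"
            )
        submodule = importlib.import_module(
            f"{package_name}.{attr_to_module[name]}"
        )
        return getattr(submodule, name)

    def __dir__():
        return __all__

    return __getattr__, __dir__, __all__


# Build a throwaway in-memory package so the example runs without any files.
pkg = types.ModuleType("demo_pkg")
sub = types.ModuleType("demo_pkg.external_table_dataset")


class ExternalTableDataset:
    """Stub standing in for the real dataset class."""


sub.ExternalTableDataset = ExternalTableDataset
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.external_table_dataset"] = sub

# Module-level __getattr__/__dir__ (PEP 562) are read from the module's dict.
pkg.__getattr__, pkg.__dir__, pkg.__all__ = attach(
    "demo_pkg",
    {"external_table_dataset": ["ExternalTableDataset"]},
)

resolved = pkg.ExternalTableDataset  # triggers __getattr__ on first access
```

In the change above, adding the `"external_table_dataset"` entry is what makes `ExternalTableDataset` resolvable from the `kedro_datasets.databricks` package in the same way; a name that is not listed raises `AttributeError`.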