DeltaTable class name collision #438

Closed
MrPowers opened this issue Sep 20, 2021 · 3 comments
Labels
binding/python Issues for the Python package enhancement New feature or request

Comments

@MrPowers
Contributor

Description

We're building a Dask Delta Lake reader. I was writing an integration test that wrote out a Delta table with delta and read it back in with delta-rs. The test behaved unexpectedly because a class named DeltaTable is defined in both libraries.

I don't think these libraries will normally be used together, but I think it's wise to avoid name collisions in any case. A different name would also make StackOverflow questions and debugging via Google searches easier.

Thanks for making a great lib.

@MrPowers MrPowers added the enhancement New feature or request label Sep 20, 2021
@houqp
Member

houqp commented Sep 20, 2021

Thanks @MrPowers for the report. The SEO concern is certainly valid, and I don't have a good solution in mind. You are right that the two libraries aren't intended to be used together. For the rare case where they are, we will simply have to rely on Python package namespaces or import one of the DeltaTable classes under an alias.
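
A minimal sketch of the aliasing approach is below. So that it runs without either library installed, it registers two stand-in modules; the names `spark_delta_standin` and `native_delta_standin` and the `describe` method are hypothetical and exist only for illustration. With the real packages, the imports would presumably be `from delta.tables import DeltaTable as SparkDeltaTable` and `from deltalake import DeltaTable as NativeDeltaTable`.

```python
import sys
import types


def _make_standin(module_name, engine):
    """Register a throwaway module that exposes its own DeltaTable class."""
    module = types.ModuleType(module_name)

    class DeltaTable:
        def __init__(self, path):
            self.path = path

        def describe(self):
            # Hypothetical helper, only here to tell the two classes apart.
            return f"{engine} DeltaTable at {self.path}"

    DeltaTable.__module__ = module_name
    module.DeltaTable = DeltaTable
    sys.modules[module_name] = module


# Hypothetical stand-ins for the Spark-based and the native library.
_make_standin("spark_delta_standin", "spark")
_make_standin("native_delta_standin", "native")

# Option 1: give each class a distinct alias at import time.
from spark_delta_standin import DeltaTable as SparkDeltaTable
from native_delta_standin import DeltaTable as NativeDeltaTable

# Option 2: import the module and always use the qualified name.
import native_delta_standin

writer_side = SparkDeltaTable("/tmp/example_table")
reader_side = NativeDeltaTable("/tmp/example_table")
qualified = native_delta_standin.DeltaTable("/tmp/example_table")

print(writer_side.describe())
print(reader_side.describe())
```

Either option keeps both classes usable in the same module; the alias form is shorter, while the qualified form makes the origin visible at every call site.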

Another way to look at this is to treat DeltaTable as a generic search term, like DataFrame, and let users search for DeltaTable together with the project name, for example "pyspark DeltaTable" or "native DeltaTable". There is also another pure Python deltalake implementation that uses DeltaTable as its class name, so even if we rename ours, people will still get confused.

@houqp houqp added the binding/python Issues for the Python package label Sep 20, 2021
@MrPowers
Contributor Author

MrPowers commented Oct 5, 2021

@houqp - I'm building a dask-interop project that writes out Delta files with Spark and reads them in with delta-rs. I just created separate conda environments as a workaround. Not ideal, but as you mentioned, it's not a common issue, so I think it's fine as-is.

On a happy note, I was able to use the delta-rs API to easily read a Delta table into a Dask DataFrame. Great work building this lib!

@MrPowers MrPowers closed this as completed Oct 5, 2021
@houqp
Member

houqp commented Oct 5, 2021

Interesting. @MrPowers, do you have to create a separate conda env? Shouldn't the DeltaTable class be namespaced by the Python package?
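
For context on the namespacing question, here is a hedged sketch of one way a collision can still surface inside a single module: two unqualified `from ... import DeltaTable` statements bind the same local name, so the later import silently replaces the earlier one. Whether this is what happened in the test above is not stated in the thread. The module names `first_standin` and `second_standin` are hypothetical stand-ins, not real packages.

```python
import sys
import types


def register_standin(module_name):
    """Register a throwaway module exposing its own DeltaTable class."""
    module = types.ModuleType(module_name)
    module.DeltaTable = type("DeltaTable", (), {"__module__": module_name})
    sys.modules[module_name] = module


# Hypothetical stand-ins for two libraries that both define DeltaTable.
register_standin("first_standin")
register_standin("second_standin")

from first_standin import DeltaTable
bound_after_first = DeltaTable.__module__

# This rebinds the same local name; the first class is no longer reachable
# through the bare name DeltaTable.
from second_standin import DeltaTable
bound_after_second = DeltaTable.__module__

# The classes themselves stay distinct inside their own packages.
import first_standin
import second_standin

print(bound_after_first, bound_after_second)
print(first_standin.DeltaTable is second_standin.DeltaTable)
```

Inspecting `DeltaTable.__module__` is a quick way to check which library a bare name currently refers to when a test behaves unexpectedly.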
