Add filesystem argument for reading DeltaTable in Python binding #414

Merged
merged 3 commits into from
Aug 28, 2021

Conversation

fvaleye
Collaborator

@fvaleye fvaleye commented Aug 24, 2021

Description

  • Expose a filesystem argument for reading a DeltaTable with PyArrow in the Python binding

Related Issue(s)

#392

@fvaleye fvaleye added the binding/python Issues for the Python package label Aug 24, 2021
@fvaleye fvaleye requested a review from houqp August 24, 2021 20:34
@fvaleye fvaleye force-pushed the python/expose-filesystem-pyarrow-read branch 2 times, most recently from ffaa3bb to 7c1b07c Compare August 24, 2021 20:59
@fvaleye fvaleye force-pushed the python/expose-filesystem-pyarrow-read branch from 7c1b07c to b9dda8d Compare August 24, 2021 21:47
houqp
houqp previously approved these changes Aug 28, 2021
Member

@houqp houqp left a comment

looks like there is a code conflict

python/deltalake/table.py (outdated, resolved)
@fvaleye fvaleye force-pushed the python/expose-filesystem-pyarrow-read branch from 2816ab8 to 7d20d7d Compare August 28, 2021 07:55
@fvaleye fvaleye enabled auto-merge (squash) August 28, 2021 07:57
@fvaleye fvaleye requested a review from houqp August 28, 2021 07:57
@fvaleye fvaleye merged commit 385fb64 into delta-io:main Aug 28, 2021
@mattc-eostar

I am not sure this addresses the specific issue that ADLS needs the container name prepended to the file names used with the filesystem object, as described in the related issue (#392).

@prasadvaze

@mattc-eostar is correct. We still need to prepend the container name to the file names, so the issue is still open. @fvaleye

In the meantime, I tried using blobfuse instead of adlfs to connect to and mount ADLS on Ubuntu. The performance is much better (2 sec versus 6 sec to query a Delta table of ~50k rows), and it does not require to_pyarrow_dataset().
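
For readers following the container-prefix point raised in the two comments above, here is a minimal sketch of what "prepending the container name to the file names" could look like. It uses only the Python standard library. The helper name `prepend_container`, the container name `my-container`, and the sample paths are all hypothetical and are not part of deltalake or adlfs.

```python
import posixpath


def prepend_container(container, file_paths):
    """Return file_paths with the container name prepended to each one.

    Hypothetical helper for illustration; not part of deltalake or adlfs.
    """
    prefixed = []
    for path in file_paths:
        # Strip a leading slash so posixpath.join does not discard the container.
        prefixed.append(posixpath.join(container, path.lstrip("/")))
    return prefixed


# Illustrative relative paths, shaped like the data files a Delta table lists.
paths = ["part-00000.parquet", "year=2021/part-00001.parquet"]
prefixed_paths = prepend_container("my-container", paths)
```

Assuming the filesystem object addresses files as `container/path`, as the comments describe, the prefixed paths are the form it would expect. Whether the table's own directory inside the container also has to be included depends on how the table was opened, which this sketch does not cover.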

Labels
binding/python Issues for the Python package
4 participants