Delta-RS Saved Delta Table not properly ingested into Databricks #2779

Closed
niltecedu opened this issue Aug 15, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@niltecedu

Environment

Delta-rs version:
0.18.2

Binding:
Not sure

Environment:

  • Cloud provider:
    Azure Data Lake Gen2

  • OS:
    Windows 11

  • Other:


Bug

What happened:
Ingesting a Delta table that was written to Azure Data Lake with the deltalake package into Azure Databricks does not work properly: Databricks loads all of the Parquet files instead of reading the data as a Delta table.

What you expected to happen:
Only the latest version of the delta table is ingested
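To illustrate the difference between the two behaviours, here is a minimal sketch using only the Python standard library. It assumes the simplified case of a table whose `_delta_log` contains only JSON commit files (no checkpoints) and whose `add`/`remove` actions carry just a `path` field; real commits carry more fields. The helper names (`write_commit`, `active_files`, `all_parquet_files`, `build_demo_table`) and the file names are hypothetical and are not part of deltalake or Databricks.

```python
import json
import tempfile
from pathlib import Path


def write_commit(table_dir: Path, version: int, actions: list) -> None:
    """Write one commit file: newline-delimited JSON, one action per line."""
    log_dir = table_dir / "_delta_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    # Commit files are named with a zero-padded 20-digit version number.
    commit = log_dir / f"{version:020d}.json"
    commit.write_text("\n".join(json.dumps(a) for a in actions) + "\n")


def active_files(table_dir: Path) -> set:
    """Replay add/remove actions in version order to find the live data files."""
    live = set()
    for commit in sorted((table_dir / "_delta_log").glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


def all_parquet_files(table_dir: Path) -> set:
    """What a plain Parquet reader sees: every data file, old or new."""
    return {p.name for p in table_dir.glob("*.parquet")}


def build_demo_table() -> Path:
    """Simulate two overwrites; the files are empty placeholders, not real Parquet."""
    table_dir = Path(tempfile.mkdtemp()) / "demo_table"
    table_dir.mkdir()
    for name in ("part-0.parquet", "part-1.parquet"):
        (table_dir / name).write_bytes(b"")
    write_commit(table_dir, 0, [{"add": {"path": "part-0.parquet"}}])
    write_commit(
        table_dir,
        1,
        [{"remove": {"path": "part-0.parquet"}}, {"add": {"path": "part-1.parquet"}}],
    )
    return table_dir
```

In this sketch, `active_files(build_demo_table())` contains only the file added by the last overwrite, while `all_parquet_files` still returns both files. That matches the symptom described here: rows from older versions show up when the directory is read as plain Parquet.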

How to reproduce it:
Create a pandas DataFrame and write it to Azure multiple times using overwrite mode.

More details:
I am also opening a ticket with Databricks to resolve the issue.

@niltecedu niltecedu added the bug Something isn't working label Aug 15, 2024
@ion-elgreco
Collaborator

We need more info, because this sounds like you are not reading the table correctly with Spark.

@niltecedu
Author

Heya, I am reading the table via Databricks SQL. Loading it locally with the deltatable package and pandas:

image

Loading it in Databricks SQL:

image

Loading it in Databricks SQL with DISTINCT:

image

Databricks SQL also keeps the older versions of the file, which have different rows.

@niltecedu
Author

Never mind, that was me being a knobhead. I saved to Azure Data Lake with .parquet in the blob name.

Fixing it gives me this for ingestion (I am quite new to Databricks):

image

Closing this now, thanks!
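For anyone hitting the same thing, a small guard along these lines could flag the mistake before writing. This is only a sketch: `has_parquet_suffix` is a hypothetical helper, the URIs are made-up examples, and the thread does not establish exactly why the `.parquet` name caused the table to be read as plain Parquet, only that removing it fixed the ingestion.

```python
def has_parquet_suffix(table_uri: str) -> bool:
    """Return True if the last path segment of the table location ends in .parquet."""
    # A Delta table location is a directory, so a file-like suffix is suspicious.
    last_segment = table_uri.rstrip("/").rsplit("/", 1)[-1]
    return last_segment.lower().endswith(".parquet")


# Hypothetical locations, for illustration only.
suspicious = "abfss://container@account.dfs.core.windows.net/tables/sales.parquet"
expected = "abfss://container@account.dfs.core.windows.net/tables/sales"

if has_parquet_suffix(suspicious):
    print(f"Warning: {suspicious} looks like a Parquet file, not a Delta table directory")
```

Calling the check on the location before the write makes the naming slip visible early, instead of surfacing later as duplicate or stale rows on the reading side.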
