Python: modification time in add actions table is wrong #1124

Closed
wjones127 opened this issue Feb 5, 2023 · 2 comments · Fixed by #1133
Labels
bug (Something isn't working), good first issue (Good for newcomers)

Comments

@wjones127
Collaborator

wjones127 commented Feb 5, 2023

Environment

Delta-rs version: 0.7.0 Python

Binding: Python

Environment:

  • Cloud provider:
  • OS: MacOS M1
  • Other:

Bug

What happened:

What you expected to happen:

How to reproduce it:

import pandas as pd
from deltalake import DeltaTable, write_deltalake

example_df = pd.DataFrame({
    "part": ["a", "a", "b", "b"],
    "value": [1, 2, 3, 4]
})

write_deltalake(
    "example_table",
    example_df,
    partition_by=["part"],
    mode="overwrite"
)

table = DeltaTable("example_table")
table.get_add_actions(flatten=True).column("modification_time")

Outputs:

<pyarrow.lib.TimestampArray object at 0x137e994e0>
[
  1970-01-20 09:27:03.636,
  1970-01-20 09:27:03.636
]

More details:

We are likely misinterpreting the timestamp resolution somewhere along the way.
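The 1970 dates in the output are what an epoch-seconds value looks like when it is read as milliseconds. A minimal sketch of that arithmetic, assuming the raw integer below (it is back-computed from the output shown, not read from the table's log) and using hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_as_milliseconds(value: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch."""
    return EPOCH + timedelta(milliseconds=value)


def read_as_seconds(value: int) -> datetime:
    """Interpret an integer as seconds since the Unix epoch."""
    return EPOCH + timedelta(seconds=value)


# Assumption: this is the integer behind "1970-01-20 09:27:03.636",
# reconstructed from the printed output rather than taken from the log.
raw = 1_675_623_636

misread = read_as_milliseconds(raw)
intended = read_as_seconds(raw)

print(misread.isoformat())   # 1970-01-20T09:27:03.636000+00:00
print(intended.isoformat())  # 2023-02-05T19:00:36+00:00
```

Read as seconds, the same integer lands on the day this issue was opened, which is consistent with a unit mismatch rather than a corrupt value.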

wjones127 added the bug and good first issue labels Feb 5, 2023
@guyrt
Contributor

guyrt commented Feb 7, 2023

Can I take this? Probably get a PR in by end of weekend.

The issue is in the data write. Note the precision of the add timestamps (10 digits, seconds) vs the remove and commitInfo timestamps (13 digits, milliseconds).

{"add":{"modificationTime":1675743351}}
{"add":{"modificationTime":1675743351}}
{"remove":{"deletionTimestamp":1675743351371, ...
{"remove":{"deletionTimestamp":1675743351371, ...
{"commitInfo":{"operationParameters":{"mode":"Overwrite","partitionBy":"[\"part\"]"},"timestamp":1675743351373}}
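One rough way to spot this kind of mismatch is to scan the log entries and guess each timestamp's unit from its magnitude. The sketch below is only a heuristic under stated assumptions: the `10**11` cutoff, the set of key names, and the helper names are all hypothetical, and the log lines are trimmed to just the fields quoted above.

```python
import json

# Assumption: values below this bound are epoch seconds. Read as milliseconds,
# 10**11 falls in early 1973; read as seconds, it is thousands of years away.
SECONDS_UPPER_BOUND = 10**11

# Assumption: only the timestamp keys quoted in this thread are checked.
TIMESTAMP_KEYS = {"modificationTime", "deletionTimestamp", "timestamp"}


def guess_unit(value: int) -> str:
    """Guess whether an epoch timestamp is in seconds or milliseconds."""
    return "s" if value < SECONDS_UPPER_BOUND else "ms"


def find_timestamps(node, path=""):
    """Yield (dotted_path, value) for each known timestamp key in nested dicts."""
    if isinstance(node, dict):
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            if key in TIMESTAMP_KEYS and isinstance(child, int):
                yield child_path, child
            else:
                yield from find_timestamps(child, child_path)


# Trimmed versions of the log entries quoted in the comment above.
log_lines = [
    '{"add": {"modificationTime": 1675743351}}',
    '{"remove": {"deletionTimestamp": 1675743351371}}',
    '{"commitInfo": {"timestamp": 1675743351373}}',
]

units = {}
for line in log_lines:
    for path, value in find_timestamps(json.loads(line)):
        units[path] = guess_unit(value)
        print(f"{path}: {value} ({units[path]})")
```

On these three entries only the add action's `modificationTime` is classified as seconds, which matches the observation that the write path is where the precision is lost.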

@wjones127
Collaborator Author

Please do!

guyrt pushed a commit to guyrt/delta-rs that referenced this issue Feb 7, 2023
wjones127 pushed a commit that referenced this issue Feb 11, 2023
# Description
Updates timestamp for AddActions to use milliseconds.

# Related Issue(s)
- closes #1124 

---------

Co-authored-by: Tommy Guy <riguy@microsoft.com>
chitralverma pushed a commit to chitralverma/delta-rs that referenced this issue Mar 17, 2023