Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add data_type and nullable to StructField hash (#2045) #2190

Merged
merged 6 commits into from
Feb 24, 2024

Conversation

sonhmai
Copy link
Contributor

@sonhmai sonhmai commented Feb 18, 2024

Description

Correct hashing of StructField and add tests.

  • Current: Two StructFields with the same name but different data_type or different nullable are considered to hash equivalently.
  • To be: name, data_type and nullability should match for a field to be considered equivalent.

Related Issue(s)

closes #2045

@sonhmai
Copy link
Contributor Author

sonhmai commented Feb 18, 2024

there seems to be an error unrelated to this PR

FAILED tests/test_convert_to_delta.py::test_convert_delta_with_partitioning - _internal.DeltaError: 
Generic error: The schema of partition columns must be provided to convert a Parquet table to a Delta table

@ion-elgreco
Copy link
Collaborator

@sonhmai not so sure about that, these failures weren't there before. Maybe some interaction somewhere else in the code?

@sonhmai
Copy link
Contributor Author

sonhmai commented Feb 21, 2024

@sonhmai not so sure about that, these failures weren't there before. Maybe some interaction somewhere else in the code?

yep. will definitely look into details.

@sonhmai
Copy link
Contributor Author

sonhmai commented Feb 22, 2024

@ion-elgreco fix pushed with local tests passed. can you plz let workflow run? Thanks.

@sonhmai
Copy link
Contributor Author

sonhmai commented Feb 22, 2024

@rtyler @ion-elgreco all checks passed!

@ion-elgreco ion-elgreco enabled auto-merge (squash) February 24, 2024 13:30
Copy link
Collaborator

@ion-elgreco ion-elgreco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sonhmai thanks for the fix! 😀

@ion-elgreco ion-elgreco merged commit bcf124c into delta-io:main Feb 24, 2024
20 checks passed
@sonhmai sonhmai deleted the fix/2045-structfield-comparison branch February 25, 2024 01:45
@@ -641,6 +643,7 @@ mod tests {
use super::*;
use serde_json;
use serde_json::json;
use std::hash::DefaultHasher;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incase anyone else runs into this - this raises the MSRV of running the tests to 1.76.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for pointing this out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hash for StructField should consider more than the name
4 participants