Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: revert Pydantic custom serializer #435

Merged
merged 6 commits into from
Jul 19, 2024
Merged

fix: revert Pydantic custom serializer #435

merged 6 commits into from
Jul 19, 2024

Conversation

korikuzma
Copy link
Contributor

  • The Pydantic custom serializer resulted in model_dump to only include minimal fields for ga4gh serialization. This caused issues in downstream apps that leverage FastAPI

* Only focused on ga4gh.core + Pydantic models
* The custom serializer resulted in `model_dump` to only include minimal fields for ga4gh serialization. This caused issues in downstream apps that leverage FastAPI
@korikuzma korikuzma added bug Something isn't working priority:high High priority labels Jul 17, 2024
@korikuzma korikuzma self-assigned this Jul 17, 2024
@korikuzma korikuzma requested review from a team as code owners July 17, 2024 17:08
@korikuzma
Copy link
Contributor Author

Oh, I forgot to check __hash__

Copy link
Member

@ahwagner ahwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess... 😞

It makes sense to not muck about with the "as json" return, though ideally we would be able to leverage the Rust code in pydantic core for serialization, and not need to introduce a new dependency.

That said, usability is secondary to performance, so I agree this is an improvement. 👍

Base automatically changed from cleanup to main July 17, 2024 17:25
@korikuzma korikuzma requested a review from ahwagner July 17, 2024 18:47
@korikuzma
Copy link
Contributor Author

@ahwagner I updated the hash. Lmk if this is what you think it should be?

@ahwagner
Copy link
Member

@korikuzma what you wrote is fine. __hash__ is helpful for comparison operations and indexing, but doesn't play a role in any of the GA4GH identifier methods as far as I can remember. That this now reflects the ga4gh_digest method takes a little more compute that what might be strictly necessary, but I doubt it will be meaningful in terms of regular usage. I think we can move forward as is.

@korikuzma
Copy link
Contributor Author

@ahwagner I can revert and use a different hash function, I was just basing off of the current implementation in main. @theferrit32 originally made the issue in #225 , so do you have any thoughts?

@theferrit32
Copy link
Contributor

theferrit32 commented Jul 18, 2024

The __hash__ function is used by python's set type.

@korikuzma I think this is enough for #225 as long as the only types that will be in arrays with uniqueItems: true in the schema are identifiable ones. Actually at the moment I'm not seeing any uniqueItems: true in the schemas. I think previously it was used in the Haplotype and Genotype classes.

@korikuzma
Copy link
Contributor Author

@theferrit32 ok, I will go ahead and merge and close #225

@korikuzma korikuzma merged commit 4b0fa9f into main Jul 19, 2024
6 checks passed
@korikuzma korikuzma deleted the serialize branch July 19, 2024 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority:high High priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants