Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional borsh serialization/deserialization #1366

Closed
wants to merge 1 commit into from

Conversation

Fuuzetsu
Copy link

No description provided.

Copy link

codecov bot commented Nov 24, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (5aaf742) 92.99% compared to head (5ea75b3) 92.93%.

Files Patch % Lines
src/datetime/mod.rs 0.00% 1 Missing ⚠️
src/month.rs 0.00% 1 Missing ⚠️
src/naive/date.rs 0.00% 1 Missing ⚠️
src/naive/datetime/mod.rs 0.00% 1 Missing ⚠️
src/naive/isoweek.rs 0.00% 1 Missing ⚠️
src/naive/time/mod.rs 0.00% 1 Missing ⚠️
src/offset/fixed.rs 0.00% 1 Missing ⚠️
src/offset/local/mod.rs 0.00% 1 Missing ⚠️
src/offset/utc.rs 0.00% 1 Missing ⚠️
src/time_delta.rs 0.00% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1366      +/-   ##
==========================================
- Coverage   92.99%   92.93%   -0.07%     
==========================================
  Files          34       34              
  Lines       16637    16648      +11     
==========================================
  Hits        15472    15472              
- Misses       1165     1176      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@djc
Copy link
Member

djc commented Nov 24, 2023

Sorry, I am not interested in supporting yet another (de)serialization protocol at this time.

@Fuuzetsu
Copy link
Author

@djc Do you have an alternative you suggest I do? I see borsh crate itself supports other crate serialisation behind feature flags so that's an option: much more hassle/less efficient because there's not access to any internals of course.

It's a bit sad if that's the way it has to be considering the addition is just (in my eyes) few derives...

@Fuuzetsu
Copy link
Author

@djc Do you have an alternative you suggest I do? I see borsh crate itself supports other crate serialisation behind feature flags so that's an option: much more hassle/less efficient because there's not access to any internals of course.

It's a bit sad if that's the way it has to be considering the addition is just (in my eyes) few derives...

If this is the case, should I open a ticket ala "remove all serialisation support and expose internals in some "internal" module to enable reimplementation by arbitrary crates"? This is a somewhat popular approach in the Haskell world

I'm not trying to be malicious here, just trying to figure out how to proceed.

@djc
Copy link
Member

djc commented Nov 26, 2023

I'm open to discussing changes that would be needed to enable you to implement this support outside the crate. Please make a minimal but comprehensive list.

@Fuuzetsu
Copy link
Author

I'm open to discussing changes that would be needed to enable you to implement this support outside the crate. Please make a minimal but comprehensive list.

So far, I'm seeing:

  • It's inefficient to serialise NaiveDate due to invisible DateImpl. One can round-trip via num_days_from_ce and from_num_days_from_ce_opt to keep the size 4 bytes but it costs bunch of computation.

  • It's very difficult to do anything with IsoWeek, notably creating it. I guess one could try to make some bogus NaiveDate using it with a bogus Weekday, serialise that, then deserialise NaiveDate and get IsoWeek out again. But it's inefficient and also error prone.

  • It's surprisingly difficult to serialise Duration. I think one can cheat and serialise via std::duration::Duration, keeping track if time was negative. Then deserialise as std::duration::Duration, use from_std to get back chrono::Duration and then .neg() it depending on the serialised flag. But it's inefficient and forces some bogus error handling for to_std/from_std.

I think those are the main pain points. There are more types (for example FixedOffset) that currently have rkyv implementation and a borsh implementation in this pull request but are difficult to round-trip outside of the crate: but I think these are seldom serialised anyway because they seem to be missing serde impls as far as I can tell. So I'm ignoring these for now.

I'm not sure what an elegant Rust solution is to expose these things. Maybe a trait that's not imported by default that has methods to round-trip via DateImpl (in case of NaiveDate)?

Sadly, all of these seem more annoying to maintain, not less...

I attach beginnings of a diff for support of chrono directly inside borsh crate, just for posterity.

borsh.diff.txt

@pitdicker
Copy link
Collaborator

Sorry to jump into an old discussion.

So far, I'm seeing:

  • It's inefficient to serialise NaiveDate due to invisible DateImpl. One can round-trip via num_days_from_ce and from_num_days_from_ce_opt to keep the size 4 bytes but it costs bunch of computation.

That is quite inefficient! The closest representation is getting the values with Datelike::year() and Datelike::ordinal. But that still leaves recomputing the year flags after deserialization. Exposing them exposes what is currently an implementation detail.

It does not help for serialization, but #1212 has the goal of removing the DateImpl and Of abstractions from NaiveDate.

  • It's very difficult to do anything with IsoWeek, notably creating it. I guess one could try to make some bogus NaiveDate using it with a bogus Weekday, serialise that, then deserialise NaiveDate and get IsoWeek out again. But it's inefficient and also error prone.

Again, good point. IsoWeek has this comment:

// internal use only. we don't expose the public constructor for `IsoWeek` for now,
// because the year range for the week date and the calendar date do not match and
// it is confusing to have a date that is out of range in one and not in another.
// currently we sidestep this issue by making `IsoWeek` fully dependent of `Datelike`.

I suppose it can get a constructor. Like NaiveDate it also stores year flags that would need to get recomputed on deserialization.

  • It's surprisingly difficult to serialise Duration. I think one can cheat and serialise via std::duration::Duration, keeping track if time was negative. Then deserialise as std::duration::Duration, use from_std to get back chrono::Duration and then .neg() it depending on the serialised flag. But it's inefficient and forces some bogus error handling for to_std/from_std.

We will get Duration::new with #1337 and have recently added Duration::subsec_nanos. So at least Duration should be easy with the next release.

@pitdicker
Copy link
Collaborator

pitdicker commented Jan 30, 2024

It's a bit sad if that's the way it has to be considering the addition is just (in my eyes) few derives...

Supporting serde and rkyv is already proving quite a lot of work to be honest.


Maybe we want to expose a method to get all values encoded in a NaiveDate? Year, ordinal, leap year flag, and the last weekday of the preceding year. Well, only the last value is not yet exposed though other methods.

Then we would only need an unsafe method to recreate the NaiveDate from its four components. Unsafe is appropriate, as correctness of the new DateTime depends on the user keeping the values in sync.

NaiveTime can be serialized and deserialized with Timelike::num_seconds_from_midnight and NaiveTime::from_num_seconds_from_midnight_opt.

NaiveDateTime is cheap to combine and split into NaiveDate and NaiveTime.

DateTime can be deserialized with DateTime::from_naive_utc_and_offset, and serialized with DateTime::naive_utc and DateTime::offset. The offset is a generic and out of our control. But Offset::fix and FixedOffset::east_opt should cover most uses I think.

@pitdicker
Copy link
Collaborator

With borsh as a format for 'security-critical projects' you may want to validate/recalculate the year flags anyway?

Then I think the main missing item is an initializer for IsoWeek. But IsoWeek is not what I would call a core type of chrono. If you would leave out support for it I don't thank it is going to be a problem.

@Fuuzetsu
Copy link
Author

Supporting serde and rkyv is already proving quite a lot of work to be honest.

Is it? At a cursory glance, it looks like just automatic derives behind a feature flag. Same for the changes in this PR. Maybe there's some work I'm not seeing but this actually seems like the least amount of work possible for in-direct support and any other solution seems like more work, not less... At the moment we just use a chrono fork for our borsh experiment branch but it'd be nice if it Just Worked™ of course.

@pitdicker
Copy link
Collaborator

pitdicker commented Jan 31, 2024

Supporting serde and rkyv is already proving quite a lot of work to be honest.

Is it? At a cursory glance, it looks like just automatic derives behind a feature flag.

With our release last week the number of features chrono has to support rkyv increased to 5 (rkyv, kryv-16, rkyv-32, rkyv-64, rkyv-validation). And one of the changes broke our docs.rs build for 0.4.32. So far 8 PRs for rkyv: https://github.com/chronotope/chrono/pulls?q=is%3Apr+rkyv

For serde it seems there always is another format that needs custom (de)serialization to a date or time. We have a whole bunch of issues and PRs.
And on the 0.4 branch we are stuck with supporting rustc-serialize as a third library .

Maybe borsh will prove to be different and really only need 'automatic derives behind a feature flag', but I agree with @djc that at this time we need to use our time differently.

What did you think about (de)serializing dates with year and ordinal values and the other suggested methods? Serializing is then just a couple of bit-wise masks and shifts.

@djc
Copy link
Member

djc commented Jan 31, 2024

Going to close this to make it clear that direct inclusion of borsh support is not going to happen at this time.

Happy to discuss more about extra API that we might provide to make it easier to do this out of crate, though!

@djc djc closed this Jan 31, 2024
@Fuuzetsu
Copy link
Author

What did you think about (de)serializing dates with year and ordinal values and the other suggested methods? Serializing is then just a couple of bit-wise masks and shifts.

Sure, that's certainly better than the status quo. Though presumably the traits will need to go into borsh directly behind a feature flag or something, which might be OK, I think few crates are supported like that already.

@pitdicker
Copy link
Collaborator

Fell free to cc me on a PR to borsh.

@djc
Copy link
Member

djc commented Feb 1, 2024

Sure, that's certainly better than the status quo. Though presumably the traits will need to go into borsh directly behind a feature flag or something, which might be OK, I think few crates are supported like that already.

Another common option is an integration crate like borsh-chrono. For example, this is what we do with Askama and also for things like tokio-postgres-rustls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants