feat: add 'millisecond' option to ser_json_timedelta config parameter #1427

ollz272 · 2024-08-30T07:38:26Z

Change Summary

Adds a millisecond option to ser_json_timedelta, which returns the number of milliseconds in the timedelta.

Note: a corresponding PR will be needed in pydantic.

Related issue number

pydantic/pydantic#10256

Checklist

Unit tests for the changes exist
Documentation reflects the changes where applicable
Pydantic tests pass with this pydantic-core (except for expected changes)
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @sydney-runkle

ollz272 · 2024-08-30T07:41:24Z

src/serializers/config.rs

+                // convert to int via a py timedelta not duration since we know this this case the input would have
+                // been a py timedelta
+                let py_timedelta = either_delta.try_into_py(py)?;
+                let seconds: f64 = Self::total_seconds(&py_timedelta)?.extract()?;


There might be a better way to do this which is eluding me, maybe we could do the multiplication in Python? 🤷🏻‍♂️

Seems reasonable enough - multiplication here should be faster.

Can we pull out some of the shared logic into a function (both for Float and for Millisecond) and repeat that across the various branches?

Hm, maybe, a couple of them use logic like:

let py_timedelta = either_delta.try_into_py(py)?;
let seconds: f64 = Self::total_seconds(&py_timedelta)?.extract()?;

However, the serializer then needs to wrap these in map_err calls, so the logic can't be reused there.

Maybe a Rust wizard could help with some clever refactoring I'm not seeing 🧙🏻

To do the multiplication in Python would be something like

let object = Self::total_seconds(&py_timedelta)?.mul(1000)?;

... this requires creating a Python integer 1000, which for best performance we might want to consider caching.

But there is another inefficiency here (and in the other cases) which is that the call to try_into_py creates a Python timedelta object which then gets thrown away immediately to convert to float. It will probably be better in all cases to use .to_duration(), which will avoid the temporary Python object in the case of a Duration Rust value.

The best option would be to go further and to add a .total_seconds() method to EitherTimedelta which extracts the fractional seconds from whatever state the EitherTimedelta is currently in (doing the most efficient thing for each case) and then doing the multiplication in Rust.

hmm, yeah neat suggestion on the EitherTimedelta

I guess we'd want something like this:

impl<'a> EitherTimedelta<'a> {
    ....
    pub fn total_seconds(&self, py: Python<'a>) -> f64 {
        match self {
            Self::Raw(timedelta) => ...
            Self::PyExact(py_timedelta) => ...
            Self::PySubclass(py_timedelta) => ...
        }
    }
}

And then we have two cases to deal with: Duration and PyDelta. It looks like we have some methods around that would get the py_timedelta into a Duration object, so then it's just Duration we need to handle. However (maybe I'm reading the docs wrong!) I can't see a method on there that returns the total seconds as a float.

Probably missed something here!

pub fn total_seconds(&self) -> f64 {
    match self {
        Self::Raw(timedelta) => ...,
        Self::PyExact(py_timedelta) => intern!(py_timedelta.py(), "total_seconds"))?.extract?
        Self::PySubclass(py_timedelta) => intern!(py_timedelta.py(), "total_seconds"))?.extract?
    }
}

Something like this? Not 100% sure what to do with the Raw case, but doing what we do currently and extracting from the python object by calling "total_seconds" on it feels like the best way in this case.

I suggest looking at the to_duration method to see how that gets days / seconds / microseconds out of the Python value, as that should have the most efficient implementation already set up to get the data out (and then you can do a bit of arithmetic). In general calling a Python method will be slow-ish, so there might be a faster step for PyExact case.

For Raw, you need to work with the speedate::Duration object; you can probably get a timestamp value and microseconds and combine those.
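As a rough illustration of the Raw case described above, the sketch below combines day, second, and microsecond components into fractional seconds. `RawDuration` and its fields are hypothetical stand-ins defined only for this example, not the real `speedate::Duration` API. The integer work is done in `i128` so that large day counts cannot overflow before the final conversion to float.

```rust
/// Hypothetical stand-in for a parsed duration: a sign plus non-negative
/// day / second / microsecond components. This is NOT `speedate::Duration`.
#[derive(Debug, Clone, Copy)]
pub struct RawDuration {
    pub positive: bool,
    pub day: u32,
    pub second: u32,      // 0 through 86_399
    pub microsecond: u32, // 0 through 999_999
}

impl RawDuration {
    /// Total length in seconds as a float.
    pub fn total_seconds(&self) -> f64 {
        // Combine the components as integers first, so the only rounding
        // happens in the final conversion and division.
        let whole_seconds = i128::from(self.day) * 86_400 + i128::from(self.second);
        let total_micros = whole_seconds * 1_000_000 + i128::from(self.microsecond);
        let sign = if self.positive { 1.0 } else { -1.0 };
        sign * (total_micros as f64) / 1_000_000.0
    }
}
```

Multiplying the result by 1,000 in Rust would then give milliseconds without creating any temporary Python object, which is the point of the suggestion.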

codecov · 2024-08-30T07:41:37Z

Codecov Report

Attention: Patch coverage is 46.83544% with 42 lines in your changes missing coverage. Please review.

Project coverage is 89.22%. Comparing base (ab503cb) to head (3e23511).
Report is 178 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/input/datetime.rs | 34.37% | 42 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1427      +/-   ##
==========================================
- Coverage   90.21%   89.22%   -1.00%     
==========================================
  Files         106      112       +6     
  Lines       16339    17765    +1426     
  Branches       36       41       +5     
==========================================
+ Hits        14740    15850    +1110     
- Misses       1592     1895     +303     
- Partials        7       20      +13

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| python/pydantic_core/core_schema.py | `94.76% <100.00%> (-0.01%)` | ⬇️ |
| src/serializers/config.rs | `94.25% <100.00%> (-0.20%)` | ⬇️ |
| src/serializers/infer.rs | `90.94% <100.00%> (-4.18%)` | ⬇️ |
| src/serializers/type_serializers/timedelta.rs | `100.00% <100.00%> (ø)` | |
| src/input/datetime.rs | `90.18% <34.37%> (-8.59%)` | ⬇️ |

... and 48 files with indirect coverage changes

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bc0c97a...3e23511.

ollz272 · 2024-08-30T07:43:58Z

please review

codspeed-hq · 2024-08-30T07:46:25Z

CodSpeed Performance Report

Merging #1427 will not alter performance

Comparing ollz272:add-millisecond (3e23511) with main (bc0c97a)

Summary

✅ 155 untouched benchmarks

ollz272 · 2024-09-04T10:56:36Z

Hi! Any timeline on when this could get looked at? Would be a really valuable feature for us. Thanks!

sydney-runkle · 2024-09-06T13:02:42Z

> Hi! Any timeline on when this could get looked at? Would be a really valuable feature for us. Thanks!

Absolutely - will review tomorrow :).

sydney-runkle · 2024-09-10T14:57:07Z

> Absolutely - will review tomorrow :).

Gah -- got behind on this with travel. Reviewing today!

sydney-runkle

Looks reasonable overall - left a couple of quick notes.

Let's get a second review from @davidhewitt re the best way to do the multiplication - he's our in house rust guru.

tests/serializers/test_any.py

sydney-runkle · 2024-09-10T19:48:43Z

src/serializers/config.rs

+                // convert to int via a py timedelta not duration since we know this this case the input would have
+                // been a py timedelta
+                let py_timedelta = either_delta.try_into_py(py)?;
+                let seconds: f64 = Self::total_seconds(&py_timedelta)?.extract()?;


Seems reasonable enough - multiplication here should be faster.

Can we pull out some of the shared logic into a function (both for Float and for Millisecond) and repeat that across the various branches?

ollz272 · 2024-09-11T16:45:58Z

@sydney-runkle I've changed the code here to better reflect the discussion in pydantic/pydantic#10293 (comment). On the refactoring, I'm struggling to see a nice way to pull things out, but I'm happy to apply any suggestions you or the team may have :)

ollz272 · 2024-09-11T16:48:34Z

Also happy to add the other modes here if you'd like, wouldn't be too much effort

ollz272 · 2024-09-11T16:48:45Z

please review

ollz272

This has become a bit more involved than I first thought, but I'm enjoying it! I'm sure there will be some comments on total_seconds, and maybe we also want a total_milliseconds function, but regardless I think this is in a good state now.

ollz272 · 2024-09-12T18:58:00Z

python/pydantic_core/_pydantic_core.pyi

@@ -356,7 +356,7 @@ def to_json(
    by_alias: bool = True,
    exclude_none: bool = False,
    round_trip: bool = False,
-    timedelta_mode: Literal['iso8601', 'float'] = 'iso8601',
+    timedelta_mode: Literal['iso8601', 'seconds_float', 'milliseconds_float'] = 'iso8601',


@sydney-runkle as requested in the pydantic PR, I've removed float from the type hints in various places.
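To illustrate the narrowed set of accepted values, here is a small sketch of parsing the mode string on the Rust side. `TimedeltaModeSketch` is a hypothetical type written for this example and is not taken from the pydantic-core source; only the three string values come from the type hint above.

```rust
use std::str::FromStr;

/// Hypothetical enum for illustration; one variant per accepted mode string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimedeltaModeSketch {
    Iso8601,
    SecondsFloat,
    MillisecondsFloat,
}

impl FromStr for TimedeltaModeSketch {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "iso8601" => Ok(Self::Iso8601),
            "seconds_float" => Ok(Self::SecondsFloat),
            "milliseconds_float" => Ok(Self::MillisecondsFloat),
            // The bare 'float' spelling is rejected in this sketch, matching
            // its removal from the type hint.
            other => Err(format!("invalid timedelta mode: {other}")),
        }
    }
}
```

With this shape, `"float".parse::<TimedeltaModeSketch>()` returns an error, so any backwards-compatible handling of the old name would have to live in a layer above.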

src/input/datetime.rs

src/serializers/config.rs

src/serializers/infer.rs

tests/serializers/test_any.py

ollz272 · 2024-09-12T19:07:15Z

Pydantic integration is failing, though I guess this is expected since we're changing some behaviour here with a deprecation.

davidhewitt

Overall looks good. There's a bit of precision being lost which I think we might be able to mitigate, see my other comment.

src/input/datetime.rs

src/serializers/config.rs

tests/serializers/test_any.py

davidhewitt · 2024-09-12T22:33:33Z

RE pydantic integration, I guess if we intend to remove the setting from pydantic-core completely and keep it as a pydantic level thing, then yes the failing test is expected.

ollz272 · 2024-09-13T05:35:46Z

tests/serializers/test_any.py

+        (timedelta(microseconds=1), 0.001, b'0.001', {'0.001': 'foo'}, b'{"0.001":"foo"}', 'milliseconds_float'),
+        (
+            timedelta(microseconds=-1),
+            -0.0009999999999763531,


This test (and the one at https://github.com/pydantic/pydantic-core/pull/1427/files#diff-83a0ae6e1d65a0cd2ffcef088f3e07274f2e38a33e89b77c6bcffe137c3e1ddaR264) still seems to have a floating-point error; both are the single negative microsecond case. Curious if you have any insight on this!
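A likely explanation, sketched below, is Python's normalization of negative timedeltas: `timedelta(microseconds=-1)` is stored as days=-1, seconds=86399, microseconds=999999, so a float-only conversion subtracts two nearly equal numbers and exposes the rounding error in 999.999. The helper names `normalize` and `millis_float` are made up for this illustration, and `millis_float` only mirrors the float-only formula that appears in this PR's review of src/input/datetime.rs.

```rust
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Split a signed microsecond count the way Python normalizes a timedelta:
/// days may be negative, seconds and microseconds are always non-negative.
pub fn normalize(total_micros: i64) -> (i64, i64, i64) {
    let days = total_micros.div_euclid(MICROS_PER_DAY);
    let rest = total_micros.rem_euclid(MICROS_PER_DAY);
    (days, rest / 1_000_000, rest % 1_000_000)
}

/// Float-only conversion to milliseconds (illustrative copy of the formula).
pub fn millis_float(days: i64, seconds: i64, microseconds: i64) -> f64 {
    // For -1 microsecond the first two terms sum to exactly -1000.0, and the
    // third is the nearest double to 999.999, so the result is not -0.001.
    86_400_000.0 * days as f64 + seconds as f64 * 1_000.0 + microseconds as f64 / 1_000.0
}
```

The positive case does not show the problem because days and seconds are both zero, leaving only the single division `1.0 / 1000.0`.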

ollz272 · 2024-09-17T10:18:52Z

please review

davidhewitt

Sorry for the very late reply; combination of leave and sickness.

I think I've worked out a possible solution & have posted below.

src/input/datetime.rs

davidhewitt · 2024-09-17T15:12:34Z

src/input/datetime.rs

+                let days: f64 = f64::from(py_timedelta.get_days()); // -999999999 to 999999999
+                let seconds: f64 = f64::from(py_timedelta.get_seconds()); // 0 through 86399
+                let microseconds: f64 = f64::from(py_timedelta.get_microseconds()); // 0 through 999999
+                Ok(86_400_000.0 * days + seconds * 1_000.0 + microseconds / 1_000.0)


I think to keep full precision, we need to try to keep full microsecond precision as integer arithmetic and do floating point cast at the last minute. If we work in i64 then we might overflow on the conversion from days & seconds to micros for large values of days, but in that case the precision on the microseconds won't matter much anyway.

So I get something like this:

Suggested change

-                let days: f64 = f64::from(py_timedelta.get_days()); // -999999999 to 999999999
-                let seconds: f64 = f64::from(py_timedelta.get_seconds()); // 0 through 86399
-                let microseconds: f64 = f64::from(py_timedelta.get_microseconds()); // 0 through 999999
-                Ok(86_400_000.0 * days + seconds * 1_000.0 + microseconds / 1_000.0)
+                let days: i64 = py_timedelta.get_days().into(); // -999999999 to 999999999
+                let seconds: i64 = py_timedelta.get_seconds().into(); // 0 through 86399
+                let microseconds = py_timedelta.get_microseconds(); // 0 through 999999
+                let days_seconds = (86_400 * days) + seconds;
+                if let Some(days_seconds_as_micros) = days_seconds.checked_mul(1_000_000) {
+                    let total_microseconds = days_seconds_as_micros + i64::from(microseconds);
+                    Ok(total_microseconds as f64 / 1_000.0)
+                } else {
+                    // Fall back to floating-point operations if the multiplication overflows
+                    let total_milliseconds = days_seconds as f64 * 1_000.0 + f64::from(microseconds) / 1_000.0;
+                    Ok(total_milliseconds)
+                }

... and I guess we can do similar for the other cases.
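One way to read "similar for the other cases" is the seconds conversion sketched below. The function name `total_seconds_from_parts` and its plain-integer signature are assumptions for illustration, not the code merged in this PR. The sketch also guards the final addition with `checked_add`, an extra precaution that is not part of the suggestion above.

```rust
/// Convert normalized timedelta parts to fractional seconds, keeping the
/// arithmetic in integers until the last step (hypothetical helper).
pub fn total_seconds_from_parts(days: i32, seconds: i32, microseconds: i32) -> f64 {
    let days_seconds = 86_400 * i64::from(days) + i64::from(seconds);
    // Exact microsecond total, or None if it does not fit in an i64.
    let exact_micros = days_seconds
        .checked_mul(1_000_000)
        .and_then(|micros| micros.checked_add(i64::from(microseconds)));
    match exact_micros {
        Some(total_microseconds) => total_microseconds as f64 / 1_000_000.0,
        // Fall back to floating-point operations for very large day counts,
        // where sub-microsecond rounding no longer matters much.
        None => days_seconds as f64 + f64::from(microseconds) / 1_000_000.0,
    }
}
```

For the normalized form of minus one microsecond (days=-1, seconds=86399, microseconds=999999), the integer total is exactly -1, so the only rounding left is the single division by 1,000,000.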

Co-authored-by: David Hewitt <mail@davidhewitt.dev>

ollz272 · 2024-09-17T18:10:08Z

> Sorry for the very late reply; combination of leave and sickness.
>
> I think I've worked out a possible solution & have posted below.

Nice!!! I've applied that change and altered it for the other cases, seems to have removed all the precision issues. Should be good now

davidhewitt

Fantastic, thanks for the many rounds of iteration here; this looks great to me!

cc @sydney-runkle if you are happy with moving the float option to be in pydantic only, then this is ready to merge👍

sydney-runkle

Great work here - thanks for sticking with this through some significant (api and code) changes.

Parametrized test looks great - thanks!

feat: add 'millisecond' option to ser_json_timedelta config parameter

0107915

ollz272 commented Aug 30, 2024

pydantic-hooky bot added the ready for review label Aug 30, 2024

pydantic-hooky bot assigned sydney-runkle Aug 30, 2024

ollz272 mentioned this pull request Sep 3, 2024

Add millisecond option to ConfigDict.ser_json_timedelta pydantic/pydantic#10293

Merged

5 tasks

Merge branch 'main' into add-millisecond

21be1a2

sydney-runkle requested changes Sep 10, 2024

pydantic-hooky bot added awaiting author revision and removed ready for review labels Sep 10, 2024

pydantic-hooky bot assigned ollz272 and unassigned sydney-runkle Sep 10, 2024

fix: add tests

537a484

ollz272 force-pushed the add-millisecond branch from 7a996db to 537a484 on September 11, 2024 06:08

ollz272 added 3 commits September 11, 2024 17:42

fix: add support seconds_float

f7a4008

fix: add support seconds_float

f5df4d4

fix: add support seconds_float

367da21

pydantic-hooky bot added ready for review and removed awaiting author revision labels Sep 11, 2024

pydantic-hooky bot assigned sydney-runkle and unassigned ollz272 Sep 11, 2024

ollz272 added 5 commits September 12, 2024 19:44

fix: full tests

c38fce0

fix: final bits, hopefully ok

abe32e6

fix: remove float support, will be handled by pydantic for deprecation

9241bd7

fix: some bits

61971a0

fix: remove a deprecation message

9aea492

ollz272 commented Sep 12, 2024

ollz272 requested review from davidhewitt and sydney-runkle September 12, 2024 19:06

Merge branch 'main' into add-millisecond

6d54bd8

davidhewitt reviewed Sep 12, 2024

ollz272 added 5 commits September 13, 2024 06:09

fix: remove redundant comment

bff8e3b

fix: remove type hint

00e68b7

fix: simplify using Davids suggestion

b4c86de

chore: Merge branch 'add-millisecond' of github.com:ollz272/pydantic-…

574a011

…core into add-millisecond

fix: some floating point bits

a3900f6

ollz272 commented Sep 13, 2024

ollz272 requested a review from davidhewitt September 13, 2024 08:37

davidhewitt reviewed Sep 17, 2024

ollz272 and others added 2 commits September 17, 2024 18:48

Update src/input/datetime.rs

fb80b83

Co-authored-by: David Hewitt <mail@davidhewitt.dev>

fix: fix precision issues using davids method

d313c90

Merge branch 'main' into add-millisecond

3e23511

ollz272 requested a review from davidhewitt September 18, 2024 05:13

davidhewitt approved these changes Sep 18, 2024

sydney-runkle approved these changes Sep 18, 2024

sydney-runkle merged commit e0b4c94 into pydantic:main Sep 18, 2024
29 of 30 checks passed

sydney-runkle added a commit that referenced this pull request Oct 25, 2024

Revert "feat: add 'millisecond' option to ser_json_timedelta config p…

c4c2ee2

…arameter (#1427)" This reverts commit e0b4c94.
