Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement optimized Serialize and Deserialize for Schedule #129

Merged
merged 6 commits into from
Oct 29, 2024

Conversation

LeoniePhiline
Copy link
Contributor

@LeoniePhiline LeoniePhiline commented Oct 28, 2024

Scope

This PR is meant to supersede #118.

Its goal is to implement serde's Serialize and Deserialize for Schedule.

This feature request has been mentioned in #67 (comment) but is not otherwise recorded in the project issues.

Commits

45115e1Add Serde compatiblity for type Schedule

This is a rebase of #118 by @max-huster onto the current master.

abdb525fix: impl Deserialize behind feature gate

This change fixes #118 by guarding impl Deserialize for Schedule behind the "serde" feature flag.

7520221feat: implement optimized Serialize and Deserialize for Schedule

This changeset improves upon #118 by making use of the recently merged #128:

Serialization

Serialization uses a direct reference into the Schedule::source.

Previously, the blanket implementation of ToString::to_string had been used, which unnecessarily allocated a string and leveraged format! machinery.

Deserialization

Deserialization suggests the Deserializer to provide an owned String to the visitor, as the Schedule will take ownership of the String after successful decoding.

Deserializers not incapable of providing owned Strings may pass a &str, which is decoded, then cloned into an owned String for storage in Schedule::source.

Public API changes

Beside implementing Serialize and Deserialize behind the "serde" feature flag, this changeset also adds a new public method Schedule::source(&self) -> &str.

This new public method is generally useful and therefore not guarded by the "serde" feature flag.

3cb0fa3feat(ci): run tests for default features and all features separately

The tests for #[cfg(feature = "serde")] had not been run in CI.

This change creates separate CI jobs for testing with default features and testing with all features, to spot incompatibilities, dead code and incorrect feature gates.

365409dstyle: normalize serde tests

This stylistic change prepares for the upcoming addition of end-to-end deserialized schedule event iterator tests by renaming the existing serde_test-based tests by adding an idicator that only the token representation is tested.

The changeset also lowercases failure messages, and renames variables to match crate conventions.

dede2c7test: e2e-test deserialized schedules with binary format

This changeset implements serde serialization and deserialization tests beyond the token representation tests provided by serde_test.

Schedules are serialized to a binary format, then deserialized and used to yield events following a given date and time. These events are compared to a limited set of restricted upcoming events.

The tests cover shorthand as well as ranged period cron schedule notation.

Postcard was chosen over serde_json and serde_yml as it is fast to compile, efficient to run, and because its serialization target format is not text based but binary. Default features are disabled.

Open questions

Solved: end-to-end tests with (binary) storage format added in dede2c7.

In #118 (review) you had asked for some unit tests for patterns beyond "* * * * * * *".

I wonder if this request still holds? Decoding of patterns is unit-tested elsewhere.

The serialization and deserialization unit tests appear to me to be covered by a successful serde_test::assert_tokens and an intentionally failing serde_test::assert_de_tokens_error.~~

If you would like to see further patterns used in the serde tests regardless, then I would ask you to point me to specifics regarding which kinds of patterns you would like to see tested. I would then push a fourth commit adding the requested tests.

Serialization:

Serialization uses a direct reference into the
`Schedule::source`.

Previously, the blanket implementation of
`ToString::to_string` had been used,
which unnecessarily allocated a string
and leveraged `format!` machinery.

Deserialization:

Deserialization suggests the `Deserializer`
to provide an owned `String` to the visitor,
as the `Schedule` will take ownership of the
`String` after successful decoding.

`Deserializer`s not incapable of providing
owned `String`s may pass a `&str`, which is
decoded, then cloned into an owned `String`
for storage in `Schedule::source`.

Public API changes:

Beside implementing `Serialize` and `Deserialize`
behind the `"serde"` feature flag,
this changeset also adds a new public method
`Schedule::source(&self) -> &str`.

This new public method is generally useful
and therefore not guarded by the `"serde"`
feature flag.
@zslayton
Copy link
Owner

This looks great!

In #118 (review) you had asked for some unit tests for patterns beyond "* * * * * * *".

I wonder if this request still holds? Decoding of patterns is unit-tested elsewhere.

The serialization and deserialization unit tests appear to me to be covered by a successful serde_test::assert_tokens and an intentionally failing serde_test::assert_de_tokens_error.
If you would like to see further patterns used in the serde tests regardless, then I would ask you to point me to specifics regarding which kinds of patterns you would like to see tested. I would then push a fourth commit adding the requested tests.

My concern on this front was that it would be possible for the serde impl to erroneously match part of the input string (for example, reading * * * * * * * as * * * * * *) and report success even though going on to use it would produce unexpected results. Having unit tests that used a specific starting .after(datetime) and confirming that the next() fire time was correct would demonstrate not only that the deserializer accepted the string, but that the whole process worked end-to-end as well. I agree that extensive testing of the decoding is not necessary. It's possible that serde_test covers the input matching concern. What do you think?

- normalize test names,
  indicating token verification
- [lowercase][C-GOOD-ERR] failure messages

[C-GOOD-ERR]: https://rust-lang.github.io/api-guidelines/interoperability.html?highlight=lowercase#error-types-are-meaningful-and-well-behaved-c-good-err
@LeoniePhiline
Copy link
Contributor Author

LeoniePhiline commented Oct 29, 2024

My concern on this front was that it would be possible for the serde impl to erroneously match part of the input string (for example, reading * * * * * * * as * * * * * *) and report success even though going on to use it would produce unexpected results.

You are raising a good point. While the current implementation would not allow this category of bug, this fact is purely an implementation detail.

Unit tests are - among other uses - meant to guard against breakage when refactoring code. The proper functioning of deserialized schedules should indeed be smoke-tested to some extent.

Having unit tests that used a specific starting .after(datetime) and confirming that the next() fire time was correct would demonstrate not only that the deserializer accepted the string, but that the whole process worked end-to-end as well. I agree that extensive testing of the decoding is not necessary.

In dede2c7 I have implemented an events iterator test on deserialized schedules created from two kinds of representations:

  • a shorthand notation
  • a full notation with periods

I have also updated the PR description to account for the two added commits.

It's possible that serde_test covers the input matching concern. What do you think?

I agree with you that this is not (at all) the case:

The two serde_test tests only cover the token representation used by serde internally when coordinating between the in-memory format (impl Serialize / impl Deserialize) and the storage format (impl Serializer / impl Deserializer).

However, for a full fidelity test, an actual storage format implementation is required.

I reviewed serde_json, serde_yml and postcard for their suitability and chose postcard, mostly for it implementing a binary format, which is less trivial than the two string-based formats. It is also relatively small, and fast to compile.

The storage format implementation is meant to be an optional dev-dependency enabled by the "serde" feature flag. However, optional dev-dependencies are not supported yet. A feature request is tracked at rust-lang/cargo#1596.

This changeset implements serde serialization
and deserialization tests beyond
the token representation tests provided
by `serde_test`.

Schedules are serialized to a binary format
([`postcard`][postcard]), then deserialized
and used to yield events following a given date
and time. These events are compared to a
limited set of restricted upcoming events.

The tests cover shorthand as well as
ranged period cron schedule notation.

[postcard]: https://docs.rs/postcard/
Copy link
Owner

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@zslayton zslayton merged commit c5d5589 into zslayton:master Oct 29, 2024
1 check passed
@LeoniePhiline LeoniePhiline mentioned this pull request Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants