Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple different snapshot functionality #754

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rotu
Copy link

@rotu rotu commented Jun 3, 2023

Decouple different snapshot functionality

Description

e.g. The plugin architecture is defined with multiple subparts. This plugin allows implementing those separately - e.g. you can use a different serializer from comparator.

Related Issues

Checklist

  • This PR has sufficient documentation.
  • This PR has sufficient test coverage.
  • This PR title satisfies semantic convention.

Additional Comments

No additional comments.

…ifferent parts of the extension interface separately.
@rotu
Copy link
Author

rotu commented Jun 3, 2023

This might be a very bad idea, but it also seems like it might be in line with the original plugin architecture intent. Curious your thoughts.

@noahnu
Copy link
Collaborator

noahnu commented Jun 5, 2023

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)

from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

@rotu
Copy link
Author

rotu commented Jun 5, 2023

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)

from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

I think you're right but it's very confusing to me how the library is set up. It seems like there are two broad philosophies to plugins:

  1. Inherit from the base thingy and override certain functions with a new implementation.
  2. Inject new implementation into well-defined places for hooks.

Inheriting from an extension and overriding seems weird to me because it's doing both.


That said, there's a lot I don't understand about this codebase which prevents me from being confident in this PR:

  • what "amber" means in "AmberSnapshot" and why it's used instead of pickle (I'm guessing because pickle is not text-based).
  • why snapshot takes extension_class instead of extension
  • why some extension methods like get_snapshot_name, write_snapshot are class methods instead of instance methods
  • why SnapshotAssertion mutates itself in _post_assert.

@noahnu
Copy link
Collaborator

noahnu commented Jun 5, 2023

Inheriting from an extension and overriding seems weird to me because it's doing both.

I agree with that. There's definitely room to improve here. The codebase has already gone through several refactors, though it's pretty stable right now so I'd be hesitant to change anything too drastic without good reason. We also want to reduce the number of breaking changes if we can help it.

what "amber" means in "AmberSnapshot" and why it's used instead of pickle (I'm guessing because pickle is not text-based).

Amber is a custom format we built intended to reduce the size of diffs and be easy to review in pull requests. We don't need the ability to deserialize, only serialize. The name is a play on words with the theme of "syrupy" that used to be more prevalent in the codebase. There was a strong Jurassic Park theme (syrupy = amber / fossils etc). We've expunged most of this from the codebase to improve readability.

why snapshot takes extension_class instead of extension

The extension defines an interface so to speak. It does not have any instance state.

why some extension methods like get_snapshot_name, write_snapshot are class methods instead of instance methods

There was some recent refactoring to support pytest-xdist. Moving off of instances made it a lot simpler to reason about. Helps with performance as well. We cache some function calls using the extension class type. Not something we could easily do if there was instance data we had to consider.

why SnapshotAssertion mutates itself in _post_assert.

It lets you customize specific assertions. The post_assert resets the snapshot fixture in between assertions. Lets you do things like:

def test_case(snapshot):
    assert x == snapshot(exclude=props("uuid"))
    assert y == snapshot(exclude=props("other_prop"))

@rotu
Copy link
Author

rotu commented Jun 6, 2023

why snapshot takes extension_class instead of extension
The extension defines an interface so to speak. It does not have any instance state.

True, though that's beside the point. One place where this is awkward is with SingleFileSnapshotExtension which must be subclassed in order to use TEXT mode. If you passed an extension object, you could add it as an argument to the constructor, e.g. something like snapshot(extension=SingleFileSnapshotExtension(text=True)).

There was some recent refactoring to support pytest-xdist. Moving off of instances made it a lot simpler to reason about. Helps with performance as well. We cache some function calls using the extension class type. Not something we could easily do if there was instance data we had to consider.

This makes sense. Something I'll have to look into.

why SnapshotAssertion mutates itself in _post_assert.

It lets you customize specific assertions. The post_assert resets the snapshot fixture in between assertions. Lets you do things like:

 def test_case(snapshot):
     assert x == snapshot(exclude=props("uuid"))
     assert y == snapshot(exclude=props("other_prop"))

I see that. The more obvious way to do this seems to be to have SnapshotAssertion.__call__ return a new object (similar to how use_extension creates and returns a new SnapshotAssertion). One consequence of the current setup is it's not at all clear what the following does:

def test_case(snapshot):
   s = snapshot(exclude=props("a"))
   assert x == s
   assert y == s

It might make sense to use dataclasses.replace to implement both __call__ and use_extension.

@noahnu
Copy link
Collaborator

noahnu commented Sep 27, 2023

Just posting to let you know I haven't forgotten about this. I do agree there's plenty of opportunity to simplify the architecture. I'd prefer to try more incremental changes in advance of any larger public API change (to reduce breaking changes and adoption pains).

I'm not sure I have too much time over the next few months to support large scale changes though.

FWIW, a lot of the initial implementation was based on attrs before dataclasses, and when we moved to dataclasses we only did the bare minimum so it was functional. Leaning on newer patterns to simplify the code and make it easier to contribute to is definitely something I'm interested in.

And if you're curious, you were 100% right about the initial design for the extensions/plugins being closer to the snapshot(storage=..., serializer=...) model (if you look at some of the earlier commits in this repo, you'll see something a bit closer to that).

@StephanMeijer
Copy link

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)

from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

Only .encode() at the end sees to be missing. Only issue: the Amber Extension gives proper diffs. How to get those back?

@StephanMeijer
Copy link

The following snippet works beautifully:

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    _file_extension = 'ambr'
    _write_mode = WriteMode.TEXT

    def serialize(self, data, **kwargs):
        return AmberDataSerializer.serialize(data, **kwargs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants