Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#100 Propagate meta dict to Nested dataclasses #106

Closed
wants to merge 4 commits into from

Conversation

wanderrful
Copy link

Hi there,

This PR changes default behavior for Nested dataclasses so that the meta dict that we pass into desert.schema will be propagated to all child dataclasses.

I've also taken the liberty of adding a test to demonstrate that it works as intended.

If you feel that this should not be default behavior, we can make this functionality live behind a feature flag meta dict that we can branch off of.

Fixes #100

@codecov
Copy link

codecov bot commented Sep 17, 2020

Codecov Report

Merging #106 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #106   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          446       456   +10     
  Branches        69        72    +3     
=========================================
+ Hits           446       456   +10     
Impacted Files Coverage Δ
src/desert/_make.py 100.00% <100.00%> (ø)
tests/test_make.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 872dcb9...762b8ed. Read the comment docs.

@@ -130,7 +130,7 @@ def class_schema(
attributes[field.name] = field_for_schema(
hints.get(field.name, field.type),
_get_field_default(field),
field.metadata,
{**meta, **field.metadata},
Copy link
Author

@wanderrful wanderrful Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we're using field.metadata second here, the parent Schema's meta dict will not overwrite a Nested field's potential metadata params.

@@ -289,7 +289,8 @@ def field_for_schema(

if field is None:
nested = forward_reference or class_schema(typ)
field = marshmallow.fields.Nested(nested)
params = {k: v for k, v in metadata.items() if type(k) is str}
Copy link
Author

@wanderrful wanderrful Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are filtering for string keys, we don't have to worry about the Sentinel key breaking any typing expectations downstream.

@@ -289,7 +289,8 @@ def field_for_schema(

if field is None:
nested = forward_reference or class_schema(typ)
field = marshmallow.fields.Nested(nested)
params = {k: v for k, v in metadata.items() if type(k) is str}
field = marshmallow.fields.Nested(nested, **params)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the Nested class accepts **kwargs in its signature, we don't have to worry about extraneous params breaking anything downstream.

@wanderrful
Copy link
Author

I think your Travis pipeline might have an issue, as it's blocking the final piece of the CI:

image

@wanderrful
Copy link
Author

wanderrful commented Sep 18, 2020

Posted PR #107 to fix the Travis CI bug

edit: Did not notice that that PR was already posted by someone else

@wanderrful
Copy link
Author

wanderrful commented Sep 25, 2020

@altendky Would you like me to refactor this into a feature flag that we can check for in the meta dict of the desert.Schema constructor? We could call it something like meta={"unknown": marshmallow.EXCLUDE, "propagate": True} and just assume false if none is specified.

@python-desert
Copy link
Collaborator

python-desert commented Sep 25, 2020

We need to know {"unknown": mm.EXCLUDE} when we're deserializing (calling load()). Marshmallow's design has us put that information two levels up: not when we're deserializing (calling the deserializer: load()), not when we're building the deserializer (instantiating the class: MySchema()), but when we're building the builder for the deserializer (instantiating the metaclass: class MySchema(mm.Schema)). The same is true for all Marshmallow Meta options. So it seems the Meta options are about de/serialization. If we want to parametrize the instantiation of the class, rather than the invocation of an instance method, I'd think that information should go elsewhere, not share namespace with the loading options. But maybe (1) my claim about what Meta is used for is wrong, or (2) we should break convention and mix parameter spaces anyway.

@altendky
Copy link
Member

It seems like maybe we should be able to specify some propagating settings and some not. Like separate meta and propagating_meta options. I dunno, it half feels like silly overkill but it is also explicit and flexible and shouldn't introduce any redundant specification, I don't think. But I really need to make time to get back into this project... :[

@python-desert, I'm not quite following, sorry. The meta class seems to be mostly about serialization with a little deserialization. As you said, it needs to be passed pretty earlier in this curious process of classes that get instantiated to seemingly hold no state (that I have dealt with) and then have methods called on them. But I'm not clear if an error is being pointed out about this proposal.

@wanderrful
Copy link
Author

wanderrful commented Sep 26, 2020

If it helps illustrate, I'd like to clarify the motivation for this proposed change:

The primary use case behind this feature is that when we work with DTOs (e.g. HTTP requests or RabbitMQ messages) and DAOs (e.g. Mongo documents or Redis maps), we cannot always trust that the response objects we receive back will perfectly align with what we expect. This is the reason why the {"unknown": marshmallow.EXCLUDE} feature exists. However, in practice the DTOs and DAOs that we work with often include nested objects.

The current design with Desert (and with Marshmallow) only allows us to use the marshmallow.EXCLUDE feature for our top-level dataclasses. We do not currently have a convenient way to clarify our intent when we create our schema classes that our nested dataclasses should also share that same functionality. I don't think it is DRY to have to mutate each schema that we create for this use case. It'd be much better, in my opinion, to have a way to specify that in the constructor/builder.


About @python-desert 's comment -- If we don't want to pass along these params, then another solution I could refactor this into would be to just change this to be a branch such that only the "unknown" key is passed down when we have our propagate flag present in the schema's builder method or class constructor.

@altendky
Copy link
Member

I agree it has value. It certainly plays into APIs where adding new items is allowed and considered backward compatible as old clients are expected to ignore them. And I'm sure various other cases.

@python-desert
Copy link
Collaborator

What happens when A.Meta.unknown and A.b.Meta.unknown disagree? A wins, B wins, error?

@python-desert
Copy link
Collaborator

The ones we'd want to push are the ones that are a policy, like "how do we handle unknown keys". Maybe there are others.

Also maybe worth talking about alternative places to set policy, such as schema = push_policy_recursively(MySchema(), unknown=mm.EXCLUDE) etc.

@wanderrful
Copy link
Author

What happens when A.Meta.unknown and A.b.Meta.unknown disagree? A wins, B wins, error?

This is a really good point that I hadn't considered fully. Because of this counterexample, maybe the right thing to do would be for this to be a setting that we can change in Desert for everything, rather than to just specify it by schema.

For my own personal use cases, I'd argue that the more permissive option should win but I'm not sure if that should be the policy.

@wanderrful
Copy link
Author

wanderrful commented Dec 28, 2020

Just an idea -- wouldn't it be sufficient to just subclass the dataclass in question to effectively duplicate it and ensure that its Schema can have multiple settings?

For example:

@dataclass
class MorePermissive:
  whatever: int

@dataclass
class LessPermissive(MorePermissive):
  pass

foo = desert.schema(MorePermissive)
bar = desert.schema(LessPermissive, meta={"unknown": marshmallow.EXCLUDE})

The benefit we get from this approach is that we can have multiple schemas for the same dataclass and use the most permissive version of it for function argument purposes.

@wanderrful wanderrful closed this Feb 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unknown Excluded in sublists
3 participants