Use the original Node when creating traits #1047

mtdowling · 2022-01-10T21:11:47Z

This commit updates traits to use the originally provided node value in a model file so that the value is persisted as-is when calling toNode on a trait.

We don't currently do a lot, if any, validation in trait node provider methods. We then explicitly take the parts out of traits that we want and ignore the rest of the node object. This results in use dropping additional properties on traits rather than warning about their presence, which makes it harder to know when a property has a typo. For example, see https://github.com/awslabs/smithy/blob/main/smithy-model/src/main/java/software/amazon/smithy/model/traits/HttpTrait.java#L50. Notice that no additional property validation is done on the trait. This is common across most traits.

    public static final class Provider extends AbstractTrait.Provider {
        public Provider() {
                super(ID);
        }

        @Override
        public Trait createTrait(ShapeId target, Node value) {
            HttpTrait.Builder builder = builder().sourceLocation(value);
            ObjectNode members = value.expectObjectNode();
            builder.uri(UriPattern.parse(members.expectStringMember("uri").getValue()));
            builder.method(members.expectStringMember("method").getValue());
            builder.code(members.getNumberMember("code")
                                 .map(NumberNode::getValue)
                                 .map(Number::intValue)
                                 .orElse(200));
            return builder.build();
        }
    }

By keeping the original node as-is, the TraitValueValidator can automatically detect additional properties.

In addition to validation, this change also removes the need to duplicate the work of creating node values for traits. For example, 412,359 fewer node values are created when loading all AWS models.

The tradeoff of this change is that if a trait does any kind of normalization of values or setting of defaults or if a trait has logic around omitting empty lists, false, etc, then traits may need to override equals and hashCode since relying on toNode equality would render two traits inequal. That generally is fine, but it may be a problem if a model transformation is performed and you need to compare two models, you need to merge two models that might contain duplicate shapes as a result of a transform, etc.

An alternative to this PR is to add things like warnIfAdditionalProperties to every trait provider that takes objects. A middleground could be reached too, where we keep the original node values for scalars and lists of scalars, but recompute and normalize the node values of traits that use object nodes.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This commit updates traits to use the originally provided node value in a model file so that the value is persisted as-is when calling toNode on a trait. This prevents needing to duplicate the work of creating the trait and helps expose issues like typos in traits because invalid properties can be detected when validating traits against the model. The tradeoff of this change is that if a trait does any kind of normalization of values or setting of defaults or if a trait has logic around omitting empty lists, false, etc, then traits may need to override equals and hashCode since relying on toNode equality would render two traits inequal. That generally is fine, but it may be a problem if a model transformation is performed and you need to compare two models, you need to merge two models that migth contain duplicate shapes as a result of a transform, etc.

JordonPhillips · 2022-01-11T16:18:11Z

The PR itself looks good, but this seems like something that will be extremely easy to miss when developing a trait. Before the failure mode was we omitted excess properties when reserializing, but now it's a mysterious inequality issue that may never be apparent which could cause values to be overwritten.

mtdowling · 2022-01-11T23:37:53Z

Before the failure mode was we omitted excess properties when reserializing, but now it's a mysterious inequality issue that may never be apparent which could cause values to be overwritten.

If you don't pre-populate the node cache, then the equality rules are exactly as they are today (that is, normalized by using the toNode method).

Values wouldn't be overwritten or anything due to inequality. The one failure case this could introduce is a trait conflict. Here's a contrived example:

We take this change.
We populate the trait cache for enum traits.
A model defines a shape that has an enum trait, and it uses empty arrays for the tags property of each enum.
The model is modified in some way and reserialized. This reserailization results in the tags property now being omitted because it's empty. The toNode method does this and doesn't care that the original trait used an empty array node.
IFF someone had a dependency closure that looped in the original shape with the enum trait, and they also included the modified model, then they'd see trait conflicts between the enum traits because one used and empty array for tags and one omitted the tags array.

mtdowling requested a review from a team as a code owner January 10, 2022 21:11

JordonPhillips approved these changes Jan 12, 2022

View reviewed changes

kstich approved these changes Jan 20, 2022

View reviewed changes

mtdowling merged commit cec9352 into main Jan 20, 2022

mtdowling deleted the use-provider-traits branch April 8, 2022 05:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use the original Node when creating traits #1047

Use the original Node when creating traits #1047

mtdowling commented Jan 10, 2022

JordonPhillips commented Jan 11, 2022

mtdowling commented Jan 11, 2022

Use the original Node when creating traits #1047

Use the original Node when creating traits #1047

Conversation

mtdowling commented Jan 10, 2022

JordonPhillips commented Jan 11, 2022

mtdowling commented Jan 11, 2022