Rewrite IDL v2 model loading #1317

mtdowling · 2022-07-27T19:35:31Z

Model loader is now done through a central Consumer that
receives "operations" from model files. These operations
are things like "addShape", "applyTrait", "addForwardReference",
etc. The consumer processes these events and then coordinates
the traits loaded in the model, shapes added, and how they
are all assembled together.

This approach is able to maintain shapes in-tact when they are
added to a ModelAssembler up until the point in which the shape
might be modified by an applyTrait statement or if another trait
is added that conflicts with it. This should improve performance
and reduce memory usage when building projections with SmithyBuild
since it will cut down on having to deconstruct and the reconstruct
previously built shapes.

Other things to note:

Ensure a version is emitted from the IDL even if one is not
set so that the assembler can know to upgrade 1.0 shapes.
Add an implicit @box trait to members that were nullable in
1.0 so that tooling can know if the member was considered
nullable in 1.0 semantics (and this check ignores the
@required trait to meet 1.0 expectations).
Now that there are more synthetic traits, the Upgrade12Command
needs to not try to rewrite and erase deprecated synthetic
traits since they don't actually appear in the model. No
new test case was needed.
Ensure that the @default trait is not allowed in 1.0 models.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

mtdowling · 2022-07-27T19:52:16Z

smithy-model/src/main/java/software/amazon/smithy/model/loader/TopologicalShapeSort.java

@@ -57,7 +65,14 @@ public void enqueue(Shape shape) {
     * @param dependencies Dependencies of the shape.
     */
    public void enqueue(ShapeId shape, Collection<ShapeId> dependencies) {
-        forwardDependencies.put(shape, new LinkedHashSet<>(dependencies));
+        if (dependencies.isEmpty()) {


I realized we can jump the line and just push straight into satisfiedShapes as an optimization

mtdowling · 2022-08-02T22:20:03Z

Rebased off of latest idl-2.0 branch changes

@required

Model loader is now done through a central Consumer that receives "operations" from model files. These operations are things like "addShape", "applyTrait", "addForwardReference", etc. The consumer processes these events and then coordinates the traits loaded in the model, shapes added, and how they are all assembled together. This approach is able to maintain shapes in-tact when they are added to a ModelAssembler up until the point in which the shape might be modified by an applyTrait statement or if another trait is added that conflicts with it. This should improve performance and reduce memory usage when building projections with SmithyBuild since it will cut down on have to deconstruct and the reconstruct previously built shapes. Other things to note: * Ensure a version is emitted from the IDL even if one if not set so that the assembler can know to upgrade 1.0 shapes. * Add an implicit @box trait to members that were nullable in 1.0 so that tooling can know if the member was considered nullable in 1.0 semantics (and this check ignores the @required trait to meet 1.0 expectations). * Now that there are more synthetic traits, the Upgrade12Command needs to not try to rewrite and erase deprecated synthetic traits since they don't actually appear in the model. No new test case was needed. * Ensure that the @default trait is not allowed in 1.0 models. * Ignore invalid doc comments within shapes * Ignore 1.0 deprecation warnings in test runners The Smithy test runner no longer requires 1.0 deprecations or trait deprecations to be listed explicitly. This allows existing model test suites to contine to run without needing to be updated. It also allows trait vendors to deprecated a trait without breaking consumers. To accommodate this, I modified all warnings about 1.0 models to use a "ModelDeprecation" event ID, including set deprecation.

We previously rewrote models that target smithy.api#Primtive* shapes to target their prelude counterparts that do not have the Primitive prefix. For example, smithy.api#PrimitiveInteger would become smithy.api#Integer. The rationale being that because optionality is now controlled on members rather than based on the shape targeted by a member, there is no longer a reason to use Primitive* shapes in 2.0 models. The problem is that rewriting the shape targets during the loading process can remove information that tooling might need to determine if a shape was considered optional in IDL 1.0 (usually in fringe cases). If a member previously targets PrimitiveInteger and is rewritten to target Integer, but the member is also marked as required, then the ModelUpgrader can't add the default trait to the member. If optionality isn't determined based on the required trait, then a tool has no way to know that a member that targets Integer previously targeted PrimitiveInteger, thereby losing information. Instead of removing Primitive* shapes, we will keep them in 2.0 and mark them as deprecated. To point this out to end users, I added validation (that really should have already existed) to detect when a shape refers to a deprecated shape. However, because we don't want to break previous test cases and because models should be able to deprecate shapes without breaking test cases, DeprecatedShape does not need to be explicitly handled by Smithy errorfile test runners. This matches the special casing added for DeprecatedTrait and ModelDeprecation.

This commit tightens the grammar of the IDL to be a bit more restrictive and less ambiguous. Of note: * Several rules were updated to require spaces between productions. For example `strutureFoo{}` is no longer technically valid. `usesmithy.api#Foo` is no longer valid. * control statements must be on a single line. For example, `$version:\n"1.0"` is no longer valid. Control statements must also be followed by a line break. * metadata statements must be on a single line. `metadata foo\n="bar"` is no longer valid. Metadata statements must be followed by a line break. * use statements must be on a single line too and followed by a line break. * Specify that newlines are allowed in strings * Updated grammar rules to CamelCase to be compatible with ABNF * Adopted RFC 7405 to specify case sensitive strings * Default value assignment and enum value assignment must occur on the same line. * Operation input, output, and errors is now specified more granularly and must be defined in this specific order. * Operation input, output, and error definition must be followed by a line break. * Assigning a default value or an enum value must be followed by a line break.

Members of list, map, union, and structure are now explicitly modeled rather than relying on a generic grammar for all members. Two new constraints: 1. Map keys, if defined, must be defined before map values. 2. Members that are not elided must be on the same line. `foo:\nBar` is no longer valid.

For some reason Windows tests are failing with anything to do with newlines, event though we attempt to normalize line endings. To unblock us, I just removed anything that caused line ending issues for now.

There are some teams within Amazon that used operation properties in an order other than input, output, errors (weird!). Rather than break them, we'll just allow any order in the grammar and parser and use post-parsing logic to ensure no duplicates.

This doesn't impact much, but would allow the reverse crawling of neighbors if given just this relationship.

The clearPendingDocs call was made after consuming traits that follow a previous enum member. Fixed by calling it in the right order.

To avoid breaking existing models, require whitespace after operation properties rather than a line break.

The box trait is no longer supported in IDL 2.0, so it was removed from the Smithy 2.0 prelude model. However, tooling that is built for Smithy 1.0 might look at shapes targeted by members for the box trait, and their logic would change because the box trait was removed from common shapes like Integer, Boolean, etc. To maintain backward compatibility, this commit adds a box trait to these prelude shapes using the prelude-1.0.smithy model that was previously only used to define the `@box` trait for 1.0 models. It now uses the 1.0 model version so that it can apply the box trait to prelude shapes. This change removes the need for some custom logic in ModelUpgrader too.

Let's keep NullableIndex as-is rather than change it's default behavior to look for v2 semantics. This ensures that no existing code will change behavior unless they want it to. The isNullable method is now marked as deprecated and continues to use v1 semantics, while the other newly introduced methods use v2 semantics. Added various test cases to ensure v1 and v2 compat.

If we don't know the version of a shape (like if a Model or a bunch of pre-built shapes are added to an assembler), then we should not attempt to upgrade them. If we do attempt to upgrade them, then we cause issues with things like projections in smithy-build and CLI, where we build a model, then pass that built model into each project, thereby causing the version to be unknown (it's detached from the original source file that contained the version information). If we do attempt to upgrade this kind of model again, then we can get into a situation where the following model: ``` $version: "2.0" namespace smithy.example structure Foo { bar: Bar } integer Bar ``` Is projected into the following, incorrect model: ``` { "smithy": "2.0", "shapes": { "smithy.example#Foo" { "type": "structure", "members": { "bar": { "target": "smithy.example#Bar", "traits": { "smithy.api#default": 0 } } } }, "smithy.example#Bar": { "type": "integer" } } } ``` Notice the incorrectly added `@default` trait. This would happen because when using Smithy 1.0 semantics, `integer Bar` has a default zero value because it isn't marked with the `@box` trait. No `@default` trait is added to the "source" model projection because at that point we know the version is 2.0. But when doing another projection, the loaded model is disconnected from the original version and would cause this issue. In addition to not doing upgrades on shapes with an unknown version, this change also stops adding synthetic box traits to members. That's no longer needed because we now keep a synthetic box trait on prelude shapes. When we removed those traits previously, it meant it was impossible to determine the 1.0 nullability of a shape unless we add the synthetic box trait to the member. Given this change, we no longer need to validate the box trait in the ModelUpgrade and can rather do the validation in the Version enum directly.

We actually do need synthetic box traits on members so that tooling that attempts to work with 1.0 and 2.0 models can accurately know the intended nullability semantics of a structure member after it has been loaded into memory. This is particularly important for structure members marked as required that target a prelude shape like Integer. Without a synthetic box trait on the member, tooling can't differentiate between a member that was considered nullable in v1 or non-nullable in v2 (the box trait on the prelude shape would be honored in v1 but discarded in v2). The trait on the member itself makes the 1.0 nullability explicit. So explicit in fact that we can even check for this trait and use it in the NullableIndex methods that understand 2.0 semantics. The box trait is only allowed in 1.0 models, so its present doesn't interfere with IDL 2.0 nullability semantics. We will still keep the IDL 1.0 isNullable method around because it will work even with models that weren't loaded through an assembler (such models skip the model upgrade process and have no synthetic box trait).

...urces/software/amazon/smithy/model/loader/invalid/metadata/not-newline-after-metadata.smithy

smithy-model/src/main/resources/software/amazon/smithy/model/loader/prelude.smithy

smithy-model/src/main/java/software/amazon/smithy/model/loader/MetadataContainer.java

* Fix IDL typo for "messag" * Add missing javadoc param * Fix selector ABNF tokens

mtdowling requested a review from a team as a code owner July 27, 2022 19:35

mtdowling commented Jul 27, 2022

View reviewed changes

mtdowling force-pushed the idl-2-loader branch 4 times, most recently from 170a783 to 9247e93 Compare July 30, 2022 03:15

mtdowling changed the title ~~Simplify model loading~~ Rewrite IDL v2 model loading Jul 30, 2022

mtdowling force-pushed the idl-2-loader branch 2 times, most recently from 0011422 to f209810 Compare August 2, 2022 22:19

mtdowling force-pushed the idl-2-loader branch from f209810 to f127adb Compare August 3, 2022 04:21

mtdowling added 4 commits August 3, 2022 09:26

Fix edge case in model upgrader

1765d12

syall mentioned this pull request Aug 4, 2022

Add validation to validator and suppression ids #1326

Closed

mtdowling added 2 commits August 4, 2022 15:36

Emit no rel from enum/intEnum to Unit

4258444

Relax apply statement grammar to allow ws before trait

9c5fbc7

mtdowling force-pushed the idl-2-loader branch from 92927a8 to 9c5fbc7 Compare August 4, 2022 23:53

mtdowling added 9 commits August 4, 2022 16:57

Avoid newline issues with Windows

a6407b1

For some reason Windows tests are failing with anything to do with newlines, event though we attempt to normalize line endings. To unblock us, I just removed anything that caused line ending issues for now.

Allow any operation property order

4706ce3

There are some teams within Amazon that used operation properties in an order other than input, output, errors (weird!). Rather than break them, we'll just allow any order in the grammar and parser and use post-parsing logic to ensure no duplicates.

Emit rels for MEMBER_CONTAINER for enum members

51ca2da

This doesn't impact much, but would allow the reverse crawling of neighbors if given just this relationship.

Fix enum member trait parsing

2c5e3fd

The clearPendingDocs call was made after consuming traits that follow a previous enum member. Fixed by calling it in the right order.

Use required WS after operation properties

f31234d

To avoid breaking existing models, require whitespace after operation properties rather than a line break.

Add CLIENT_CAREFUL mode to NullableIndex

a4c0764

mtdowling force-pushed the idl-2-loader branch from 87e6cf4 to 690a27b Compare August 7, 2022 02:03

mtdowling added 3 commits August 6, 2022 23:09

Reduce model upgrade checks to only struct members

6d72903

Update selector ABNF to use CamelCase

2841778

srchase reviewed Aug 9, 2022

View reviewed changes

mtdowling requested a review from srchase August 9, 2022 18:42

srchase approved these changes Aug 9, 2022

View reviewed changes

Make minor typo and doc fixes

faa1bd4

* Fix IDL typo for "messag" * Add missing javadoc param * Fix selector ABNF tokens

mtdowling force-pushed the idl-2-loader branch from 6c7ac7d to faa1bd4 Compare August 9, 2022 20:30

srchase approved these changes Aug 9, 2022

View reviewed changes

mtdowling merged commit 700141c into idl-2.0 Aug 9, 2022

mtdowling deleted the idl-2-loader branch August 10, 2022 17:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite IDL v2 model loading #1317

Rewrite IDL v2 model loading #1317

mtdowling commented Jul 27, 2022 •

edited by srchase

Loading

mtdowling Jul 27, 2022

mtdowling commented Aug 2, 2022

Rewrite IDL v2 model loading #1317

Rewrite IDL v2 model loading #1317

Conversation

mtdowling commented Jul 27, 2022 • edited by srchase Loading

mtdowling Jul 27, 2022

Choose a reason for hiding this comment

mtdowling commented Aug 2, 2022

mtdowling commented Jul 27, 2022 •

edited by srchase

Loading