Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite IDL v2 model loading #1317

Merged
merged 20 commits into from
Aug 9, 2022
Merged

Rewrite IDL v2 model loading #1317

merged 20 commits into from
Aug 9, 2022

Conversation

mtdowling
Copy link
Member

@mtdowling mtdowling commented Jul 27, 2022

Model loader is now done through a central Consumer that
receives "operations" from model files. These operations
are things like "addShape", "applyTrait", "addForwardReference",
etc. The consumer processes these events and then coordinates
the traits loaded in the model, shapes added, and how they
are all assembled together.

This approach is able to maintain shapes in-tact when they are
added to a ModelAssembler up until the point in which the shape
might be modified by an applyTrait statement or if another trait
is added that conflicts with it. This should improve performance
and reduce memory usage when building projections with SmithyBuild
since it will cut down on having to deconstruct and the reconstruct
previously built shapes.

Other things to note:

  • Ensure a version is emitted from the IDL even if one is not
    set so that the assembler can know to upgrade 1.0 shapes.
  • Add an implicit @box trait to members that were nullable in
    1.0 so that tooling can know if the member was considered
    nullable in 1.0 semantics (and this check ignores the
    @required trait to meet 1.0 expectations).
  • Now that there are more synthetic traits, the Upgrade12Command
    needs to not try to rewrite and erase deprecated synthetic
    traits since they don't actually appear in the model. No
    new test case was needed.
  • Ensure that the @default trait is not allowed in 1.0 models.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mtdowling mtdowling requested a review from a team as a code owner July 27, 2022 19:35
@@ -57,7 +65,14 @@ public void enqueue(Shape shape) {
* @param dependencies Dependencies of the shape.
*/
public void enqueue(ShapeId shape, Collection<ShapeId> dependencies) {
forwardDependencies.put(shape, new LinkedHashSet<>(dependencies));
if (dependencies.isEmpty()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized we can jump the line and just push straight into satisfiedShapes as an optimization

@mtdowling mtdowling force-pushed the idl-2-loader branch 4 times, most recently from 170a783 to 9247e93 Compare July 30, 2022 03:15
@mtdowling mtdowling changed the title Simplify model loading Rewrite IDL v2 model loading Jul 30, 2022
@mtdowling mtdowling force-pushed the idl-2-loader branch 2 times, most recently from 0011422 to f209810 Compare August 2, 2022 22:19
@mtdowling
Copy link
Member Author

Rebased off of latest idl-2.0 branch changes

Model loader is now done through a central Consumer that
receives "operations" from model files. These operations
are things like "addShape", "applyTrait", "addForwardReference",
etc. The consumer processes these events and then coordinates
the traits loaded in the model, shapes added, and how they
are all assembled together.

This approach is able to maintain shapes in-tact when they are
added to a ModelAssembler up until the point in which the shape
might be modified by an applyTrait statement or if another trait
is added that conflicts with it. This should improve performance
and reduce memory usage when building projections with SmithyBuild
since it will cut down on have to deconstruct and the reconstruct
previously built shapes.

Other things to note:

* Ensure a version is emitted from the IDL even if one if not
  set so that the assembler can know to upgrade 1.0 shapes.
* Add an implicit @box trait to members that were nullable in
  1.0 so that tooling can know if the member was considered
  nullable in 1.0 semantics (and this check ignores the
  @required trait to meet 1.0 expectations).
* Now that there are more synthetic traits, the Upgrade12Command
  needs to not try to rewrite and erase deprecated synthetic
  traits since they don't actually appear in the model. No
  new test case was needed.
* Ensure that the @default trait is not allowed in 1.0 models.
* Ignore invalid doc comments within shapes
* Ignore 1.0 deprecation warnings in test runners

  The Smithy test runner no longer requires 1.0 deprecations
  or trait deprecations to be listed explicitly. This allows
  existing model test suites to contine to run without needing
  to be updated. It also allows trait vendors to deprecated a
  trait without breaking consumers.

  To accommodate this, I modified all warnings about 1.0 models to
  use a "ModelDeprecation" event ID, including set deprecation.
We previously rewrote models that target smithy.api#Primtive*
shapes to target their prelude counterparts that do not have the
Primitive prefix. For example, smithy.api#PrimitiveInteger would
become smithy.api#Integer. The rationale being that because optionality
is now controlled on members rather than based on the shape targeted by
a member, there is no longer a reason to use Primitive* shapes in 2.0
models.

The problem is that rewriting the shape targets during the loading
process can remove information that tooling might need to determine if
a shape was considered optional in IDL 1.0 (usually in fringe cases).

If a member previously targets PrimitiveInteger and is rewritten to
target Integer, but the member is also marked as required, then the
ModelUpgrader can't add the default trait to the member. If optionality
isn't determined based on the required trait, then a tool has no way
to know that a member that targets Integer previously targeted
PrimitiveInteger, thereby losing information.

Instead of removing Primitive* shapes, we will keep them in 2.0 and
mark them as deprecated. To point this out to end users, I added
validation (that really should have already existed) to detect when
a shape refers to a deprecated shape. However, because we don't want
to break previous test cases and because models should be able to
deprecate shapes without breaking test cases, DeprecatedShape does not
need to be explicitly handled by Smithy errorfile test runners. This
matches the special casing added for DeprecatedTrait and
ModelDeprecation.
This commit tightens the grammar of the IDL to be a bit more
restrictive and less ambiguous. Of note:

* Several rules were updated to require spaces between productions.
  For example `strutureFoo{}` is no longer technically valid.
  `usesmithy.api#Foo` is no longer valid.
* control statements must be on a single line. For example,
  `$version:\n"1.0"` is no longer valid. Control statements must
  also be followed by a line break.
* metadata statements must be on a single line. `metadata foo\n="bar"`
  is no longer valid. Metadata statements must be followed by a
  line break.
* use statements must be on a single line too and followed by a
  line break.
* Specify that newlines are allowed in strings
* Updated grammar rules to CamelCase to be compatible with ABNF
* Adopted RFC 7405 to specify case sensitive strings
* Default value assignment and enum value assignment must occur
  on the same line.
* Operation input, output, and errors is now specified more
  granularly and must be defined in this specific order.
* Operation input, output, and error definition must be followed by
  a line break.
* Assigning a default value or an enum value must be followed by a line
  break.
Members of list, map, union, and structure are now explicitly
modeled rather than relying on a generic grammar for all members.

Two new constraints:
1. Map keys, if defined, must be defined before map values.
2. Members that are not elided must be on the same line.
   `foo:\nBar` is no longer valid.
For some reason Windows tests are failing with anything to do with newlines,
event though we attempt to normalize line endings. To unblock us, I just
removed anything that caused line ending issues for now.
There are some teams within Amazon that used operation properties in an
order other than input, output, errors (weird!). Rather than break them,
we'll just allow any order in the grammar and parser and use post-parsing
logic to ensure no duplicates.
This doesn't impact much, but would allow the reverse crawling of
neighbors if given just this relationship.
The clearPendingDocs call was made after consuming traits that follow
a previous enum member. Fixed by calling it in the right order.
To avoid breaking existing models, require whitespace after operation
properties rather than a line break.
The box trait is no longer supported in IDL 2.0, so it was removed from
the Smithy 2.0 prelude model. However, tooling that is built for Smithy
1.0 might look at shapes targeted by members for the box trait, and their
logic would change because the box trait was removed from common shapes
like Integer, Boolean, etc. To maintain backward compatibility, this
commit adds a box trait to these prelude shapes using the
prelude-1.0.smithy model that was previously only used to define
the `@box` trait for 1.0 models. It now uses the 1.0 model version so
that it can apply the box trait to prelude shapes. This change removes
the need for some custom logic in ModelUpgrader too.
Let's keep NullableIndex as-is rather than change it's default behavior
to look for v2 semantics. This ensures that no existing code will
change behavior unless they want it to. The isNullable method is
now marked as deprecated and continues to use v1 semantics, while
the other newly introduced methods use v2 semantics. Added various
test cases to ensure v1 and v2 compat.
If we don't know the version of a shape (like if a Model
or a bunch of pre-built shapes are added to an assembler),
then we should not attempt to upgrade them. If we do attempt
to upgrade them, then we cause issues with things like
projections in smithy-build and CLI, where we build a model,
then pass that built model into each project, thereby causing
the version to be unknown (it's detached from the original
source file that contained the version information). If we
do attempt to upgrade this kind of model again, then we can
get into a situation where the following model:

```
$version: "2.0"
namespace smithy.example

structure Foo {
    bar: Bar
}

integer Bar
```

Is projected into the following, incorrect model:

```
{
    "smithy": "2.0",
    "shapes": {
        "smithy.example#Foo" {
            "type": "structure",
            "members": {
                "bar": {
                    "target": "smithy.example#Bar",
                    "traits": {
                        "smithy.api#default": 0
                    }
                }
            }
        },
        "smithy.example#Bar": {
            "type": "integer"
        }
    }
}
```

Notice the incorrectly added `@default` trait. This would happen because
when using Smithy 1.0 semantics, `integer Bar` has a default zero value
because it isn't marked with the `@box` trait. No `@default` trait is added
to the "source" model projection because at that point we know the version
is 2.0. But when doing another projection, the loaded model is disconnected
from the original version and would cause this issue.

In addition to not doing upgrades on shapes with an unknown version, this
change also stops adding synthetic box traits to members. That's no longer
needed because we now keep a synthetic box trait on prelude shapes. When
we removed those traits previously, it meant it was impossible to determine
the 1.0 nullability of a shape unless we add the synthetic box trait to the
member. Given this change, we no longer need to validate the box trait
in the ModelUpgrade and can rather do the validation in the Version enum
directly.
We actually do need synthetic box traits on members so that tooling that
attempts to work with 1.0 and 2.0 models can accurately know the intended
nullability semantics of a structure member after it has been loaded into
memory. This is particularly important for structure members marked as
required that target a prelude shape like Integer. Without a synthetic
box trait on the member, tooling can't differentiate between a member that
was considered nullable in v1 or non-nullable in v2 (the box trait on the
prelude shape would be honored in v1 but discarded in v2). The trait on
the member itself makes the 1.0 nullability explicit. So explicit in fact
that we can even check for this trait and use it in the NullableIndex
methods that understand 2.0 semantics. The box trait is only allowed in
1.0 models, so its present doesn't interfere with IDL 2.0 nullability
semantics. We will still keep the IDL 1.0 isNullable method around because
it will work even with models that weren't loaded through an assembler
(such models skip the model upgrade process and have no synthetic box trait).
@mtdowling mtdowling requested a review from srchase August 9, 2022 18:42
* Fix IDL typo for "messag"
* Add missing javadoc param
* Fix selector ABNF tokens
@mtdowling mtdowling merged commit 700141c into idl-2.0 Aug 9, 2022
@mtdowling mtdowling deleted the idl-2-loader branch August 10, 2022 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants