Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rosetta): reuse translation results from a cache #3101

Merged
merged 17 commits into from
Nov 2, 2021

Conversation

rix0rrr
Copy link
Contributor

@rix0rrr rix0rrr commented Oct 27, 2021

By fingerprinting types in an assembly and storing the fingerprints along with translations, we can reuse existing translations if we know nothing changed.

A snippet needs to be recompiled if:

  • Its source text changes: hash key will be different
  • Its fixture changes: fullSource will be different
  • The referenced types have changed: fingerprint will be different
  • The translator's capabilities have changed: translation version will be different

Since we have many duplicate snippets (especially cluster.enableSingleUserRotation() crops up a lot 🙂), I've had to encode the source location for the snippet into the key. Since that means that location encoding is now significant and needs to be correct, some refactoring around encoding snippet locations has become important.

Incidentally, this will also fix broken translations originating from monocdk from overwriting perfectly cromulent translations in the @aws-cdk packages.

Refactorings in this PR:

  • Snippet source locations are no longer a semi-free form string -- they are now a structured data type that captures the location accurately (and incidentally gets rendered to a filename-like string for interaction with the TS compiler and human display).
  • Snippet source locations now need to be passed in every place where some code is passed to Rosetta. These take the form of { api: 'member', fqn: '@ns/package.type', memberName: 'bucketName' } (an ApiLocation), and have been added to many, many places, including jsii-pacmak.
  • Translators now have a VERSION, which is used to invalidate cached translations if we change something about the implementation of the translator.
  • Existing tablet files may now need to span multiple jsii release versions, so they are now validated based on their schema version instead of the version of the tool that produced them.
  • TranslatedSnippet had a copy of all the data in a TranslatedSnippetSchema, and was copying back and forth. Change it to just hold on to a TranslatedSnippet internally. It now represents the operations that can be done to a TranslatedSnippetSchema.
  • The translation of ts.Diagnostic to RosettaDiagnostic is now done even earlier (closer to the code that deals with the TS compiler).
  • Snippets are no longer in an IterableIterator, but just an plain old array.
  • Snippets in Markdown are no longer identified by their snippet counter, but by their line number.
  • Mutable fields turned into readonly fields on a number of structs.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rix0rrr rix0rrr self-assigned this Oct 27, 2021
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Oct 27, 2021
@rix0rrr rix0rrr force-pushed the huijbers/rosetta-fingerprinting branch from 28d29fc to 5399c71 Compare October 27, 2021 16:25
@rix0rrr rix0rrr changed the title feat(rosetta): reuse compilation results from a cache feat(rosetta): reuse translation results from a cache Oct 27, 2021
…long with translations, we can reuse existing translations if we know nothing changed.

A snippet needs to be recompiled if:

- Its source text changes: hash will be different
- Its fixture changes: fullSource will be different
- The referenced types have changed: fingerprint will be different
…iagnostic down more in favor of RosettaDiagnostic
@rix0rrr rix0rrr force-pushed the huijbers/rosetta-fingerprinting branch from 23747dc to 6df1bfe Compare November 1, 2021 10:20
@rix0rrr rix0rrr force-pushed the huijbers/rosetta-fingerprinting branch from 6df1bfe to 2901b56 Compare November 1, 2021 13:00
@mergify
Copy link
Contributor

mergify bot commented Nov 2, 2021

The title of this Pull Request does not conform with [Conventional Commits] guidelines. It will need to be adjusted before the PR can be merged.
[Conventional Commits]: https://www.conventionalcommits.org

@rix0rrr rix0rrr marked this pull request as ready for review November 2, 2021 12:50
@rix0rrr rix0rrr requested a review from a team November 2, 2021 12:50
Copy link
Contributor

@njlynch njlynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, my eyes did glaze over a bit here at one point, but I think I was paying attention for all the salient points.

🤞

}

/**
* Translate all snippets
*
* We are now always using workers, as we are targeting Node 12+.
*/
async function translateAll(
snippets: IterableIterator<TypeScriptSnippet>,
export async function translateAll(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just exported for testing? Might be worth a comment if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was originally, no longer needs to be. Good catch.

* Bump this when you change something in the implementation to invalidate
* existing cached translations.
*/
public static readonly VERSION = '1';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for (future) thought -- this feels very much like a good intentions thing. Could we mechanize this a bit better somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Not sure!

public get didCompile() {
return this._didCompile;
}
public readonly snippet: TranslatedSnippetSchema;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~ The naming here is a bit odd. TranslatedSnippet.snippet, which returns a TranslatedSnippetSchema, which reads to me like the snippet itself. It leads to a lot of snippet.snippet in the code, which is never great...

It's almost like TranslatedSnippet should implement TranslatedSnippetSchema. Am I off there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the names would be better to say TranslatedSnippetOperations and TranslatedSnippet. The class represents the operations on the pure-data object.

But I do need the ability to get the pure-data object out in a lightweight way.

@mergify
Copy link
Contributor

mergify bot commented Nov 2, 2021

Thank you for contributing! ❤️ I will now look into making sure the PR is up-to-date, then proceed to try and merge it!

@mergify mergify bot added the pr/ready-to-merge This PR is ready to be merged. label Nov 2, 2021
@mergify
Copy link
Contributor

mergify bot commented Nov 2, 2021

Merging (with squash)...

@mergify
Copy link
Contributor

mergify bot commented Nov 2, 2021

Merging (with squash)...

@mergify mergify bot merged commit b08a50e into main Nov 2, 2021
@mergify mergify bot deleted the huijbers/rosetta-fingerprinting branch November 2, 2021 17:09
@mergify
Copy link
Contributor

mergify bot commented Nov 2, 2021

Merging (with squash)...

@mergify mergify bot removed the pr/ready-to-merge This PR is ready to be merged. label Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution/core This is a PR that came from AWS.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants