schemas: Break down schema generation #1071

Merged on Oct 4, 2022 (14 commits)


@radeksimko (Member) commented Sep 13, 2022

Closes #990

This addresses a few different problems, as originally outlined in the linked issue.

Runtime Memory Usage

As tested on macOS (M1 Pro)

Previously (v0.29.2) empty config: ~572 MB
Previously (v0.29.2) config with hashicorp/random: ~572 MB
Previously (v0.29.2) config with hashicorp/aws: ~572 MB

After (this PR) empty config: ~9.8 MB
After (this PR) config with hashicorp/random: ~12 MB
After (this PR) config with hashicorp/aws: ~70 MB

Launch Time

Previously (v0.29.2) initialize request/response time: ~2 s
After (this PR) initialize request/response time: 1-3 ms

This is a result of lazily loading each provider only when it is necessary (i.e. when we encounter a relevant provider requirement). The cost of "lazy-loading" a provider at runtime can be anywhere between <1ms (e.g. hashicorp/random) and ~130ms (e.g. hashicorp/aws), which should still be fast enough for the user not to notice a difference.
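A minimal sketch of the lazy-loading idea described above, not the actual code from this PR. The `lazyStore` type, the `Schema` placeholder and the `load` callback are hypothetical names used only for illustration; the point is that a schema is decoded on first request and served from memory afterwards.

```go
package main

import (
	"fmt"
	"sync"
)

// Schema is a placeholder for a decoded provider schema (hypothetical type).
type Schema struct {
	Provider string
}

// lazyStore loads each provider schema at most once, on first request.
type lazyStore struct {
	mu     sync.Mutex
	loaded map[string]*Schema
	load   func(addr string) (*Schema, error) // hypothetical loader, e.g. reading an embedded file
	calls  int                                // number of real loads, kept for the demo
}

func newLazyStore(load func(string) (*Schema, error)) *lazyStore {
	return &lazyStore{loaded: make(map[string]*Schema), load: load}
}

// Get returns the cached schema, or loads and caches it on first use.
func (s *lazyStore) Get(addr string) (*Schema, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if sch, ok := s.loaded[addr]; ok {
		return sch, nil
	}
	s.calls++
	sch, err := s.load(addr)
	if err != nil {
		return nil, err
	}
	s.loaded[addr] = sch
	return sch, nil
}

func main() {
	store := newLazyStore(func(addr string) (*Schema, error) {
		// Stand-in for the expensive part (reading and decoding a schema).
		return &Schema{Provider: addr}, nil
	})
	for _, addr := range []string{"hashicorp/random", "hashicorp/random", "hashicorp/aws"} {
		sch, err := store.Get(addr)
		if err != nil {
			fmt.Println("load failed:", err)
			continue
		}
		fmt.Println("schema ready for", sch.Provider)
	}
	fmt.Println("loads performed:", store.calls)
}
```

With this shape, a config that never mentions a provider never pays for loading its schema, which is consistent with the memory figures above.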

Schema Generation

Time to generate schemas

As measured in the GitHub Actions CI (Ubuntu latest)

Previously (recent CI run) 11-12 minutes
After (this PR) 3 minutes

This is mostly a result of running init for each provider individually and doing so in parallel. Terraform itself does not parallelise provider installation as part of terraform init.

Platform-Specific Provider Ignorelist

Schema generation, which runs in CI for every release and PR and can also run locally, is subject to OS/arch requirements. Previously we maintained a long list of providers known to be unavailable for certain platforms, just to make generation work on those platforms. Such a list can become outdated the moment it is committed, because the Registry is the only source of truth (i.e. a provider may release compatible artifacts the day after we put it on the ignorelist).

This PR addresses the problem by breaking down the generation: instead of running init for a single giant config with all providers, we first query the Registry API and filter out any providers which we know are not available for the platform we're running on.
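To illustrate the filtering step, here is a hedged sketch that checks a provider's platform list against the current OS/arch. The struct fields mirror the `versions` / `platforms` / `os` / `arch` shape of the provider versions response in the Terraform provider registry protocol as I understand it; the function name `supportsPlatform` and the rule "any version with a matching platform counts" are assumptions of this example, and the PR may well apply a different rule (e.g. only the latest version). The HTTP request itself is omitted and replaced with a sample body.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
)

// platform is one OS/arch pair a provider version is built for.
type platform struct {
	OS   string `json:"os"`
	Arch string `json:"arch"`
}

type providerVersion struct {
	Version   string     `json:"version"`
	Platforms []platform `json:"platforms"`
}

type versionsResponse struct {
	Versions []providerVersion `json:"versions"`
}

// supportsPlatform reports whether any listed version ships a build
// for the given goos/goarch pair.
func supportsPlatform(body []byte, goos, goarch string) (bool, error) {
	var resp versionsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	for _, v := range resp.Versions {
		for _, p := range v.Platforms {
			if p.OS == goos && p.Arch == goarch {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	// Sample body standing in for a Registry API response.
	body := []byte(`{"versions":[{"version":"1.0.0","platforms":[{"os":"linux","arch":"amd64"}]}]}`)
	ok, err := supportsPlatform(body, runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("available on %s/%s: %v\n", runtime.GOOS, runtime.GOARCH, ok)
}
```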

Broken Provider Installations

In order to obtain the schema for each provider, we first have to install it. The installation can break at any time for a few reasons, of which the most common ones are:

  • Provider maintainers manually yank or change artifacts, or break the release process in some other way
  • GitHub outage
  • Issues on the network path between the GitHub Actions environment and GitHub Releases hosting the artifacts
  • Terraform Registry outage

This PR addresses most of these problems by retrying init for each provider individually (5 times, with a 2-second backoff in between).

Additionally, we now treat even the final failure (after 5 unsuccessful attempts) as a soft failure: the schema for that provider is not included, but CI still succeeds and schema embedding still works. There is a downside to this: a widespread temporary outage could cause us to skip all providers at release time and release an LS which has no embedded schemas. This could be mitigated by careful observation and a potential follow-up release once the outage is resolved.

We could also add some checks to verify that e.g. 80-90% of providers were installed correctly, but I'd prefer to leave this for another PR.
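Such a check might look like the sketch below. It is purely illustrative, since the PR explicitly leaves this for later; `checkCoverage`, the 80% threshold and the counts in `main` are assumptions made up for the example.

```go
package main

import "fmt"

// checkCoverage returns an error when fewer than minRatio of the
// attempted providers ended up with an embedded schema.
func checkCoverage(embedded, total int, minRatio float64) error {
	if total == 0 {
		return fmt.Errorf("no providers were attempted")
	}
	ratio := float64(embedded) / float64(total)
	if ratio < minRatio {
		return fmt.Errorf("only %d of %d providers embedded (%.0f%%), want at least %.0f%%",
			embedded, total, ratio*100, minRatio*100)
	}
	return nil
}

func main() {
	// Illustrative counts only: individual failures stay soft,
	// but a mass failure turns into a hard error.
	if err := checkCoverage(95, 100, 0.8); err != nil {
		fmt.Println("generation failed:", err)
		return
	}
	fmt.Println("coverage is acceptable")
}
```

This keeps single-provider failures soft while still guarding against the widespread-outage scenario described above.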

@radeksimko radeksimko added the enhancement New feature or request label Sep 13, 2022
@radeksimko radeksimko self-assigned this Sep 13, 2022
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch 11 times, most recently from 92436e0 to 1809cbf Compare September 22, 2022 13:13
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch 7 times, most recently from 845f298 to c72cc2e Compare September 26, 2022 12:52
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch 8 times, most recently from a987815 to 22f1564 Compare September 26, 2022 17:08
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch 11 times, most recently from ba9e389 to c0a1d67 Compare October 3, 2022 13:33
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch from c0a1d67 to 80a4916 Compare October 3, 2022 15:56
vsfsgen used GZIP compression by default, but we abandoned it in favour of the stdlib embed package in #1070. That change alone resulted in an 80 MB binary (compared to 18 MB before). Breaking schemas down into individual files would further increase the binary size to 160 MB. In either case this is well over the limit of what we can pack into a VSIX (VS Code extension), currently 30 MB.

Using gzip compression can bring the size down to 19 MB. The sizes mentioned reflect darwin/arm64, but differences on other platforms should be similar.

Crucially, the compression doesn't seem to affect the time to load the (now compressed) file into memory.
This remains between <1ms and 130ms.
@radeksimko radeksimko force-pushed the f-schemagen-breakdown branch from 2037051 to 48f92ca Compare October 4, 2022 13:10
@dbanck (Member) left a comment

Nice work! Just a minor thing in the retry logic.

Displaying some kind of summary after running go generate ... would be nice, but we can add something later.

Tier     | Count |  Embedded | Errors |
---------|-------|-----------|--------|
official |    34 | 34 (100%) |      0 |
partner  |   230 | 200 (87%) |     30 |
---------|-------|-----------|--------|

github-actions bot commented Nov 4, 2022

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2022

Linked issue: Lazy-load embedded provider schemas to improve performance