schemas: Break down schema generation #1071
This avoids the retries of init in most cases which would be caused by capacity (network, CPU, memory) starvation
vfsgen used GZIP compression by default, but we abandoned it in favour of the stdlib embed package in #1070. This alone resulted in an 80M binary (compared to 18M before). Breaking down schemas into individual files would further increase the binary size to 160M. In either case this is well over the limit of what we can pack into a VSIX (VS Code extension), currently 30MB. Using gzip compression can bring the size down to 19M. The sizes mentioned reflect darwin/arm64, but differences on other platforms should be similar. Crucially, the compression doesn't seem to affect the time to load the (now compressed) file into memory. This remains between <1ms and 130ms.
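As a rough illustration of the trade-off described in this commit message, here is a minimal sketch of gzipping a schema payload before embedding it and decompressing it into memory at load time. The function names `compress` and `decompress` and the sample JSON are assumptions made for the example, not code from this PR.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips raw schema bytes, as a generator might do before
// writing the file that gets embedded (hypothetical helper).
func compress(raw []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reads a gzipped payload fully into memory
// (hypothetical helper for the load path).
func decompress(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Repetitive JSON, standing in for a provider schema.
	raw := bytes.Repeat([]byte(`{"attr":{"type":"string"}}`), 1000)

	gz, err := compress(raw)
	if err != nil {
		panic(err)
	}
	out, err := decompress(gz)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw: %d bytes, gzipped: %d bytes, round trip ok: %v\n",
		len(raw), len(gz), bytes.Equal(raw, out))
}
```

Provider schemas are highly repetitive JSON, which is plausibly why gzip recovers most of the size increase; the actual ratios are the ones quoted in the commit message, not something this sketch measures.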
Nice work! Just a minor thing in the retry logic.
Displaying some kind of summary after running go generate ... would be nice, but we can add something later.
Tier | Count | Embedded | Errors |
---------|-------|-----------|--------|
official | 34 | 34 (100%) | 0 |
partner | 230 | 200 (87%) | 30 |
---------|-------|-----------|--------|
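A summary like the one suggested in this review comment could be produced with the standard library's `text/tabwriter`. The sketch below is only an assumption about how it might look; the `tierStats` type and `formatSummary` function are hypothetical names, and the numbers are the ones from the table above.

```go
package main

import (
	"fmt"
	"strings"
	"text/tabwriter"
)

// tierStats is a hypothetical per-tier tally collected during generation.
type tierStats struct {
	Tier     string
	Count    int // providers requested
	Embedded int // providers whose schema was embedded
}

// failed returns how many providers ended without an embedded schema.
func (s tierStats) failed() int { return s.Count - s.Embedded }

// percent returns the embedded share, rounded to the nearest integer.
func (s tierStats) percent() int {
	if s.Count == 0 {
		return 0
	}
	return (s.Embedded*100 + s.Count/2) / s.Count
}

// formatSummary renders the tallies as an aligned, pipe-separated table.
func formatSummary(stats []tierStats) string {
	var sb strings.Builder
	tw := tabwriter.NewWriter(&sb, 0, 0, 1, ' ', 0)
	fmt.Fprintln(tw, "Tier\t| Count\t| Embedded\t| Errors\t|")
	for _, s := range stats {
		fmt.Fprintf(tw, "%s\t| %d\t| %d (%d%%)\t| %d\t|\n",
			s.Tier, s.Count, s.Embedded, s.percent(), s.failed())
	}
	tw.Flush() // writes the aligned rows into sb
	return sb.String()
}

func main() {
	fmt.Print(formatSummary([]tierStats{
		{Tier: "official", Count: 34, Embedded: 34},
		{Tier: "partner", Count: 230, Embedded: 200},
	}))
}
```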
Closes #990
This addresses a few different problems, as originally outlined in the linked issue.
Runtime Memory Usage
As tested on macOS (M1 Pro)
Previously (v0.29.2) empty config: ~572 MB
Previously (v0.29.2) config with hashicorp/random: ~572 MB
Previously (v0.29.2) config with hashicorp/aws: ~572 MB
After (this PR) empty config: ~9.8 MB
After (this PR) config with hashicorp/random: ~12 MB
After (this PR) config with hashicorp/aws: ~70 MB
Launch Time
Previously (v0.29.2) initialize request/response time: ~2 s
After (this PR) initialize request/response time: 1-3 ms
This is a result of lazily loading each provider only when it's necessary (when we encounter a relevant provider requirement). The cost of "lazy-loading" a provider at runtime can be anywhere between <1ms (e.g. hashicorp/random) and ~130ms (e.g. hashicorp/aws), which should still be fast enough for the user not to notice a difference.
Schema Generation
Time to generate schemas
As measured in the GitHub Actions CI (Ubuntu latest)
Previously (recent CI run) 11-12 minutes
After (this PR) 3 minutes
This is mostly a result of running init for each provider individually and doing so in parallel. Terraform itself does not parallelise provider installation as part of terraform init.
Platform-Specific Provider Ignorelist
The schema generation, which runs as part of CI for every release and PR and can also run locally, is subject to OS/arch requirements. Previously we maintained a long list of providers which were known to be unavailable for certain platforms, just to make generation work on these platforms. Such a list can easily become outdated the moment it is committed, as the Registry is the only source of truth (i.e. a provider may release compatible artifacts the day after we put it on the ignorelist).
This PR addresses the problem by breaking down the generation: instead of running init for a single giant config with all providers, we first query the Registry API and filter out any providers which we know are not available for the platform where we're running.
Broken Provider Installations
In order to obtain the schema for each provider, we first have to install it. The process of installation can break at any time for a few reasons, of which the most common ones are:
This PR addresses most of these problems by retrying init for each provider individually (5 times, with a 2-second backoff in between).
Additionally, we now treat even the final failures (after 5 unsuccessful attempts) as soft failures, which just cause the schema for the provider to not be included, while the CI still succeeds and schema embedding still works. There is a downside to this: a widespread temporary outage could cause us to skip all providers at release time and release an LS which has no embedded schemas. This could be mitigated by careful observation and a potential follow-up release once the outage is resolved.
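The per-provider, parallel init with retries and soft failures described above can be sketched roughly as follows. This is not the code from this PR: `initFunc`, `retryInit` and `installAll` are hypothetical names, the init step is a stand-in function rather than a real terraform invocation, and the demo uses a short backoff instead of the 2 seconds mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errSimulated is a stand-in for any installation error.
var errSimulated = errors.New("simulated init failure")

// initFunc stands in for running init for a single provider (hypothetical).
type initFunc func(provider string) error

// retryInit calls run up to attempts times, sleeping backoff between tries.
func retryInit(provider string, run initFunc, attempts int, backoff time.Duration) error {
	var err error
	for i := 1; i <= attempts; i++ {
		if err = run(provider); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(backoff)
		}
	}
	return fmt.Errorf("%s: giving up after %d attempts: %w", provider, attempts, err)
}

// installAll runs retryInit for every provider in parallel. Final failures
// are collected and returned instead of aborting, i.e. they are "soft".
func installAll(providers []string, run initFunc, attempts int, backoff time.Duration) map[string]error {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	failed := make(map[string]error)
	for _, p := range providers {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			if err := retryInit(p, run, attempts, backoff); err != nil {
				mu.Lock()
				failed[p] = err
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return failed
}

func main() {
	var mu sync.Mutex
	calls := map[string]int{}
	// flaky succeeds on the third call, except for one provider that never installs.
	flaky := func(provider string) error {
		mu.Lock()
		defer mu.Unlock()
		calls[provider]++
		if provider == "example/broken" || calls[provider] < 3 {
			return errSimulated
		}
		return nil
	}

	providers := []string{"hashicorp/random", "hashicorp/aws", "example/broken"}
	// 5 attempts as in the PR description; 10ms backoff only to keep the demo fast.
	failed := installAll(providers, flaky, 5, 10*time.Millisecond)
	for p, err := range failed {
		fmt.Printf("skipping schema for %s: %v\n", p, err)
	}
	fmt.Printf("installed %d of %d providers\n", len(providers)-len(failed), len(providers))
}
```

Returning the failures as a map rather than a single error keeps the caller free to decide whether a given number of skipped providers is acceptable.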
We could also add some checks to verify that e.g. 80-90% of providers were installed correctly, but I'd prefer to leave this for another PR.
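For illustration only, such a check might be as small as the sketch below. The function name `checkInstallRatio` and the 0.8 threshold are assumptions taken from the "80-90%" figure above; nothing like this is part of this PR.

```go
package main

import "fmt"

// checkInstallRatio returns an error when fewer than minRatio of the
// requested providers were installed (hypothetical release guard).
func checkInstallRatio(installed, total int, minRatio float64) error {
	if total == 0 {
		return fmt.Errorf("no providers were requested")
	}
	ratio := float64(installed) / float64(total)
	if ratio < minRatio {
		return fmt.Errorf("only %d of %d providers installed (%.0f%%), want at least %.0f%%",
			installed, total, ratio*100, minRatio*100)
	}
	return nil
}

func main() {
	// 200 of 230 is roughly 87%, which passes an 80% threshold.
	if err := checkInstallRatio(200, 230, 0.8); err != nil {
		fmt.Println("release blocked:", err)
		return
	}
	fmt.Println("enough providers installed")
}
```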