Refactor providers.Schemas and add a global schema cache #33482

jbardin · 2023-07-06T15:23:53Z

The primary goal of this PR is a structural refactoring of the types used to store provider schemas. Previously there was the deprecated *terraform.ProviderSchema, which was an alias for providers.Schemas, and providers.GetProviderSchemaResponse which contained the same data but was structurally distinct from providers.Schemas. All of these were equally used throughout the codebase, including tests, so unfortunately there was no way to get to a unified the type without a large diff.

The result here is that we now have a single type for all provider schema representations: providers.ProviderSchema. This happens to be an alias for the internal rpc response type providers.GetProviderSchemaResponse, which includes all schema data including request diagnostics. While the diagnostics are not applicable in most situations, multiple concurrent calls to the provider each need to access the schema and need to handle the error path, so the cache must include the diagnostics for use within the provider client. The ProviderSchema type does also use the Schema struct for all schemas which happens to have a Version field not applicable in all cases, however keeping the type consistent and reusable without messy conversions outweighs the extra cruft in the type signature while leaving room open for extension.

We also lift the caching layer for schemas from core to providers.SchemaCache as a global store of provider's schemas. The second copy of the schemas in core was redundant, and the goal of the global cache is to reduce memory footprint and allow multiple instances of providers to share the previously loaded schema. While the cache is transparently used through the provider, we still have core check the cache when looking up a schema, because we may not even need to load a provider instance at all in some cases.

Writing to the cache is temporarily disabled, because some releases of providers require that GetProviderSchema is called first to initialize the provider, and a forthcoming ProviderCapability flag will be set by those that can be effectively cached. Even though there will be providers that cannot be globally cached, there is still no need for the old core caching layer, as it didn't prevent the GetProviderSchema call on every provider instance, and if it did work it would have broken those providers which require that as the first call (these providers' clients will still cache the schema themselves per instance).

Add a single global schema cache for providers. This allows multiple provider instances to share a single copy of the schema, and prevents loading the schema multiple times for a given provider type during a single command. This does not currently work with some provider releases, which are using GetProviderSchema to trigger certain initializations. A new server capability will be introduced to trigger reloading their schemas, but not store duplicate results.

Unify the struct used for Schemes and GetProviderSchemaResponse so that we can have a single cache which handles all schema access.

Use the global providers.SchemaCache and update all schema access to the providers.Schemas, except where the provider.GetProviderSchemaResponse type name would be expected. Some tests that reuse provider factories needed a little more careful handling. Change the fixed func to only reset the provider on the first call.

jbardin · 2023-07-06T15:31:58Z

internal/terraform/terraform_test.go

-			p.ReadDataSourceCalled = false
-			p.CloseCalled = false
-		}
+	if p, ok := rp.(*MockProvider); ok {


This has always been a very delicate and unreliable aspect of the mock provider. Resetting these on each factory call assumed that the same instance was used for an entire operation which isn't always true, and the new external caching makes that all the more apparent. We settle on making the provider instance reusable, but requiring a new closure for each context if you want to use these fields.

apparentlymart

This is of course too large for a thorough review, but I tried to pay closest attention to the specifics you mentioned in the PR description and those parts all seemed reasonable to me. I think the best way to identify any unexpected implications of this is to merge it into main and let it soak in our alpha releases during the v1.6 development period.

I was initially a little nervous about using GetProviderSchemaResponse as the main representation given that it's quite tightly coupled to the current wire protocol -- e.g. if we decided to switch to a different approach of requesting individual resource types only when needed rather than fetching everything all at once then GetProviderSchemaResponse might cease to exist altogether. But on reflection it seems like we'd still want to have somewhere to flatten all of that down into a single type, and the fact that you used a type alias elsewhere rather than mentioning GetProviderSchemaResponse directly means that we could just rename type GetProviderSchemaResponse to type ProviderSchemas if later we have more separation between the protocol representation and the internal representation.

My only remaining feedback about that, then, is that I'd like to see a common on type GetProviderSchemaResponse to make it explicit that this name should be used only for the type in the signature of the providers.Interface method, and that type ProviderSchemas should be used everywhere else. (I might even consider swapping it so that GetProviderSchemaResponse is the alias, to make it more likely that a new maintainer will just "accidentally" find the right name to use, but I'll leave that up to you!)

jbardin · 2023-07-06T19:45:35Z

Yes, I used the GetProviderSchemaResponse alias kind of as a compromise to reduce even more churn (because a huge amount of test code creates that structure directly), but once I started implementing it, it didn't feel like there was much of a reason to try and break that apart at all now. There is a still a bunch of test code which does leverage the fact that these are type aliases, but that would need to be refactored regardless of what approach we use, so there's no benefit to attacking it all at once, and the type system will catch all the cases for us if we ever separate the types. I'll add another comment better specifying the intentions of each type name though.

github-actions · 2023-07-06T20:06:45Z

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

Reference: #33482 The intention of this logic is to use the global provider schema cache if it is available for a given provider address based on the `GetProviderSchemaOptional` server capability. Previously the logic was errantly checking the named return early, which was not populated yet with the server response. This flips the logic to properly use the server response to determine whether the cache should be set, thereby still gating the caching based on the server capability. In a configuration with 19 `hashicorp/aws` provider instances and an associated data source, this drops the received `GetProviderSchema` calls from 39(?) to 1. Maximum RSS dropped from 1.22 GB to 189 MB.

github-actions · 2023-12-13T02:09:38Z

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

jbardin added 10 commits July 6, 2023 10:37

remove unused factory functions

d199d42

use the same struct for all schema access

9d97465

Unify the struct used for Schemes and GetProviderSchemaResponse so that we can have a single cache which handles all schema access.

update jsonprovider to use new schema type

d487ce2

update schemas type

4e8dd58

replace remaining types in tests

36d3413

s/Schemas/ProviderSchema/

cf5ebaf

disable cache for now

68d86e8

remove fixed FIXME

b9d8d99

jbardin commented Jul 6, 2023

View reviewed changes

jbardin marked this pull request as draft July 6, 2023 15:33

fix locking and race in test mocks

76332db

vercel bot deployed to Preview July 6, 2023 15:49 View deployment

jbardin marked this pull request as ready for review July 6, 2023 15:53

jbardin requested a review from a team July 6, 2023 15:54

apparentlymart approved these changes Jul 6, 2023

View reviewed changes

additional schema handling comment

a77baa8

vercel bot deployed to Preview July 6, 2023 19:56 View deployment

jbardin merged commit 6be6f69 into main Jul 6, 2023
4 checks passed

jbardin deleted the jbardin/schema-cache branch July 6, 2023 20:06

bflad mentioned this pull request Jul 7, 2023

Introduce Protocol Server Capability for Optional GetProviderSchema RPC hashicorp/terraform-plugin-go#310

Closed

bflad mentioned this pull request Aug 30, 2023

internal/plugin+internal/plugin6: Fix provider schema caching #33768

Closed

This was referenced Oct 9, 2023

[Enhancement]: Improve memory allocation in the provider hashicorp/terraform-provider-aws#33553

Open

[Bug]: Memory consumption has increased since v4.67.0 hashicorp/terraform-provider-aws#31722

Open

YakDriver mentioned this pull request Oct 11, 2023

[Bug]: Memory testing and reporting improvements hashicorp/terraform-provider-aws#33898

Open

github-actions bot locked as resolved and limited conversation to collaborators Dec 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor providers.Schemas and add a global schema cache #33482

Refactor providers.Schemas and add a global schema cache #33482

jbardin commented Jul 6, 2023 •

edited

Loading

jbardin Jul 6, 2023 •

edited

Loading

apparentlymart left a comment

jbardin commented Jul 6, 2023 •

edited

Loading

github-actions bot commented Jul 6, 2023

github-actions bot commented Dec 13, 2023

Refactor providers.Schemas and add a global schema cache #33482

Refactor providers.Schemas and add a global schema cache #33482

Conversation

jbardin commented Jul 6, 2023 • edited Loading

jbardin Jul 6, 2023 • edited Loading

Choose a reason for hiding this comment

apparentlymart left a comment

Choose a reason for hiding this comment

jbardin commented Jul 6, 2023 • edited Loading

github-actions bot commented Jul 6, 2023

github-actions bot commented Dec 13, 2023

jbardin commented Jul 6, 2023 •

edited

Loading

jbardin Jul 6, 2023 •

edited

Loading

jbardin commented Jul 6, 2023 •

edited

Loading