-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
states: New packages for state, state files, and state managers #18223
Conversation
We're going to need to revisit the concurrency on the state in this round. The reason for the internal state locks was that the sidecar lock used during evaluation is not sufficient to protect the state in all places where it's accessed (this also applies to the Diff). Removal of those lock values from the evaluation context was something I had planned to so, because at this point they were only a false sense of security. I have yet to review the internals in detail, but the unit tests show 42 new data races for me with the new states. IIRC the primary culprit for breaking the synchronization contracts was the interpolator, and the new races seem to confirm that now that the entry points are entirely confined to the hclsyntax package, so it may be feasible to lock around it here. My main concern here is that the pattern of having the lock separate from the structure leaves too many chances for unsynchronized access. Having a fully synchronized data structure, only accessed via mutator methods, forces all usage of the data to be synchronized. |
I've generally leaned towards "caller handles locking" for stuff like this because synchronization needs can vary by caller, and I feel like this follows the lead set by Go itself in leaving the caller to handle synchronization of access to maps and slices. If we find that more granular locking is needed then I'd still lean towards having the subsystem that causes that need deal with the additional complexity that implies, and have If we wanted to go down the road of having locking be handled internally in the state structures, I think we'd need to make all of the struct attributes unexported and force all read accesses through methods as well. I don't think that's necessarily bad, but it's a pretty big shift from how we deal with things today, and I expect we'd end up with a lot more methods in the state package to deal with multi-step operations that need to happen "atomically". I'd like to propose a compromise: rather than introducing all this additional complexity into the state models themselves, perhaps we can have an access layer that wraps a What do you think? |
That's really only for primitives, and there is plenty of precedent for complex data types that are safe for concurrent access via methods, e.g. |
Apologies if this isn't the most helpful comment, but the above both feels and smells right to me. I know that I'm saying this from the perspective of someone who doesn't know the internals, but my instinct says state is too critical to leave that responsibility for locking to a caller. |
My interpretation of the distinction has been as more between data objects vs. "actor" objects, as opposed to primitive vs. non-primitive. For example, while However, I take the point that our state object sits somewhere between the two extremes here: the higher-level portions of it are generally acted on rather than accessed, like With that in mind, I've pushed here a new type I think this is a nice compromise because it gives us the safe API we need to implement the operations during the graph walk (though we'll probably need to add some additional methods along the way once we start integrating) but still retains the convenience of working directly with the data structures in some cases, at the expense of some additional copying. For example, during an eval tree for a resource instance I expect This new wrapper can also take care of some housekeeping details that require atomicity, such as removing a module from the state automatically if an operation leaves it empty. I decided to reframe the "move current object into deposed" operation as one of these higher-level operations too, since it's making a sequence of coordinated changes with multiple objects. |
9e1e80b
to
c3b807c
Compare
states/import.go
Outdated
// expressions and is compared with configuration when producing a diff. | ||
Value cty.Value | ||
|
||
// Internal corresponds to the field of the same name on RemoteObject, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
field name out of sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yes, thanks! I didn't end up wiring up any other stuff related to import handling yet, and so I wasn't really paying close attention to this type. I'll fix this docstring up for now, but I expect we'll want to fuss with this some more later once the interactions between this and the import operation become clearer.
|
||
const ( | ||
// ObjectReady is an object status for an object that is ready to use. | ||
ObjectReady ObjectStatus = 'R' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While having unset mean "not-ready" is logical if this were a boolean value, I feel that having 3 states describes by 2 values is a bit awkward. Maybe we should have a ObjectPending = 'P'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not intending "pending" to be represented at all yet, because we don't have any support for it elsewhere. I expected we'd add it only when we're ready to use it, since I don't think we know 100% how the behavior of that will work yet, and it won't require any change to the shape of these data structures.
Did I imply in the docs somewhere that there was supposed to be a third value in this enum right now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, nothing here was incorrect. IIRC when we discussed handling an interrupted apply, I agreed that "pending" could just as well be represented as "!ready", so I was incorrectly applying that idea here.
If you intended to add other states as needed, then ignore my first comment ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great overall, just a couple comments inline
c3b807c
to
40c8de3
Compare
Our previous state models in the "terraform" package had a few limitations that are addressed here: - Instance attributes were stored as map[string]string with dot-separated keys representing traversals through a data structure. Now that we have a full type system, it's preferable to store it as a real data structure. - The existing state structures skipped over the "resource" concept and went straight to resource instance, requiring heuristics to decide whether a particular resource should appear as a single object or as a list of objects when used in configuration expressions. - Related to the previous point, the state models also used incorrect terminology where "ResourceState" was really a resource instance state and "InstanceState" was really the state of a particular remote object associated with an instance. These new models use the correct names for each of these, introducing the idea of a "ResourceInstanceObject" as the local record of a remote object associated with an instance. This is a first pass at fleshing out a new model for state. Undoubtedly there will be further iterations of this as we work on integrating these new models into the "terraform" package. These new model types no longer serve double-duty as a description of the JSON state file format, since they are for in-memory use only. A subsequent commit will introduce a separate package that deals with persisting state to files and reloading those files later.
Whereas the parent directory "states" contains the models that represent state in memory, this package's responsibility is in serializing a subset of that data to a JSON-based file format and then reloading that data back into memory later. For reading, this package supports state file formats going back to version 1, using lightly-adapted versions of the migration code previously used in the "terraform" package. State data is upgraded to the latest version step by step and then transformed into the in-memory state representation, which is distinct from any of the file format structs in this package to enable these to evolve separately. For writing, only the latest version (4) is supported, which is a new format that is a slightly-flattened version of the new in-memory state models introduced in the prior commit. This format retains the outputs from only the root module and it flattens out the module and instance parts of the hierarchy by including the identifiers for these inside the child object. The loader then reconstructs the multi-layer structure we use for more convenient access in memory. For now, the only testing in this package is of round-tripping different versions of state through a read and a write, ensuring the output is as desired. This exercises all of the reading, upgrading, and writing functions but should be augmented in later commits to improve coverage and introduce more focused tests for specific parts of the functionality.
This idea of a "state manager" was previously modelled via the confusingly-named state.State interface, which we've been calling a "state manager" only in some local variable names in situations where there were also *terraform.State variables. As part of reworking our state models to make room for the new type system, we also need to change what was previously the state.StateReader interface. Since we've found the previous organization confusing anyway, here we just copy all of those interfaces over into statemgr where we can make the relationship to states.State hopefully a little clearer. This is not yet a complete move of the functionality from "state", since we're not yet ready to break existing callers. In a future commit we'll turn the interfaces in the old "state" package into aliases of the interfaces in this package, and update all the implementers of what will by then be statemgr.Reader to use *states.State instead of *terraform.State. This also includes an adaptation of what was previously state.LocalState into statemgr.FileSystem, using the new state serialization functionality from package statefile instead of the old terraform.ReadState and terraform.WriteState.
This is a wrapper around State that is able to perform higher-level manipulations (at the granularity of the entire state) in a concurrency-safe manner, using the lower-level APIs exposed by State and all of the types it contains. The granularity of a SyncState operation roughly matches the granularity off a state-related EvalNode in the "terraform" package, performing a sequence of more primitive operations while guaranteeing atomicity of the entire change. As a compromise for convenience of usage, it's still possible to access the individual state data objects via this API, but they are always copied before returning to ensure that two distinct callers cannot have data races. Callers should access the most granular object possible for their operation.
40c8de3
to
f5e25ff
Compare
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
As part of the ongoing effort to implement a consistent value type system across all of Terraform, we need to revamp our state format to properly represent structured data. While we could've made a more minimal change here to just swap out the representations of resource attributes and output values, we're also going to take this opportunity (since we're making a breaking change anyway) to lay the groundwork for some other changes expected in forthcoming releases.
Following the lead of how we reorganized configuration loading in an earlier commit, here we have some new packages that are intended to eventually replace the functionality of some other components that are currently mixed across various other packages. The intent here is to have a similar three-layer architecture as for configuration:
hashicorp/terraform/configs
+hashicorp/hcl2/hcl
hashicorp/terraform/states
hashicorp/terraform/configs/configload
hashicorp/terraform/states/statemgr
hashicorp/hcl2/hclparse
hashicorp/terraform/states/statefile
The separation adopted here for both config and state is intended to organize around caller cohorts: the types in the "models" package will be used from many different files in Terraform, while the "manager" APIs deal with logic for loading and saving the models and are used in relatively fewer places (mainly just in the backends, and other "early" contexts). The low-level file format handling is in turn used primarily by the managers.
Since this PR includes three separate packages, the following sections will give an overview of the high-level goals and any other notable aspects of each package.
states
: model typesPrior to this change, the model types for states lived inside the
terraform
package. As part of our ongoing effort to split apart that package into logical smaller parts, the new equivalents of those types are now in the top-level package directorystates
, alongsideconfigs
.Some of the state-related types in the
terraform
package had misleading names due to historical terminology drift, and so this new package is also an opportunity to re-align that terminology while introducing into the state a first-class representation of resources themselves, where the old models represented only resource instances. The approximate correspondences between old and new are:terraform
states
terraform.State
states.State
terraform.ModuleState
states.Module
states.Resource
terraform.ResourceState
states.ResourceInstance
terraform.InstanceState
states.ResourceInstanceObject
terraform.OutputState
states.OutputValue
An overview of the general shape of the new
states.State
data structure can be seen in the expected value for one of the tests:terraform/states/state_test.go
Lines 46 to 99 in 6bb08cf
It is inevitable that we will make further changes to these models as we work to integrate them into existing codepaths. This PR is really just a first pass to sketch out the overall structure, with the intent of fleshing out and adjusting the details as we continue down this road. With this in mind, there is relatively low test coverage of the new methods in order to avoid friction during ongoing iteration.
Explicitly Modelling Resources
As noted above, the existing state models lack an object representing the state of an entire resource, instead skipping straight to resource instances. Since resources and resource instances are becoming usable as object values in their own right in the next release, it is useful to track some resource-level metadata in order to explicitly represent resources that have
count = 0
, and to lay the groundwork for later supporting instances with string keys in the plannedfor_each
feature.Alongside the map of instances, resource states track the following metadata:
Instance Generations and Object Statuses
Terraform uses the ideas of "tainted" and "deposed" to manage different sorts of partial operations, but the prior models didn't make these very explicit and we lacked some nouns for talking about these ideas and how they relate to one another.
These new models introduce some new vocabulary in an attempt to model these more explicitly:
An important detail to note here is that there is an additional level of heirarchy that our previous mental models previously often missed: each instance has potentially multiple objects -- zero or one "current", zero or more "deposed" -- and each of those objects has its own status.
The prior models tracked deposed objects by their index within a slice, which meant that they lacked an identity that would be stable across runs (if one of several deposed instances is destroyed, for example). The new model assigns each deposed instance a pseudo-random eight-character hex unique identifier that remains stable as other deposed instances are added and removed, and lays groundwork for being able to consistently correlate deposed instances between the state, plan, and UI output.
Instance Object Attributes
In the old state model, instance object attributes are tracked as a
map[string]string
, with complex data structures flattened into dot-separated string keys usingflatmap
. The flatmap encoding is not rich enough to represent the new type system in full, and so a new representation is required.To ensure robust handling of object attributes set in configuration, we've introduced to Terraform Core the idea of resource type schemas, allowing Terraform to do initial processing and validation of configuration before passing the result onto providers for further processing.
Terraform state has the additional constraint that existing persisted states must be accepted by newer versions of Terraform and newer providers. While Terraform Core can use the resource type schemas alone to decode configuration, the persisted state may contain state data that conforms to the schema from an older provider version, and so it must be "migrated" to conform to the latest version before use. Migration is performed via arbitrary logic in the provider, and so Core alone cannot fully decode object attributes during state loading.
The new state models therefore follow the lead of the config models again by performing partial decoding, retrieving all of the fixed structure of the state but leaving the instance object attributes in a raw form so that upgrading and interpretation can be delayed until a later stage, when provider plugins are available.
Although Terraform Core currently has shims in place to interpret a flatmap encoding into a full
cty.Value
given the expected value type, the expected type for an earlier version is not available to Terraform Core. To deal with this,states.ResourceInstanceObject
can represent object attributes either as a JSON representation of the object value yet to be decoded or as a flatmap. When encountering an object using an outdated resource type schema, Terraform Core must first pass these raw values to the provider for migration before they can be finally resolved into acty.Value
conforming to the current schema.In practice, most of Terraform Core doesn't care about object attributes and so this raw encoding is not a problem. The exception is the code that deals with references like
aws_instance.foo
in the configuration language, which must be able to place the instance object's final value into the expression scope. In practice, this codepath is already dealing with the flatmap encoding today and so we know that it has sufficient information available to interpret the raw attribute data on the condition that all objects are migrated to the latest schema prior to evaluating any expressions.Concurrency
The old state models all had unexported mutex fields inside them with the intent that callers would lock these objects before working with them. In practice we get by with less-granular locking of the entire state structure managed within the graph walker, and so those embedded mutexes are removed here.
The new position on concurrency is hopefully more straightforward: the
states
models are explicitly not safe for concurrent writes or reads concurrent with writes, and it's always the caller's responsibility to handle mutual exclusion where required using some suitable means. In practice, that will be the state mutex already present in our graph walker for the forseeable future.Where I expected the requirements would not be obvious, I made a point of calling out any concurrency-related caveats in the documentation comments for each function.
statefile
: state file formatsThe
statefile
package is responsible for serializing the persistent subset of astates.State
so that it can be stored as an opaque blob in various blob storage systems.This functionality was previously served by directly mapping the in-memory models to JSON using
encoding/json
and struct tags. That approach was convenient, but it forces the structs to be a compromise between what is convenient for efficient in-memory use and what is convenient in a file format, and makes it hard for us to evolve either over time without affecting the other.The file format handling is therefore now a separate concern from the in-memory models, and hand-written code translates between the in-memory format and the serialized format. Internally
encoding/json
is still used, but it is used with unexported types tailored to the need of the file formats.terraform
statefile
s = terraform.ReadState(r)
s = statefile.Read(r)
terraform.WriteState(s, w)
statefile.Write(s, w)
s.MarshalEqual(o)
statefile.StatesMarshalEqual(s, o)
We also need to accept older generations of our state file format for reading. The
statefile
package handles this automatically using a combination of existing code adapted from theterraform
package (for versions up to 3) and newly-written code to migrate version 3 to the new version 4.State format version 4
The new state format is designed to store the subset of state information that needs to persist between runs in a compact way that is will hopefully be convenient for other software to interpret once we are ready to promise compatibility with it. (That compatibility promise will not come with the first release, to allow for any adjustments to the format that might become necessary based on real-world feedback.)
The main changes are:
terraform output
and for theterraform_remote_state
data source which both lack the context necessary to re-evaluate these.count
andfor_each
is added to modules in a later release.deposed
indicating the pseudo-randomly-allocated deposed generation key for non-current objects.Although state version 4 primarily uses real JSON data structures to represent attribute values, it also supports flatmap encoding as a fallback behavior in order to properly model instance objects with outdated schema versions that have not yet been migrated.
The general shape of the new format can be seen in some of the test fixtures in this package, including the following simple one:
terraform/states/statefile/testdata/roundtrip/v4-simple.in.tfstate
Lines 1 to 58 in 6bb08cf
statemgr
: state managersThe state manager concept is left mostly untouched here compared to what was previously in the top-level package
state
(not the newstates
). Some naming is adjusted to hopefully read better in calling code:state
states/statemgr
state.State
statemgr.Full
state.StateReader
statemgr.Reader
state.StateWriter
statemgr.Writer
state.StateRefresher
statemgr.Refresher
state.StatePersister
statemgr.Persister
state.Locker
statemgr.Locker
The contents of each of these interfaces is intentionally left unchanged so that the
state
versions can eventually become type aliases, with the exception ofStateReader
andStateWriter
now dealing with*states.State
values instead of*terraform.State
.Since the idea of a state and the idea of a state file are now separated into two packages, state managers have the new role of translating between those two "worlds". In particular, this completes our work started earlier to move the management of
Lineage
andSerial
to be handled only within state managers, with these now not even being visible to callers. In order to retain the existing testing utilities that verify correct handling ofLineage
andSerial
I introduce a new optional interface to allow the test utility to access these; I don't expect any "real" caller to need them, but they are available if e.g. we want to include these values in logs.In an attempt to clarify the relationships between the
Reader
/Writer
pair and theRefresher
/Persister
pair, I introduced new interfacesstatemgr.Transient
andstatemgr.Persistent
that are present mainly just to hold documentation that attempts to introduce the concepts of transient snapshots and persistent snapshots, including how they relate to each other and to the mutable state objects they are built from. However, this also presented the opportunity to recaststate.InMemState
as an implementation just ofstatemgr.Transient
, making it clear within the type system that it is not actually capable of persistence, withstatemgr.NewFakeFull
returning an explicitly-fake implementation ofstatemgr.Full
that is documented as being only for tests.The type
statemgr.Filesystem
is a light adaptation ofstate.LocalState
to implementstatemgr.Full
instead ofstate.State
. Its functionality is unchanged, which means it still incorrectly implements the interface by handling persistence insideWriter
rather thanPersister
. We can address that in a later project.There are various other state manager implementations in
state
and its subdirectories. These are not yet adapted to the new interfaces and will be dealt with in later changes, since this changeset is large enough as it is.