-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source bundler: Unified source code fetching for Terraform CLI and Terraform Cloud #42
Conversation
The handling of .terraformignore files was previously just embedded directly in the root package of this repository as a bunch of unexported symbols. In future commits we'll introduce another package to this codebase that will also need to deal with .terraformignore files, but in a slightly different context. In preparation, here we factor out that handling into a separate internal package and adapt the root package to call into that package instead of using its own inline implementation. This carefully preserves the existing caller's ability to ignore various errors, while now allowing new callers to handle errors differently.
This package models the source address types we'll support for modules when working with the forthcoming "source bundle" mechanism, which is essentially a "meta-slug" capturing the contents of many different source packages all at once. For the resulting source bundle to be useful we need to be able to talk about the source addresses it was built from, so that callers can request the bundle path equivalent to a particular remote source address path. This address syntax is intentionally a subset of the go-getter-based syntax used in today's Terraform CLI module installer, because we want to have a set of commonly-used address types that are treated equivalently in both implementations. However, we won't be actually using go-getter here because its high level of flexibility (and quirkiness) is a poor fit for safely constructing persistable source code bundles.
We typically decide between using HCL native syntax vs. JSON HCL syntax based on filename suffixes, and so it's helpful to have a robust way to retrieve just the filename portion of an arbitrary source address (as long as it's referring to a file) which takes into account edge-cases such as the query string portion of remote source addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In as far as I can comprehend it, this looks good to me! It's hard to review in isolation, so I'm sure it'll make more sense once I see a consumer of the new packages.
I'm surprised to see relatively little unit testing in the sourceaddrs package, especially the tricky-feeling subpath functions, but the important pieces seem covered by larger integration-type tests. Do you intend to expand the test coverage later?
My only feedback is a typo in one of the commit messages:
LocalSource and RemoteSource implement both interfaces, while
RegistrySource only implements Source and the new RegistrySourceRemote
only implements FinalSource.
I think this should read "…the new RegistrySourceFinal only implements FinalSource".
You are correct about the commit message wording indeed. I'll fix that. Also you are right that I erred more towards integration-style testing rather than unit testing here because we're still "getting the feel for" the design here and so I didn't want to be blocked by constantly rewriting lots of tests. But I do agree that adding more unit tests would be an important prerequisite for this graduating from having an explicit "this is experimental" warning on it, if not before. |
Terraform module source addresses have a historical design oddity where both local and remote source addresses are self-contained but registry source addresses must always be combined with a version constraint and then resolved through an extra step of selecting the latest available version that matches that version constraint. This means that we cannot use the same type to represent both a package to be installed and a package that has already been resolved. To fill that gap here we introduce a variant address type sourceaddrs.FinalSource which represents addresses of packages that have already been installed, rather that addresses of packages that are requested to be installed. LocalSource and RemoteSource implement both interfaces, while RegistrySource only implements Source and the new RegistrySourceFinal only implements FinalSource. This then allows other subsystems that are built in terms of these address types to be explicit about whether they are expecting an address of something to be installed or an address of something that has already been installed, and thus the Go compiler can help ensure that we handle both cases fully rather than forgetting about the need for the extra version selection information for already-installed registry module source addresses.
Since this is entirely new functionality that wasn't previously in this codebase's scope, we'll need to gather some real experience using this new API before we can be confident that it's suitable to commit to. At the time of this commit this entire codebase is pre-1.0 and so subject to breaking changes anyway, but these extra notes are here just in case the root package functionality gets stabilized before these new packages are ready to be stabilized. We'll remove these warnings at a later date once we've got enough experience with these new packages in internal projects to confidently commit to a public-facing API.
Currently this function can only possibly return RemoteSource values anyway and so the distinction between Source and FinalSource is moot, but semantically it's more correct to say that we're returning a finalized source address here and so using the more correct type will make this compose better with other functionality which works generically with final source addresses, and will also allow us to potentially make this return registry addresses sometimes if we decide that gives better results for the assumed use-case of producing diagnostic messages.
120227c
to
386bf9b
Compare
package sourceaddrs | ||
|
||
import ( | ||
"github.com/apparentlymart/go-versions/versions" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Post-hoc comment, but this is the first time I am seeing this - why not hashicorp/go-version? We use that regularly throughout the rest of TFC's codebases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use this library in Terraform because go-version
tends to generate useless error messages and historically it didn't handle prereleases correctly, although I think that latter point may at least have been fixed in the meantime. But consistency with what Terraform itself does is the main thing here, so better to use the library it uses.
So far Terraform Cloud has used an entirely separate mechanism for fetching the source code for a root module than it does for installing any child module packages. This is largely a consequence of how Terraform Cloud started its life as little more than an automation wrapper around Terraform CLI, mimicking how someone would use Terraform CLI locally: a local user typically starts by fetching the source code using something like
git clone
, and then runsterraform init
to deal with everything else.As we work to present a more unified experience between these two entry points, it's been increasingly frustrating to have this slightly different treatment in these two cases, since it tends to lead to confusing edge-cases such as how
.terraformignore
files only get handled for the package containing the root module in Terraform Cloud and not in any other case. It also means that Terraform Cloud needs to work pretty hard to preserve the installed child modules between the plan phase and the apply phase since re-runningterraform init
won't necessarily reproduce identical source code, if e.g. new commits have been pushed to the remote Git branch in the meantime.As a foundation for future Terraform features (not yet planned or announced) that more tightly integrate Terraform CLI and Terraform Cloud, here we implement a carefully-selected subset of Terraform CLI's module installation behavior in a way that can be called from both Terraform CLI and Terraform Cloud, producing an equivalent result in both contexts.
We've added this to
go-slug
because it's already the shared library between Terraform CLI and Terraform Cloud that deals with the problem of packaging up source code. This new functionality introduces the idea of a "source bundle", which is in some sense a "meta-slug": a bunch of source code packages frozen and bundled together into a single directory/archive for later use.The two new exported packages are explicitly marked as experimental in their documentation because we anticipate needing to make at least some small tweaks to the API as we start to make use of this in some early real-world experiments. We have written a small amount of "realistic-ish" code using this API already to help design it, but need to reserve the possibility of making some breaking changes if we learn that there's something missing. This whole codebase is still currently pre-1.0 anyway, but these extra disclaimers of compatibility mean that the main existing scope of this codebase could potentially reach 1.0 before this new functionality does.
Terraform's existing module installer is based on go-getter, which is in retrospect not a great foundation for this sort of problem because it's far too flexible and therefore gives poor user feedback when problems occur and allows many different ways of writing essentially the same source address that make it hard to optimize installation by reusing existing cached packages.
However, we intend to use this only for new features that users would explicitly opt into, and so it's not necessary to fully replicate everything Terraform CLI would allow today: the existing module installer in
terraform init
won't be going anywhere for the foreseeable future, so those who need its more esoteric capabilities can continue using Terraform as they to today.With that in mind, what we have here is essentially a subset of Terraform CLI's module installation capabilities:
For everything that this implementation and Terraform CLI have in common, the goal is to use a compatible source address syntax. That means that if someone writes a module for the subset that we support in source bundler then it should also work with Terraform CLI's traditional module installer, but not necessarily vice-versa.
The two initial capabilities described above happen to align exactly with what Terraform Cloud itself can currently fetch for root modules, so it's a natural place to draw the line to start -- we don't need to teach Terraform Cloud about any new installation sources -- and can be expanded later if needed as long as we're adding source types that both Terraform CLI and Terraform Cloud can reasonably support (where Terraform Cloud's multi-tenant nature imposes some additional constraints compared to CLI).
Part of the reason to place this new use-case in this existing repository is that this adds support for
.terraformignore
for all source packages -- both for root modules and for downstream modules. The definition of the.terraformignore
file format belongs to this codebase and is now shared between the single-directory "slug" implementation and the source bundle implementation.That does mean that sourcebundle-based features will behave marginally differently to Terraform CLI's existing module installer if installing from source packages that have
.terraformignore
files in the root: those files will not be available under sourcebundle-based usage but will be available when installed with Terraform CLI.This is another example of the source bundle behavior being effectively a subset of the Terraform CLI behavior: it is possible to write a source package that will work in both contexts either by not including
.terraformignore
at all or by making sure that the modules inside don't try to access any of the files that are excluded by.terraformignore
. Those who just continue using Terraform CLI as-is will be unaffected even if they do have a.terraformignore
file in their source package.