Top-level Configuration Abstractions #23378

Closed
apparentlymart opened this issue Nov 14, 2019 · 3 comments

@apparentlymart
Contributor

I'm making this new issue to capture a use-case originally shared by @jakebiesinger-onduo in a comment on a PR, so that we can more easily discuss the use-case separately from the specific PR. I have added some general commentary from my own perspective here, so please refer back to the original comment to see the use-case as it was originally presented.

Current Terraform Version

Terraform v0.12.14

Use-cases

Terraform is commonly used by people who are familiar with Terraform concepts and are comfortable writing Terraform configurations, but there are other situations where Terraform is more of an implementation detail, where the intended interface to users is a higher-level configuration file or other interface that raises the level of abstraction in some way.

For example, @jakebiesinger-onduo shared this snippet of a higher-level configuration object for configuring SFTP-based data synchronization:

locals {
  config = {
    partner1 = {
      sftp-server = { hostname = ..., user = ... }
      pull-config = {
        schedule   = { cron = "0 6 * * * *" }
        patterns   = [{ src_dir = "incoming", regex = ".*\\.csv$" }]
        encryption = { pgp-private-key = "path-to-some-key" }
      }
      push-config = {
        trigger = {
          use_gcs_trigger = true
          bucket          = google_storage_bucket.outgoing["partner1"].name
        }
        patterns   = [{ dest_dir = "outgoing", regex = ".*" }]
        encryption = { pgp-public-key = "some-key" }
      }
    }
    partner2 = {
      partner-gcs-bucket = { bucket = "some-partner-bucket", prefix = "path/to/stuff" }
      pull-config = {
        trigger  = { pubsub = google_pubsub_topic.incoming["partner2"].id }
        patterns = [{ src_dir = "incoming", regex = ".*\\.csv$" }]
        # no encryption for this partner.
      }
      push-config = {
        trigger  = { use_gcs_trigger = false, bucket = google_storage_bucket.outgoing["partner2"].name }
        patterns = [{ dest_dir = "outgoing", regex = ".*" }]
        # no encryption for this partner.
      }
    }
    partner3 = {
      partner-gcs-bucket = { ... }
      pull-config        = { ... }
      # no push config for this partner-- we only receive data.
    }
  }
}

This configuration format raises the level of abstraction by introducing concepts like partners, triggers, sources, and destinations. The underlying Terraform concepts of modules and resources are encapsulated, allowing the author of such a configuration to think primarily about the higher-level problem being solved and less about the physical infrastructure or Terraform concepts used to implement it.

While Terraform does allow defining a higher-level data structure like the above, the module author must then write expressions to work with that data structure, which runs into challenges of normalization, validation, and conditional action based on how the structure is shaped.

Attempted Solutions

Consider for example the problem of identifying which subset of the partners has opted in to GCS-based triggering by setting use_gcs_trigger to true. We might want to permit the absence of GCS triggering to be expressed either by setting use_gcs_trigger to false, by omitting the trigger attribute in the parent object entirely, or by omitting the push-config attribute from the partner object entirely.

Giving configuration authors that flexibility in the Terraform language today requires some quite awkward expression constructions:

lookup(lookup(lookup(features, "push-config", {}), "trigger", {}), "use_gcs_trigger", false)
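
To show where such an expression would sit, here is a rough, untested sketch of a for expression that selects the opted-in partners. It assumes the structure above is available as local.config; the name gcs_triggered_partners is hypothetical and used only for illustration.

```hcl
locals {
  # Keep only the partners whose push-config.trigger.use_gcs_trigger is true.
  # Each nested lookup() supplies a default so that a missing "push-config",
  # a missing "trigger", or a missing "use_gcs_trigger" all count as false.
  gcs_triggered_partners = {
    for name, features in local.config : name => features
    if lookup(lookup(lookup(features, "push-config", {}), "trigger", {}), "use_gcs_trigger", false)
  }
}
```

Every optional level of the structure adds another lookup call with its own default, which is why this gets harder to read as the structure grows.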

Our current recommendation is to use module composition so that the module blocks themselves represent the configuration concepts, which means expressing the configuration using some Terraform language concepts (modules) while still getting some abstraction. For example, we might imagine each partner in the above example having its own module that contains a call to one of many trigger implementations, one of many "source" implementations, and one of many "destination" implementations, all connected together using normal module composition techniques:

# (this is just a general illustration of module composition using the nouns
# from the above example; it's likely not reflective of the real architecture
# that would be required for that in practice.)

module "source" {
  # one of several implementations of the "source" concept
  source = "./modules/sftp-source"

  sftp_server = {
    host = "..."
    user = "..."
    # ...
  }

}

module "destination" {
  # one of several implementations of the "destination" concept
  source = "./modules/gcs-bucket-destination"

  bucket = {
    name   = "some_partner_bucket"
    prefix = "path/to/stuff"
  }
}

module "trigger" {
  # one of several implementations of the "trigger" concept
  source = "./modules/gcs-trigger"

  bucket = {
    bucket = google_storage_bucket.outgoing.name
    prefix = ""
  }
  action = module.destination.action
}

While we often can model problems this way, it requires the problem to be described within structures imposed by Terraform itself, potentially forcing a less intuitive mental model on those writing the configuration. Most significantly for the use-case we're discussing here, it also forces those working with the system to be aware that Terraform is being used at all.

Proposals

So far we don't have a single proposal to address this, but there's a collection of existing discussions that might interact with this use-case:

  • for_each for modules: could make it more reasonable to wrap module composition in an additional level of abstraction to translate between a higher-level structure and a set of module calls.
  • jsonpath function proposal (though it would likely be an HCL-traversal-syntax based equivalent if we were to move forward with it, to avoid introducing another complex syntax into the Terraform language): can potentially make it easier to work with data structures whose shape is not predictable, as a shorthand for nested lookup calls with defaults.
  • Custom validation logic: complex data structures cannot generally be described entirely with a simple type system. Having support for writing custom validation logic could, if designed well, help make sure that these little languages implemented inside Terraform can still give their users good feedback when the input isn't correct.

If we think about a goal of hiding Terraform entirely from the person who is writing these configurations (which is a more extreme version of this use-case, not necessarily a requirement) then there are some potential solutions in just-in-time Terraform code generation using external software, rather than expressing the entire solution inside Terraform itself. In return for the additional complexity of introducing some more software, the door is opened to possibly select a more appropriate source format for the configuration, such as a custom HCL-based format, or some other domain-specific format that makes sense for the target application.

Hopefully we can collect some other similar examples in this issue to see what common elements help to solve them.

@jakebiesinger-onduo

jakebiesinger-onduo commented Nov 14, 2019

Thanks, @apparentlymart, I think you've captured my use case nicely here and the referenced discussions are perfectly topical.

One note I'll add:

Our configs are written by humans and we really like using HCL for this task! We prefer doing so over other alternative languages.

  • JSON is ugly and a pain for humans to write and get all the silly quoting and , rules correct
  • YAML has awkward indentation requirements and some odd power features that can confuse folks (&, *, and << all come to mind).

It's easy to get these languages wrong. HCL on the other hand has reasonable declaration and reference mechanisms (define a locals block and refer back to common values via local.foo), clean map definitions, and the ability, when required, to pull in real terraform objects (this breaks with the notion of "config abstracts terraform away entirely" but is still useful in some cases... we don't take a hard line on separating them, but often end up w/configs that have little-to-no terraform)
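
As a small illustration of the declare-and-reference pattern described above, here is a sketch in which a shared value is defined once and reused. The names default_patterns and config, and the partner entries, are made up for this example.

```hcl
locals {
  # Declared once, then referenced by name wherever it is needed.
  default_patterns = [{ src_dir = "incoming", regex = ".*\\.csv$" }]

  config = {
    partner1 = { pull-config = { patterns = local.default_patterns } }
    partner2 = { pull-config = { patterns = local.default_patterns } }
  }
}
```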

@apparentlymart
Contributor Author

Hi all!

Revisiting this some time later, I'm noticing that all of the different ideas listed under "Proposals" have since been met via some new language feature, even if the final design didn't exactly match what was proposed when I opened this:

  • count and for_each work for module blocks since v0.13.0.
  • The relevant parts of the jsonpath proposal -- the ability to just try traversing through a data structure in multiple steps and fall back to a replacement value without having to separately lookup each one -- were met with the addition of the try function.
  • Custom variable validation (validation blocks in variable blocks) came in v0.13.0, and v1.2.0 recently extended that idea to resources and output values via preconditions and postconditions, all of which are described under Custom Condition Checks.
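
As a rough, untested sketch of how these three features might combine for the partner example from the original write-up: the variable name partners, the module call gcs_trigger, and the module's inputs are all assumptions for illustration, not an existing interface.

```hcl
# Hypothetical names: var.partners, the gcs_trigger module call, and its
# inputs are illustrative assumptions, not an existing module's interface.

variable "partners" {
  # "any" leaves the higher-level structure free-form; the validation block
  # adds a check that a type constraint alone cannot express.
  type = any

  validation {
    # Passes only when no partner lacks both pull-config and push-config.
    condition = length([
      for name, p in var.partners : name
      if !(can(p["pull-config"]) || can(p["push-config"]))
    ]) == 0
    error_message = "Each partner must set pull-config, push-config, or both."
  }
}

locals {
  # try() stands in for the nested lookup() chain: it returns the first
  # argument that evaluates without an error, so any missing level of the
  # structure falls through to false.
  gcs_triggered_partners = {
    for name, p in var.partners : name => p
    if try(p["push-config"].trigger.use_gcs_trigger, false)
  }
}

module "gcs_trigger" {
  # One module instance per opted-in partner, keyed by partner name.
  source   = "./modules/gcs-trigger"
  for_each = local.gcs_triggered_partners

  bucket = {
    bucket = each.value["push-config"].trigger.bucket
    prefix = ""
  }
}
```

The higher-level structure stays a plain value, with validation reporting shape problems to its author and for_each translating it into module calls.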

Since this issue was representing a broad problem space, it's not really possible to objectively decide if it's "done", but it does seem like there are now language features addressing each of the points raised in the write-up and so I'm going to close this now. If anyone has any specific feedback about those features or about other missing pieces then I'd encourage opening a new issue where we can focus on that specific feedback. Thanks!

@apparentlymart closed this as not planned on Aug 29, 2022
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

The github-actions bot locked this issue as resolved and limited conversation to collaborators on Sep 29, 2022