Proposal: "external" provider #8144
Comments
+1 to this bad boy. Would really help obviate the need for complex workarounds. |
@apparentlymart - looks good, and reminds me a lot of how drone.io implemented their plugin system.
|
@apparentlymart I understand where you're coming from, it sounds like #1156 I raised a while ago which eventually emerged into getting I have mixed feelings when thinking about it ~1year after opening the mentioned issue. Part of me agrees with @phinze
and the other part of me agrees with you - #1156 (comment) To comment on the actual proposal: I would be concerned about breaking cross-platform compatibility + dependency expectations. This is one of Terraform's/Go advantages people don't always realise and they may get bitten by when they try to implement their own Terraform 😃 . e.g. In your example above you just expect the system to provide Python of a particular version, which may grow into expecting some PyPI dependencies too. The same story applies to pretty much any interpreted language (shell included). |
@radeksimko yeah, the dependency on python and whatever else is something that would make this hard to use in my company's production environment, where we run Terraform on deliberately-very-minimal dedicated machines. My assumption here was that people were using (I would hope that in most "production-ish" situations Terraform isn't being run on assorted different machines anyway, and so it shouldn't be incredibly hard to ensure that it has an appropriate environment around it for something like this. That's already true if e.g. you have out-of-tree Terraform plugins that your configuration depends on.) I think the key difference in this proposal over what you proposed in #1156 is that in your proposal you were essentially just providing an alternative syntax for running arbitrary commands, whereas the protocol described here makes the full data source and resource lifecycles available to external glue code, thus hopefully making these external scripts act as more well-behaved elements of Terraform's model... hopefully resulting in less self-foot-shooting. 😀 Continuing the theme of self-foot-shooting: I suppose having seen loads of examples of what people are doing to work around this problem, I'm feeling like Terraform should provide a recommended path that can come with suitable docs that explain the caveats. That way at least people are going into this with eyes wide open, and not instead getting something to work and then discovering only later the implications of what they built. |
FWIW, this would be something that I could use to solve my Chef de-provisioning issues from #4121 (comment) after a stack has been destroyed. |
@apparentlymart I suggest instead of supporting multiple interpreters do the following:
This would allow the use of any program that supports the I/O standard you described in the first post. And formally TF would still be dependency-free :) |
Also maybe it's reasonable to start with data provider only and implement resource in 2nd development iteration. Looks like |
@StyleT the data source is indeed simpler. I'd like to discuss them both together for a little while because the two should ideally end up having a very similar protocol, but you are right that there's no need for them to be implemented at the same time. |
This is a fantastic proposal, and I look forward to its implementation. 👍 There are, of course, alternate ways to accomplish the goal of feeding inputs to Terraform, such as wrapping Terraform with programs that populate I don't share other commenters' concern about whatever other dependencies this proposal might allow others to introduce into their environment. To borrow from the Python example, if someone wants to build a stack that uses Terraform and Python together, that's their own business, and it would be unnecessarily prescriptive of us to prevent them from doing so. (Not that we could, anyway, because Terraform might be wrapped by a Python script.) Forcing dependencies on a user, of course, is another matter entirely; but this proposal doesn't seek to do that. |
I noticed that we have new I have created a shell provider https://github.com/toddnni/terraform-provider-shell that can be used to wrap external programs to terraform. I would be happy to have same functionality covered in By the way, there is even similar |
Thanks for that, @toddnni! The work for 0.8 just added the data source proposed here. I am planning to add support for a resource too, but wanted to see how the data source plays out first in case there are some design improvements we can learn from experience of its use. Next time I'm looking at this I'll have a look at your provider in some more detail. It looks like in your case you went for generic shell-style wrapping and just capturing the stdout verbatim, whereas my proposal here has a more structured protocol that requires a more specialized external program. I think both are reasonable approaches with different tradeoffs... I'm curious to see how people will use the |
@apparentlymart If it helps, here's how I'm using this data source. It isn't perfect (or really even a best practice) but this lets us automate the installation of the pip modules for an aws lambda function and trigger an upload if one was done. pip.tf
|
We're very interested in this feature. We're a Python and JVM shop and really do not have a ton of interest in maintaining Go source for the custom modules we want to build. |
This was actually implemented and released by @apparentlymart :) |
Does
It seems heavy overkill to have to write a wrapper to |
To add to the previous comment, same thing with input. Lots of commands just expect a string on input, not necessarily json-formatted. A contrived but simple example is if I want to take some data in a template_file and convert all
|
And I came across the need for this again. Both of those do a great job providing data to stdout that I could use to drive the |
Are there any plans to implement external resource? |
When I wanted the local hostname, I just used trimspace(file("/etc/hostname")) -- trimspace because the hostname file has an extra line break that I didn't want. I haven't tried it but you could probably use local-exec or https://www.terraform.io/docs/providers/local/r/file.html to write a file to the local disk, and then the file() interpolation to read back the value (might need depends_on blocks to make sure things happen in the right order). |
@mtougeron thanks, I saw that too. I was referring to the 2nd part of the proposal: the |
Yeah I think the desire is to be able to quickly get the output from a shell command and use it. In my case, I just wanted the output from the "hostname" command but was able to read /etc/hostname instead. I didn't want to write a python script or something else just to get the local hostname. |
I wrote a simple python helper for the terraform external provider. It's super simple, but it does let you just focus on the custom logic you want to implement in python. https://gist.github.com/lorengordon/f4ceaa95b9fe669ee533a8aa40b955c1 |
These external data sources get run every time - if you just want to get output once, I made a module for this https://github.com/matti/terraform-shell-resource |
@apparentlymart It looks like the external_resource was only implemented as a data source, as opposed to giving the ability for create/read/update/delete. Any chance this will be extended to include CRUD capabilities? I have exactly this use case - being able to bolt-in lightweight integrations to systems where integrations do not currently exist without having to do so with Go (which I would need to learn). |
Hi @rayterrill! The data source was implemented first in order to get a sense of how well it would work in practice, what sorts of use-cases would be implemented with it, etc. Based on that, we've seen feedback that the protocol between Terraform and the external program is too restrictive (forcing maps of strings only in both directions) and other such ergonomic problems with this approach. Given that the
Since the multi-language provider idea will probably take some time to get to a good, usable state it is possible that we will find an interim solution similar to the |
@apparentlymart I'm not sure I totally understand - Maybe if I provide an example of what I'm trying to accomplish, it'll help. I'd ideally like to be able to plug in PowerShell script(s) into my Terraform configurations to handle things that are not API-enabled - I'm thinking things like AD DNS (really AD in general), custom inventory systems, etc. To do that, I'd really need some mechanism to understand create vs update vs destroy so I would know when I need to create or update something (apply), or remove something (destroy) - say a DNS CNAME in Active Directory. Would the local_exec data source allow that possibility? Just thinking out loud - maybe some way to have external programs conform to a standard - like they would need to handle an "action" parameter or something that would tell them whether terraform was run with apply or destroy, and then some mechanism to handle the consumption of all of the attributes defined in the local_exec or external configuration entry? Maybe this already works and I just don't know how to use it? |
A possible way we could have a medium-term solution here is to have a
# design idea; not currently implemented
resource "local_exec" "example" {
  create_command = ["your-program", "create", "..."]
  update_command = ["your-program", "update", "..."]
  read_command   = ["your-program", "read", "..."]
  delete_command = ["your-program", "delete", "..."]
}
The thing we'd need to figure out here is how best to export the results of such a resource. It could just be that there's a single string attribute exported that contains the raw stdout output of the most recently-run command, and it's up to the user to make sure that all of the commands produce output in a consistent format. It sounds like that would work for your situation, where you'd be writing these programs specifically with Terraform in mind. The main difference here with |
^ that sounds awesome! |
I'm with @bpoland - the design idea for a local_exec resource sounds awesome @apparentlymart. :) |
Thanks for the feedback! We won't be able to act on that immediately since we need to complete the configuration language project first (in particular, so that the mentioned |
+1, I can confirm I would replace my current usage of the |
What's a concrete use case for
Is |
Read would be executed on every apply during the refresh stage, to make sure that the resource still exists and is configured properly. If any out of band changes are made, this allows terraform to detect and correct them. |
I would love to have this ability. Due to the strictness of the |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
We repeatedly see people trying to use external code (usually via `local-exec` provisioners, but sometimes via wrapper scripts) to extend Terraform for their use-cases in a lighter way than writing provider code in Go.
This is a proposal for a more "official" way to do these lightweight integrations, such that they can work within the primary Terraform workflow.
The crux of this proposal is to define a "gateway interface" between Terraform and external programs, so that a user can write a separate program in any language of their choice that implements the gateway protocol, and have Terraform execute it. This approach is inspired by designs such as inetd and CGI which interact with a child process using simple protocol primitives: environment variables, command line arguments, and stdio streams.
`external_data_source` data source
When evaluating this data source, Terraform will create a child process and exec `python ./example.py`, writing a JSON-encoded version of the contents of `query` to the program's stdin.
The program should read the JSON payload from stdin, do any work it wants to do, and then print a single valid JSON object to stdout before exiting with a status of zero to indicate success.
Terraform then parses that JSON payload and exposes it in a map attribute `result`, from which the results can be interpolated elsewhere in the configuration:
It's the responsibility of the developer of the external program to make sure it acts in a side-effect-free fashion, as is expected for data source reads.
`external_resource` resource
When evaluating this resource, there is a separate protocol for each of the resource lifecycle actions. The common aspect of all of these is that Terraform creates a child process and runs `python ./example.py` with a single additional argument containing the name of the action: "create", "read", "update" or "delete".
Just as with the data source above, there is a map attribute `result`. For resources, the `result` attribute corresponds to `Computed` attributes in first-class resources, while `arguments` corresponds to `Optional` (though the external program may have some additional validation rules it enforces once it's run).
The protocol for each of these actions is described in the following sections.
`read`
For read, the given program is run with `read` as an argument, and Terraform writes to its stdin a JSON payload like the following, using `arguments` and `id` from the existing state:
The program must print a valid JSON mapping to its stdout, with top-level keys `id`, `arguments` and `result`, as follows:
The program must exit with a status of zero to signal success in order for Terraform to accept the result.
Terraform updates the `arguments` and `result` mappings in the state to match what's returned by the program here.
It's the responsibility of the developer of the external program to make sure that `read` acts in a side-effect-free fashion.
`create` and `update`
Create and update follow a very similar protocol: the given program is run with either `create` or `update` as a command line argument, and Terraform writes to its stdin a JSON payload like the following:
In both cases the "arguments" come from the configuration. In the "update" case the "id" and "old_arguments" come from the state, while in the "create" case they are omitted altogether.
After doing whatever side-effects are required to effect the requested change, the program must respond in the same way as for the `read` action, and Terraform will make the same changes to the state.
`delete`
Delete is the same as `update` except that the command line argument is `delete` and the program is not expected to produce any output on stdout.
If the exit status is zero then Terraform will remove the resource from the state.
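To make the four actions concrete, here is a minimal sketch of what a resource gateway program might look like in Python. The payload keys (`id`, `arguments`, `old_arguments`, `result`) are the ones described above; everything else is an assumption made for illustration: the JSON file used as a stand-in backend, the names `STORE`, `handle` and `main`, and the `argument_count` result key are all hypothetical.

```python
import json
import os
import sys
import tempfile
import uuid

# Hypothetical backing store: a JSON file standing in for the real system.
STORE = os.path.join(tempfile.gettempdir(), "external_resource_sketch.json")


def load_store():
    try:
        with open(STORE) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_store(objects):
    with open(STORE, "w") as f:
        json.dump(objects, f)


def handle(action, payload):
    """Apply one lifecycle action; return the response mapping, or None for delete."""
    objects = load_store()
    if action == "create":
        obj_id = uuid.uuid4().hex  # "create" payloads carry no id
        objects[obj_id] = payload["arguments"]
    elif action == "update":
        obj_id = payload["id"]
        objects[obj_id] = payload["arguments"]
    elif action == "read":
        obj_id = payload["id"]  # the lookup below raises KeyError if it is gone
    elif action == "delete":
        objects.pop(payload["id"], None)
        save_store(objects)
        return None  # delete prints nothing on stdout
    else:
        raise ValueError("unknown action: " + action)
    arguments = objects[obj_id]
    if action != "read":
        save_store(objects)  # read never writes, so it stays side-effect-free
    # "result" plays the role of computed attributes (hypothetical content).
    result = {"argument_count": str(len(arguments))}
    return {"id": obj_id, "arguments": arguments, "result": result}


def main(argv, stdin, stdout, stderr):
    """Entry point: action name in argv[1], JSON request on stdin."""
    try:
        response = handle(argv[1], json.load(stdin))
    except Exception as err:
        stderr.write("%s\n" % err)  # reported to the caller as the error message
        return 1
    if response is not None:
        json.dump(response, stdout)
    return 0

# A real script would finish with:
#   sys.exit(main(sys.argv, sys.stdin, sys.stdout, sys.stderr))
```

Keeping the action dispatch in `handle` and the stream handling in `main` means the lifecycle logic can be exercised directly in tests without spawning a process.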
Child Process Environment
For both the data source and the resource, the child process inherits the environment variables from the Terraform process, possibly giving it access to e.g. AWS security credentials.
The current working directory is undefined, and so gateway programs shouldn't depend on it. As with most cases in Terraform, it'd be better to pass the result of the `file(...)` function into the program as an argument rather than have it read files from disk itself, though the program may also choose to build paths relative to its own directory in order to load resources, etc.
Error Handling
If the target program exits with a non-zero status, Terraform collects anything written to stderr and uses it as an error message for failing the operation in question.
For operations where a valid JSON object is expected on stdout, any parse error is also surfaced as an error.
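The caller's side of these rules can be sketched in a few lines of Python, which may be handy for exercising a gateway program locally. This is only an illustration of the behaviour described above, not Terraform's implementation; `call_gateway` and `GatewayError` are hypothetical names.

```python
import json
import subprocess
import sys


class GatewayError(Exception):
    """Raised when the external program fails or prints invalid JSON."""


def call_gateway(command, payload):
    """Run `command`, send `payload` as JSON on stdin, return the parsed stdout object."""
    proc = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        # Non-zero exit: whatever was written to stderr becomes the error message.
        message = proc.stderr.strip() or "exit status %d" % proc.returncode
        raise GatewayError(message)
    try:
        result = json.loads(proc.stdout)
    except ValueError as err:
        # A parse error on stdout is surfaced as an error too.
        raise GatewayError("invalid JSON on stdout: %s" % err)
    if not isinstance(result, dict):
        raise GatewayError("expected a JSON object on stdout")
    return result
```

For example, `call_gateway([sys.executable, "./example.py"], {"name": "web"})` would run the program the same way the `python ./example.py` invocation above does, with the query map on stdin.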
Example Python Data Source
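A minimal sketch of such a program follows. It reads the query map from stdin, builds a result map, and prints it as a single JSON object. The `lookup` logic (upper-casing each query value) and the function names are hypothetical placeholders for whatever work a real data source would do.

```python
import json
import sys


def lookup(query):
    """Build the result map from the query map (hypothetical logic)."""
    # Placeholder work: echo back an upper-cased copy of each query value.
    return {key: value.upper() for key, value in query.items()}


def main(stdin, stdout, stderr):
    """Read the JSON query from stdin and write one JSON object to stdout."""
    try:
        query = json.load(stdin)
        result = lookup(query)
    except Exception as err:
        # Anything on stderr plus a non-zero status signals failure to the caller.
        stderr.write("example data source failed: %s\n" % err)
        return 1
    json.dump(result, stdout)
    return 0

# A real script would finish with:
#   sys.exit(main(sys.stdin, sys.stdout, sys.stderr))
```

Because the program only reads its input and prints a result, it stays side-effect-free, as data source reads are expected to be.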
General Thoughts/Observations
Intended Uses
There are two high-level use-cases for this sort of feature, based on examples I've seen people share elsewhere in the community:
For the former case, it would be good if people would still share their use-cases as feature requests in this issue tracker so that we can eventually implement all of the capabilities of the services we support. In this case I hope the user would aspire to eliminate the use of this provider eventually.
The latter case seems like the main legitimate reason to use the "external" provider as a "final solution", particularly if an organization already has tooling in place written in another language and doesn't have the desire or resources to port it to Go.
Effect on the Terraform Ecosystem?
Were this provider to be implemented, it could be used as a crutch for integrating with systems where a Terraform provider is not yet available. Pessimistically, this could cause a lower incentive to implement "first-class" Terraform providers, hurting the Terraform ecosystem in the long run.
However, my expectation is that this interface is clumsy and inconvenient enough that it will be tolerated for short-term solutions to small problems but that there will still be a drive to implement first-class provider support for complex and common services.
Higher-level Abstractions
Just as happened with CGI, it's possible that motivated authors may create higher-level abstractions around this low-level gateway protocol in their favorite implementation language. For example, a small Python library could allow a resource to be implemented as a Python class, automatically dealing with the reading/writing and JSON serialization behind the scenes.
With that said, this protocol is intended to be simple enough to implement without much overhead in most languages, so I expect most people wouldn't bother with this sort of thing and would just code directly to the protocol.
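As a rough idea of what such a library could look like, here is a sketch of a base class that hides the argument parsing and JSON handling, so that a resource only implements four methods. The class names, method signatures, and the `size` result key are all hypothetical; the payload keys follow the protocol described earlier.

```python
import json


class ExternalResource:
    """Hypothetical base class: subclasses override the four lifecycle methods.

    create/read/update return a tuple (id, arguments, result).
    """

    def create(self, arguments):
        raise NotImplementedError

    def read(self, resource_id, arguments):
        raise NotImplementedError

    def update(self, resource_id, arguments, old_arguments):
        raise NotImplementedError

    def delete(self, resource_id, arguments):
        raise NotImplementedError

    def run(self, argv, stdin, stdout):
        """Dispatch on the action in argv[1], handling JSON on both sides."""
        action = argv[1]
        request = json.load(stdin)
        arguments = request.get("arguments", {})
        if action == "create":
            response = self.create(arguments)
        elif action == "read":
            response = self.read(request["id"], arguments)
        elif action == "update":
            old = request.get("old_arguments", {})
            response = self.update(request["id"], arguments, old)
        elif action == "delete":
            self.delete(request["id"], arguments)
            return 0  # delete produces no output
        else:
            raise ValueError("unknown action: " + action)
        resource_id, new_arguments, result = response
        json.dump({"id": resource_id, "arguments": new_arguments, "result": result}, stdout)
        return 0


class DictBacked(ExternalResource):
    """Toy subclass keeping objects in a dict; a real one would call an API."""

    def __init__(self):
        self.objects = {}

    def create(self, arguments):
        resource_id = str(len(self.objects) + 1)
        self.objects[resource_id] = arguments
        return resource_id, arguments, {"size": str(len(arguments))}

    def read(self, resource_id, arguments):
        stored = self.objects[resource_id]
        return resource_id, stored, {"size": str(len(stored))}

    def update(self, resource_id, arguments, old_arguments):
        self.objects[resource_id] = arguments
        return resource_id, arguments, {"size": str(len(arguments))}

    def delete(self, resource_id, arguments):
        self.objects.pop(resource_id, None)
```

A script built on this would call something like `DictBacked().run(sys.argv, sys.stdin, sys.stdout)`, although the toy dict would need replacing with a persistent backend, since each action runs in a fresh process.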
One area that could be interesting is something that can take the `arguments` and `old_arguments` properties of the `update` request payload and make an interface like Terraform's `schema.ResourceData`, though I'd hope that people would consider writing a real Terraform provider if they find themselves doing something that complicated.
Relevant other issues:
template_file