"external" data source, for integrating with external programs #8768

apparentlymart · 2016-09-10T23:37:37Z

This is an implementation of the data source part of what was proposed in #8144, with some modifications.

After playing with the design some more I made the following adjustments compared to what I originally proposed:

The data source and resource will be called just external, since data "external_data_source" (etc) felt redundant. Now you can simply say data "external" "foo", though this required a small tweak to core to allow a provider to export a resource whose name exactly matches the provider name.
Rather than having separate program and interpreter arguments, it instead expects program to be a list whose first element is the program (which might be an interpreter) and whose subsequent elements are arguments. An array is used so that expressions can be safely interpolated into arguments without needing to worry about the complicated business of escaping spaces and shell metacharacters.
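To illustrate why a list is safer than a single command string, here is a minimal Python sketch (not part of this PR; the helper name `run_program` is made up for the example). It passes a value full of spaces and shell metacharacters as one list element and shows that the child receives it as exactly one argument, with no escaping needed:

```python
import subprocess
import sys


def run_program(program, stdin_text=""):
    """Run `program` (a list: executable first, then arguments) without a shell."""
    completed = subprocess.run(
        program, input=stdin_text, capture_output=True, text=True, check=True
    )
    return completed.stdout


# A value that would be dangerous if pasted into a shell command line.
tricky = "two words; echo injected $(whoami)"

# Each list element reaches the child as exactly one argument, unmodified.
# The child here is just a Python one-liner that prints the arguments it got.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])", tricky]
output = run_program(child)
print(output)
```

Because no shell ever parses the arguments, the semicolon and `$(...)` are treated as plain text rather than as commands.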

Implementation Progress:

Since the resource is more complex, I'd like to land the data source first.

apparentlymart · 2016-10-30T04:27:58Z

Sorry for leaving this sitting here unfinished for so long.

@mitchellh I'm curious as to what you think of this idea philosophically. @radeksimko made some good points over in my longer proposal issue #8144 and I attempted to address them over here. While there is some risk here that this will be used as a crutch, I think it's more likely to just give people a more robust alternative to existing weird patterns using the remote_exec provisioner, as was discussed over in #8406.

Out of curiosity I reimplemented the example from #8406 as an external data source glue program, written as a shell script using jq:

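#!/bin/sh
# Exit on any error and on use of an unset variable.
set -ue
# Read the JSON query from stdin and assign .cluster_size to a shell variable, safely quoted.
eval "$(jq -r '@sh "CLUSTER_SIZE=\(.cluster_size)"')"
# Ask the etcd discovery service for a new cluster URL of the requested size.
DISCOVERY_URL="$(curl -s "https://discovery.etcd.io/new?size=$CLUSTER_SIZE")"
# Print the result as a JSON object on stdout.
jq -n --arg disco_url "$DISCOVERY_URL" '{"discovery_url":$disco_url}'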

Associated Terraform config:

variable "etcd_cluster_size" {
  default = 3
}

data "external" "test" {
  program = ["${path.module}/etcd_discovery.sh"]

  query {
    cluster_size = "${var.etcd_cluster_size}"
  }
}

output "discovery_url" {
  value = "${data.external.test.result["discovery_url"]}"
}

Running it:

$ terraform apply
data.external.test: Refreshing state...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

discovery_url = https://discovery.etcd.io/c88691558304d9f9171ca0c98e238637

(In practice this one might make more sense as a resource "external" block once that's implemented, since I think you'd want to remember the generated URL after the first run rather than hitting etcd's API on every Terraform run.)
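For anyone who would rather not depend on jq, the same glue-program shape can be sketched in Python. This is only an illustration of the pattern the shell script follows (JSON object in on stdin, JSON object out on stdout), not code from this PR. The names `handle_query` and `main` and the `is_single_node` key are invented for the example, and it deliberately makes no network call. Keeping every result value a string, and signalling failure with a message on stderr plus a non-zero exit status, are assumptions about what the data source expects:

```python
import io
import json


def handle_query(query):
    """Build the result map for one query (placeholder logic, no network call)."""
    size = int(query["cluster_size"])
    if size < 1:
        raise ValueError("cluster_size must be at least 1")
    # Hypothetical result; the shell script above asks discovery.etcd.io instead.
    return {"cluster_size": str(size), "is_single_node": str(size == 1).lower()}


def main(stdin, stdout, stderr):
    """Read a JSON query from stdin, write a JSON result to stdout, return exit status."""
    try:
        query = json.load(stdin)
        result = handle_query(query)
    except (KeyError, TypeError, ValueError) as err:
        # Assumed convention: explain the problem on stderr and exit non-zero.
        print(f"bad query: {err}", file=stderr)
        return 1
    json.dump(result, stdout)
    return 0


# Demonstration with in-memory streams instead of the real stdin/stdout.
out = io.StringIO()
status = main(io.StringIO('{"cluster_size": "3"}'), out, io.StringIO())
print(status, out.getvalue())
```

A real script would finish with `sys.exit(main(sys.stdin, sys.stdout, sys.stderr))` and be referenced from the `program` list in the configuration.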

mitchellh · 2016-11-01T23:22:50Z

@apparentlymart My personal opinion is that this is both worth having and extremely dangerous. I'm all about having "escape hatches" so you're not 100% forced to work within the confines of the tool (I see JSON configs as one of those for example).

I think we just need to do our part to very actively put a yellow-border warning on the docs for this provider/resource that explains the cost of using this: lack of portability, may not work with Terraform Enterprise without additional work, etc. It puts a lot of burden on the user but at the same time could let them do things w/o being beholden to the core team here on Terraform.

Conclusion: I'm 👍 with heavy docs/warnings.

apparentlymart · 2016-11-02T05:29:24Z

Here's what I have in a yellow box in the docs right now:

Warning This mechanism is provided as an "escape hatch" for exceptional situations where a first-class Terraform provider is not more appropriate. Its capabilities are limited in comparison to a true data source, and implementing a data source via an external program is likely to hurt the portability of your Terraform configuration by creating dependencies on external programs and libraries that may not be available (or may need to be used differently) on different operating systems.

I didn't mention Terraform Enterprise here, but that's definitely worth noting. Are there any guarantees for things other than Terraform being available in the environment where Terraform Enterprise runs these things, or is it the case that this is basically useless in Enterprise unless you want to ship a statically-linked language runtime up along with the configs?

apparentlymart · 2016-11-29T20:49:13Z

@stack72 I think this is "done" as far as I'm concerned... what do you think? 😀

stack72 · 2016-11-30T12:09:05Z

Hey @apparentlymart

I like this and agree with both yours and @mitchellh's points above. The code itself is sound and the tests pass:

% make testacc TEST=./builtin/providers/external
==> Checking that code complies with gofmt requirements...
go generate $(go list ./... | grep -v /terraform/vendor/)
2016/11/30 14:07:51 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/external -v  -timeout 120m
=== RUN   TestDataSource_basic
--- PASS: TestDataSource_basic (0.29s)
=== RUN   TestDataSource_error
--- PASS: TestDataSource_error (0.07s)
=== RUN   TestProvider
--- PASS: TestProvider (0.00s)
PASS
ok  	github.com/hashicorp/terraform/builtin/providers/external	0.377s

So I am going to leave the final yes / no to the man above //cc @mitchellh

P.

apparentlymart · 2016-11-30T19:36:43Z

When I was skimming this again I missed my own "note-to-self" about adding the statement about Terraform Enterprise to the docs. I'll work on that now, just to get this finished up.

This small function determines the dependable name of a provider for a given resource name and optional provider alias. It's simple but it's a key part of how resource nodes get connected to provider nodes so worth specifying the intended behavior in the form of a test.

If a provider only implements one resource of each type (managed vs. data) then it can be reasonable for the resource names to exactly match the provider name, if the provider name is descriptive enough for the purpose of each resource to be obvious.
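The behavior these two commit messages describe can be sketched as follows. This is a Python approximation written for illustration, not the Go function from the PR. It assumes that an explicit alias (for example `aws.west`) is used as-is, that the provider name is otherwise the part of the resource type before the first underscore, and that a type with no underscore is its own provider name:

```python
def resource_provider(resource_type, alias=""):
    """Return the provider name that a resource of this type depends on."""
    if alias:
        # Assumption: an explicit alias already names the provider to use.
        return alias
    # partition() returns the whole string as the prefix when there is no
    # underscore, so "external" maps to the provider "external".
    prefix, _, _ = resource_type.partition("_")
    return prefix


print(resource_provider("aws_instance"))              # aws
print(resource_provider("external"))                  # external
print(resource_provider("aws_instance", "aws.west"))  # aws.west
```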

This provider will become a bit of glue to help people interface external programs with Terraform without writing a full Terraform provider. It will be nowhere near as capable as a first-class provider, but is intended as a light-touch way to integrate some pre-existing or custom system into Terraform.

A data source that executes a child process, expecting it to support a particular gateway protocol, and exports its result. This can be used as a straightforward way to retrieve data from sources that Terraform doesn't natively support.
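From the calling side, that exchange might look roughly like the Python sketch below. It is not the provider's Go implementation. The function name `read_external`, the stand-in child program, and the `seen` key are invented for the example, and the checks (non-zero exit status means failure, result must be a JSON object of strings) are assumptions about the protocol rather than a statement of it:

```python
import json
import subprocess
import sys


def read_external(program, query):
    """Send `query` as JSON on the child's stdin and parse a JSON object from its stdout."""
    completed = subprocess.run(
        program, input=json.dumps(query), capture_output=True, text=True
    )
    if completed.returncode != 0:
        # Assumed convention: the child explains the failure on stderr.
        raise RuntimeError(f"program failed: {completed.stderr.strip()}")
    result = json.loads(completed.stdout)
    if not isinstance(result, dict) or not all(
        isinstance(value, str) for value in result.values()
    ):
        raise ValueError("program must print a JSON object with string values")
    return result


# Stand-in child program: echoes the query back with one extra key.
CHILD = (
    "import json, sys; q = json.load(sys.stdin); "
    "q['seen'] = 'yes'; json.dump(q, sys.stdout)"
)
result = read_external([sys.executable, "-c", CHILD], {"cluster_size": "3"})
print(result)
```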

apparentlymart · 2016-11-30T19:47:02Z

I've now rebased and added some additional yellow warning boxes to the docs to talk about Terraform Enterprise specifically, though of course I would encourage reading what I wrote there since I don't really know what guarantees (if any) the Terraform Enterprise environment provides and I may be inadvertently using terminology inconsistent with the language within that product.

stack72 · 2016-12-05T17:24:26Z

@apparentlymart we just spoke about this and it LGTG now :)

Thanks!

Paul

…corp#8768)

* "external" provider for gluing in external logic

This provider will become a bit of glue to help people interface external programs with Terraform without writing a full Terraform provider. It will be nowhere near as capable as a first-class provider, but is intended as a light-touch way to integrate some pre-existing or custom system into Terraform.

* Unit test for the "resourceProvider" utility function

This small function determines the dependable name of a provider for a given resource name and optional provider alias. It's simple but it's a key part of how resource nodes get connected to provider nodes so worth specifying the intended behavior in the form of a test.

* Allow a provider to export a resource with the provider's name

If a provider only implements one resource of each type (managed vs. data) then it can be reasonable for the resource names to exactly match the provider name, if the provider name is descriptive enough for the purpose of each resource to be obvious.

* provider/external: data source

A data source that executes a child process, expecting it to support a particular gateway protocol, and exports its result. This can be used as a straightforward way to retrieve data from sources that Terraform doesn't natively support.

* website: documentation for the "external" provider

ghost · 2020-04-19T02:00:56Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

apparentlymart added the new-provider label Sep 10, 2016

apparentlymart force-pushed the external-provider branch from f5d9e7f to 93c47c5 Compare September 11, 2016 00:53

apparentlymart changed the title ~~[WIP] "external" provider, for integrating with external programs~~ [WIP] "external" data source, for integrating with external programs Sep 11, 2016

apparentlymart force-pushed the external-provider branch 3 times, most recently from d74c92e to 7104728 Compare October 30, 2016 03:46

apparentlymart changed the title ~~[WIP] "external" data source, for integrating with external programs~~ "external" data source, for integrating with external programs Oct 30, 2016

This was referenced Nov 8, 2016

file function does not work for future generated files #9955

Closed

Delayed execution of reading for file variables #3354

Closed

apparentlymart added 5 commits November 30, 2016 11:37

provider/external: data source

d2fb4d2

website: documentation for the "external" provider

7cbef09

apparentlymart force-pushed the external-provider branch from 7104728 to 7cbef09 Compare November 30, 2016 19:44

stack72 merged commit e772b45 into master Dec 5, 2016

stack72 deleted the external-provider branch December 5, 2016 17:24

lopopolo mentioned this pull request Dec 16, 2016

External provider stores full path to command in state file causing thrash when executing on different machines #10777

Closed

ghost locked and limited conversation to collaborators Apr 19, 2020
