With symlinked home directory, init fails: providers could not be installed: lstat var: no such file or directory #25367

Open
mildred opened this issue Jun 24, 2020 · 17 comments
Labels
upstream v0.13 Issues (primarily bugs) reported against v0.13 releases

Comments

@mildred
Contributor

mildred commented Jun 24, 2020

After following the guide at https://github.com/hashicorp/terraform/blob/guide-v0.13-beta/draft-upgrade-guide.md#in-house-providers, I could not get Terraform to recognize in-house providers.

I created terraform-provider-sys, a fork of terraform-provider-null with added features for local provisioning, but I could not make it work with Terraform 0.13-beta2. With Terraform 0.12, I installed the provider manually under ~/.terraform.d/plugins/terraform-provider-sys and could then use it.

With terraform 0.13 I modified my module accordingly:

terraform {
  required_providers {
    sys = {
      source = "terraform.localhost/local/sys"
    }
  }
  required_version = ">= 0.13"
}

I chose the terraform.localhost domain because I obviously control it, and the doc explicitly says that I do not need to run a registry at that address:

If you wish, you can later run your own Terraform provider registry at the specified hostname as an alternative to local installation, without any further modifications to the above configuration. However, we recommend tackling that only after your initial upgrade using the new local filesystem layout.

I put my provider at: ~/.terraform.d/plugins/terraform.localhost/local/sys/terraform-provider-sys but I get the following error:

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Using previously-installed hashicorp/random v2.2.1
- Finding latest version of terraform.localhost/local/sys...

Error: Failed to query available provider packages

Could not retrieve the list of available versions for provider
terraform.localhost/local/sys: could not connect to terraform.localhost:
Failed to request discovery document: Get
"https://terraform.localhost/.well-known/terraform.json": dial tcp: lookup
terraform.localhost on 192.168.5.1:53: no such host

I also tried with ~/.terraform.d/plugins/terraform.localhost/local/sys/v1.0.0/linux_amd64/terraform-provider-sys_v1.0.0 with no better results.

This should be made to work, or, if it already works, the documentation should be clearer and include examples. The terraform providers mirror command produces an example layout, but that layout contains various JSON files. Are we expected to create those JSON files ourselves?

@mildred
Contributor Author

mildred commented Jun 24, 2020

Actually, I got a step further by looking at the TF_LOG=TRACE output. The guide is wrong: the version directory must not carry a v prefix.

-terraform.example.com/awesomecorp/happycloud/v1.0.0/linux_amd64/terraform-provider-happycloud_v1.0.0
+terraform.example.com/awesomecorp/happycloud/1.0.0/linux_amd64/terraform-provider-happycloud_v1.0.0

I put my provider at ~/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64/terraform-provider-sys_v1.0.0 (and changed the provider source to localhost/local/sys for simplicity) and I got the following error instead:

Initializing provider plugins...
- Using previously-installed localhost/local/sys v1.0.0
- Using previously-installed hashicorp/random v2.2.1

Error: some providers could not be installed:
- localhost/local/sys: failed to calculate checksum for installed provider localhost/local/sys package: lstat var: no such file or directory

@mildred
Contributor Author

mildred commented Jun 24, 2020

Running terraform under strace, I found:

[pid 947085] newfstatat(AT_FDCWD, "var", 0xc000586c68, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)

But I have absolutely no idea why it is looking up this nonexistent file.

@mildred
Contributor Author

mildred commented Jun 24, 2020

The source of the error seems to be:

packageDir, err := filepath.EvalSymlinks(string(loc))

as if loc contained "var"

called from:

return getproviders.PackageHash(cp.PackageLocation())

called from:

hash, err := cached.Hash()
if err != nil {
	errs[provider] = fmt.Errorf("failed to calculate checksum for installed provider %s package: %s", provider, err)

as if cached.PackageDir contained "var" (I have no file named var on my homedir or on the module path)

@mildred
Contributor Author

mildred commented Jun 24, 2020

cached := i.targetDir.ProviderVersion(provider, version)

cached is generated here:

func (d *Dir) ProviderVersion(provider addrs.Provider, version getproviders.Version) *CachedProvider {
	if err := d.fillMetaCache(); err != nil {
		return nil
	}
	for _, entry := range d.metaCache[provider] {
		// We're intentionally comparing exact version here, so if either
		// version number contains build metadata and they don't match then
		// this will not return true. The rule of ignoring build metadata
		// applies only for handling version _constraints_ and for deciding
		// version precedence.
		if entry.Version == version {
			return &entry
		}
	}
	return nil
}

which is populated by the fillMetaCache function:

func (d *Dir) fillMetaCache() error {
	// For d.metaCache we consider nil to be different than a non-nil empty
	// map, so we can distinguish between having scanned and got an empty
	// result vs. not having scanned successfully at all yet.
	if d.metaCache != nil {
		log.Printf("[TRACE] providercache.fillMetaCache: using cached result from previous scan of %s", d.baseDir)
		return nil
	}
	log.Printf("[TRACE] providercache.fillMetaCache: scanning directory %s", d.baseDir)
	allData, err := getproviders.SearchLocalDirectory(d.baseDir)
	if err != nil {
		log.Printf("[TRACE] providercache.fillMetaCache: error while scanning directory %s: %s", d.baseDir, err)
		return err
	}
	// The getproviders package just returns everything it found, but we're
	// interested only in a subset of the results:
	// - those that are for the current platform
	// - those that are in the "unpacked" form, ready to execute
	// ...so we'll filter in these ways while we're constructing our final
	// map to save as the cache.
	//
	// We intentionally always make a non-nil map, even if it might ultimately
	// be empty, because we use that to recognize that the cache is populated.
	data := make(map[addrs.Provider][]CachedProvider)
	for providerAddr, metas := range allData {
		for _, meta := range metas {
			if meta.TargetPlatform != d.targetPlatform {
				log.Printf("[TRACE] providercache.fillMetaCache: ignoring %s because it is for %s, not %s", meta.Location, meta.TargetPlatform, d.targetPlatform)
				continue
			}
			if _, ok := meta.Location.(getproviders.PackageLocalDir); !ok {
				// PackageLocalDir indicates an unpacked provider package ready
				// to execute.
				log.Printf("[TRACE] providercache.fillMetaCache: ignoring %s because it is not an unpacked directory", meta.Location)
				continue
			}
			packageDir := filepath.Clean(string(meta.Location.(getproviders.PackageLocalDir)))
			execFile := findProviderExecutableInLocalPackage(meta)
			if execFile == "" {
				// If the package doesn't contain a suitable executable then
				// it isn't considered to be part of our cache.
				log.Printf("[TRACE] providercache.fillMetaCache: ignoring %s because it is does not seem to contain a suitable plugin executable", meta.Location)
				continue
			}
			log.Printf("[TRACE] providercache.fillMetaCache: including %s as a candidate package for %s %s", meta.Location, providerAddr, meta.Version)
			data[providerAddr] = append(data[providerAddr], CachedProvider{
				Provider:       providerAddr,
				Version:        meta.Version,
				PackageDir:     filepath.ToSlash(packageDir),
				ExecutableFile: filepath.ToSlash(execFile),
			})
		}
	}
	// After we've built our lists per provider, we'll also sort them by
	// version precedence so that the newest available version is always at
	// index zero. If there are two versions that differ only in build metadata
	// then it's undefined but deterministic which one we will select here,
	// because we're preserving the order returned by SearchLocalDirectory
	// in that case..
	for _, entries := range data {
		sort.SliceStable(entries, func(i, j int) bool {
			// We're using GreaterThan rather than LessThan here because we
			// want these in _decreasing_ order of precedence.
			return entries[i].Version.GreaterThan(entries[j].Version)
		})
	}
	d.metaCache = data
	return nil
}

corresponding to the following TF_LOG=TRACE logs:

Initializing provider plugins...
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: scanning directory .terraform/plugins
2020/06/24 10:40:20 [TRACE] getproviders.SearchLocalDirectory: found localhost/local/sys v1.0.0 for linux_amd64 at .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64
2020/06/24 10:40:20 [TRACE] getproviders.SearchLocalDirectory: found registry.terraform.io/hashicorp/random v2.2.1 for linux_amd64 at .terraform/plugins/registry.terraform.io/hashicorp/random/2.2.1/linux_amd64
- Using previously-installed localhost/local/sys v1.0.0
- Using previously-installed hashicorp/random v2.2.1
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: including .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64 as a candidate package for localhost/local/sys 1.0.0
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: including .terraform/plugins/registry.terraform.io/hashicorp/random/2.2.1/linux_amd64 as a candidate package for registry.terraform.io/hashicorp/random 2.2.1
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: using cached result from previous scan of .terraform/plugins
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: using cached result from previous scan of .terraform/plugins

Error: some providers could not be installed:
- localhost/local/sys: failed to calculate checksum for installed provider localhost/local/sys package: lstat var: no such file or directory

@mildred
Contributor Author

mildred commented Jun 24, 2020

log.Printf("[TRACE] providercache.fillMetaCache: including %s as a candidate package for %s %s", meta.Location, providerAddr, meta.Version)
data[providerAddr] = append(data[providerAddr], CachedProvider{
	Provider:       providerAddr,
	Version:        meta.Version,
	PackageDir:     filepath.ToSlash(packageDir),
	ExecutableFile: filepath.ToSlash(execFile),
})

is where the meta cache is filled, corresponding to the following trace:

2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: including .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64 as a candidate package for localhost/local/sys 1.0.0
2020/06/24 10:40:20 [TRACE] providercache.fillMetaCache: including .terraform/plugins/registry.terraform.io/hashicorp/random/2.2.1/linux_amd64 as a candidate package for registry.terraform.io/hashicorp/random 2.2.1

PackageDir is set to filepath.ToSlash(packageDir)

packageDir is set in:

packageDir := filepath.Clean(string(meta.Location.(getproviders.PackageLocalDir)))

None of these look like they would return "var".

@mildred
Contributor Author

mildred commented Jun 24, 2020

One more detail: this is running on a Fedora Silverblue machine as the root user (for local provisioning), and /root is a symlink to var/roothome. That might be where "var" comes from.

@mildred
Contributor Author

mildred commented Jun 24, 2020

strace logs:

strace -ff env TF_LOG=TRACE terraform init
...
[pid 975373] readlinkat(AT_FDCWD, ".terraform/plugins/localhost/local/sys/1.0.0/linux_amd64",  <unfinished ...>
[pid 975373] <... readlinkat resumed>"/root/.terraform.d/plugins/local"..., 128) = 64
[pid 975373] newfstatat(AT_FDCWD, "/root",  <unfinished ...>
[pid 975373] <... newfstatat resumed>{st_mode=S_IFLNK|0777, st_size=12, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid 975373] readlinkat(AT_FDCWD, "/root",  <unfinished ...>
[pid 975373] <... readlinkat resumed>"var/roothome", 128) = 12
[pid 975373] newfstatat(AT_FDCWD, "var", 0xc000477898, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)

This looks relevant: after reading the /root symlink, terraform tries to lstat the bare relative path "var". Setting HOME=/var/roothome does not change anything.

@mildred
Contributor Author

mildred commented Jun 24, 2020

As if filepath.ToSlash(filepath.Clean("/root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64")) == "var"

With go version go1.13.6 linux/amd64 and the following program:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	fmt.Println(filepath.ToSlash(filepath.Clean("/root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64")))
}

Running it on the same system, I get nothing strange: the output is /root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64, and the strace log contains no newfstatat or readlinkat calls, so the problem does not seem to come from these two functions in the Go runtime.

@mildred
Contributor Author

mildred commented Jun 24, 2020

With the following test program:

package main

import (
  "path/filepath"
  "fmt"
)

func main(){
  fmt.Println(filepath.ToSlash(filepath.Clean("/root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64")))
  fmt.Println(filepath.EvalSymlinks(filepath.ToSlash(filepath.Clean("/root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64"))))
}

I see this in strace:

newfstatat(AT_FDCWD, "/root", {st_mode=S_IFLNK|0777, st_size=12, ...}, AT_SYMLINK_NOFOLLOW) = 0
readlinkat(AT_FDCWD, "/root", "var/roothome", 128) = 12
newfstatat(AT_FDCWD, "/var", {st_mode=S_IFDIR|0755, st_size=262, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome", {st_mode=S_IFDIR|0700, st_size=398, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d", {st_mode=S_IFDIR|0755, st_size=86, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins", {st_mode=S_IFDIR|0755, st_size=182, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins/localhost", {st_mode=S_IFDIR|0755, st_size=10, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins/localhost/local", {st_mode=S_IFDIR|0755, st_size=6, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins/localhost/local/sys", {st_mode=S_IFDIR|0755, st_size=54, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins/localhost/local/sys/1.0.0", {st_mode=S_IFDIR|0755, st_size=22, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/var/roothome/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64", {st_mode=S_IFDIR|0755, st_size=64, ...}, AT_SYMLINK_NOFOLLOW) = 0
write(1, "/var/roothome/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64 <nil>\n", 79/var/roothome/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64 <nil>

This indicates that it works in this case. I also see the same pattern as in the terraform strace (newfstatat + readlinkat + newfstatat).

/root is a symlink to var/roothome, with no leading /, and it seems that filepath.EvalSymlinks as called from terraform gets confused by this.

@mildred
Contributor Author

mildred commented Jun 24, 2020

Using the latest filepath library from Go (commit 294edb272d5d145665bdf8b4254609eae0363a8d), I get no error.

@mildred
Contributor Author

mildred commented Jun 24, 2020

Might be related: golang/go#30520

@mildred
Contributor Author

mildred commented Jun 24, 2020

Terraform v0.13.0-beta2 built with my own toolchain still has the error, so this is not a problem with the toolchain.

@mildred
Contributor Author

mildred commented Jun 24, 2020

No, this is indeed a problem with the runtime. I added traces to terraform until I found that EvalSymlinks is called with .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64 and errors out:

2020/06/24 15:44:44 [TRACE] PackageHashV1: EvalSymlinks(.terraform/plugins/localhost/local/sys/1.0.0/linux_amd64)

Error: some providers could not be installed:
- localhost/local/sys: failed to calculate checksum for installed provider localhost/local/sys package: failed to evaluate symlinks for local directory .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64: lstat var: no such file or directory

and I could reproduce it with a trivial example:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	fmt.Println("filepath.EvalSymlinks(.terraform/plugins/localhost/local/sys/1.0.0/linux_amd64)")
	fmt.Println(filepath.EvalSymlinks(".terraform/plugins/localhost/local/sys/1.0.0/linux_amd64"))
}

@mildred
Contributor Author

mildred commented Jun 24, 2020

Can be reproduced out of context with:

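package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

func main() {
	// Build a small tree: origin/var/home/foo.
	os.Mkdir("origin", 0777)
	os.Mkdir("origin/var", 0777)
	os.Mkdir("origin/var/home", 0777)
	os.Mkdir("origin/var/home/foo", 0777)
	// origin/home -> var/home: a relative symlink, like /root -> var/roothome.
	os.Symlink("var/home", "origin/home")
	// origin/var/home/bar -> /home/foo: an absolute symlink that passes through /home.
	os.Symlink("/home/foo", "origin/home/bar")
	// Make origin the filesystem root so that /home is the relative symlink.
	os.Chdir("origin")
	if err := syscall.Chroot("."); err != nil {
		// Chroot needs root privileges; without it the test is meaningless.
		panic(err)
	}
	os.Chdir("home")
	// Resolving bar must follow /home -> var/home while the cwd is not /.
	fmt.Println(filepath.EvalSymlinks("bar"))
}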

@alisdair alisdair added the v0.13 label Jun 24, 2020
@mildred
Contributor Author

mildred commented Jun 24, 2020

The Go issue can be worked around by updating the symlink so that it points at the real path and no longer passes through /root:

[root@elyas sd-addr.tf]# ls -l .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64 /root
lrwxrwxrwx. 1 root root 64 24 juin  10:55 .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64 -> /root/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64
lrwxrwxrwx. 7 root root 12 27 juin   2019 /root -> var/roothome

[root@elyas sd-addr.tf]# ln -sf /var/roothome/.terraform.d/plugins/localhost/local/sys/1.0.0/linux_amd64 .terraform/plugins/localhost/local/sys/1.0.0/linux_amd64

@alisdair alisdair changed the title 0.13-beta2 does not work with in-house providers With symlinked home directory, init fails: providers could not be installed: lstat var: no such file or directory Jun 26, 2020
@alisdair
Contributor

Thank you @mildred for this excellent investigation! As you noted yourself, the initial problem with installing the local provider was a documentation bug (thanks for fixing that). I've retitled this issue and will leave it open to track the upstream symlink evaluation issue, in the hopes that your diagnosis and workaround will help anyone else who encounters it.

@mildred
Contributor Author

mildred commented Jul 18, 2020

The Go team hinted to me that EvalSymlinks should not be used. Deleting it was even suggested, but it was kept for compatibility. The problems seem to be mostly on Windows, though.

See: golang/go#40180

So, it might be a good idea to try to use something else.

Here is the documentation text that will go with the function:

// EvalSymlinks returns the path name after the evaluation of any symbolic
// links.
// If path is relative the result will be relative to the current directory,
// unless one of the components is an absolute symbolic link.
// EvalSymlinks calls Clean on the result.
// Use of this function is unsuitable for the vast majority of applications.
// This function always resolves all links on path, but links may have
// been created to solve administrative problems of which 
// most applications should remain unaware.
// Most applications should only resolve specific links that they 
// require to resolve,  use the result immediately, forget the result
// and never show the result to the user.
// On Windows, the result is not stable and must not be cached
// or stored in any way, resolving links may introduce complexities that 
// the application must be  prepared to deal with and the result 
// does not identify the volume that a file or directory resides on.
