Fix bare-metal multiple apply/ssh on Terraform v0.11.4+ #181

Merged: 2 commits into master from workaround-terraform-ssh-change on Apr 8, 2018

@dghubble dghubble commented Apr 8, 2018

  • Terraform v0.11.4 introduced changes to remote-exec that mean Typhoon bare-metal clusters require multiple runs of terraform apply to ssh and bootstrap.
  • Bare-metal installs PXE boot a live instance that installs to disk and then reboots from disk as a controller or worker. Terraform remote-exec has no way to "know" that it should wait until the reboot has occurred before kicking off the Kubernetes bootstrap. Previously, Typhoon created a "debug" user during this install phase to allow an admin to SSH, and remote-exec would hang while trying to connect as user "core". Terraform v0.11.4 changes this behavior so that remote-exec fails instead, and a user must re-run terraform apply until it succeeds.
  • A new way to "trick" remote-exec into waiting for the reboot into the disk install is to run SSH on a non-standard port during the disk install. This retains the ability for an admin to SSH during install (most distros don't have this) and fixes the issue so only a single run of terraform apply is needed.
  • Halt on provisioner errors hashicorp/terraform#17359 (comment)
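
As a rough illustration of why moving the install-phase SSH daemon to a non-standard port helps, the sketch below polls a TCP port until something accepts the connection. It is not Terraform's implementation. The function name `wait_for_port` and its parameters are hypothetical, and it assumes that a refused connection is simply retried while a rejected login would be treated as a hard failure. Under that assumption, nothing listens on port 22 during the disk install, so a client aimed at port 22 keeps retrying until the machine reboots from disk.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 1800.0,
                  interval: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Hypothetical helper for illustration only; Terraform's remote-exec
    has its own connection and retry logic.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Success means some daemon is listening on this port,
            # e.g. sshd on port 22 after the reboot from disk.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: the machine is still
            # installing (its SSH is on another port) or is rebooting.
            pass
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval, remaining))
```

For example, `wait_for_port("node1.example.com", 22)` (a placeholder hostname) would return `True` only once something accepts connections on port 22, while an admin could still reach the installer over whichever non-standard port it uses.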

@dghubble dghubble force-pushed the workaround-terraform-ssh-change branch 3 times, most recently from 8bc5494 to 982e3c2 Compare April 8, 2018 20:27
@dghubble dghubble force-pushed the workaround-terraform-ssh-change branch from 982e3c2 to d276fff Compare April 8, 2018 20:32
@dghubble dghubble merged commit b8656fd into master Apr 8, 2018
@dghubble dghubble deleted the workaround-terraform-ssh-change branch April 8, 2018 22:35