Skip to content
This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Replace exec with pkg ssh #635

Merged
merged 8 commits into from
Jan 31, 2019
Merged

Conversation

JoshVanL
Copy link
Contributor

@JoshVanL JoshVanL commented Nov 20, 2018

Use a ssh package to replace exec for all ssh connections including tunnel.

Has fully featured ssh shell.
Tunneling for commands and interactive shell.
New hidden sub-command tarmak tunnel that opens a tunnel to a specified destination+port and local port that exits after 10 mins.
Persistent tunnel through an exec of tarmak tunnel which is orphaned.

fixes #634
rebased on top of #664

Use in package ssh rather than exec

@jetstack-bot jetstack-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 20, 2018
@jetstack-bot jetstack-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 23, 2018
@JoshVanL JoshVanL force-pushed the use-crypto-ssh branch 3 times, most recently from 69fd5ad to b1432ab Compare November 23, 2018 13:45
@JoshVanL JoshVanL changed the title WIP: Replace exec with pkg ssh Replace exec with pkg ssh Nov 23, 2018
@jetstack-bot jetstack-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 23, 2018
@JoshVanL
Copy link
Contributor Author

/unassign
/assign @simonswine
Seems to be working quite well. Think this should be ready for some review now

@jetstack-bot jetstack-bot assigned simonswine and unassigned JoshVanL Nov 23, 2018
@simonswine simonswine added this to the release-0.6 milestone Nov 26, 2018
@simonswine simonswine added this to Needs review in release-0.6 Nov 26, 2018
Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I mentioned my concerns with replacing ssh through a command in the catch-up call:

tarmak cluster ssh might be used in ways we never planned for. It is actually quite debug-able by the average Linux admin (-vvv, change config,. ..). While if a library fails you are lost. I think using golang's library is fine and fair enough for the terraform provider, tunneling use cases.

I am suggesting:

  • We offer an option to run tarmak cluster ssh via golang package, but by default we still are using ssh passthrough
  • All programmatic access/use of ssh is used through golang package
  • We have to write a proposal how we fix the public key issue. Not checking at all is worse what we used to do
  • We should keep the bastion connetion open all the time and use a connection multiple to tunnel to Kube/Vault APIs and SSH of nodes

/assign @JoshVanL
/unassign

pkg/tarmak/ssh/ssh.go Outdated Show resolved Hide resolved
@jetstack-bot jetstack-bot assigned JoshVanL and unassigned simonswine Nov 28, 2018
@jetstack-bot jetstack-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 28, 2018
@jetstack-bot jetstack-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 29, 2018
@jetstack-bot jetstack-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 7, 2018
@jetstack-bot jetstack-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 21, 2019
@jetstack-bot jetstack-bot assigned simonswine and unassigned JoshVanL Jan 24, 2019
@simonswine
Copy link
Contributor

I struggle to build a working cluster from this:

I am getting those errors:

WARN[0349] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-3.tarmak.local
WARN[0349] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-1.tarmak.local
WARN[0349] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-2.tarmak.local
WARN[0352] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-3.tarmak.local
WARN[0352] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-1.tarmak.local
WARN[0352] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-2.tarmak.local
WARN[0355] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-3.tarmak.local
WARN[0355] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-1.tarmak.local
WARN[0356] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-2.tarmak.local
WARN[0358] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-1.tarmak.local
WARN[0359] failed to create tunnel to remote connection: ssh: rejected: connect failed (Connection refused)  app=tarmak destination=vault-2.tarmak.local

And apply gets stuck here:

DEBU[1009] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m0s elapsed)  app=tarmak module=terraform std=out
DEBU[1019] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m10s elapsed)  app=tarmak module=terraform std=out
DEBU[1029] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m20s elapsed)  app=tarmak module=terraform std=out
DEBU[1039] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m30s elapsed)  app=tarmak module=terraform std=out
DEBU[1049] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m40s elapsed)  app=tarmak module=terraform std=out
DEBU[1059] module.kubernetes.tarmak_vault_cluster.vault: Still creating... (11m50s elapsed)  app=tarmak module=terraform std=out

My ssh-known-hosts was empty before and now looks like that:

8.202.74.208 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOZHQeWBspyzgPH5ogfvXTZH2WZKbI5uSeb+kEinDKcnkOrIrnbz2vtJs5v+e5cDHHgMLozT3H/tW1ao+XC40uQ=
10.99.32.10 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFxR/GnZg1xfFb077yBTKieoIe5dPtwojfoc5pvSvspW
10.99.64.10 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCH3wrOtgH/bcZHQLukE056BmbTS77FgDWMx/QGfDVhBvk4p02kpFoU568NBvIJyZ+CUvdmaeiR83LrLl6pH7E4=
10.99.96.10 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDEYzZTMZSCB4u4H6ihVDR5D3dORWPvCVFx9CA7i4vw6JDpMqXQAs44o5c35Ok6SpVxnekYTgKRurffk9ydXtc9J1WfR6M6tMPs8NxnZkYYkYvZsr/J8zXVmYt+JxBrxJ/T2tlHkUVaz10RyU+vNsOC1xn3fnSLuFNI6trRNHJq+dJV5LWF/ZO+n9mhbp1gCCzaWGPKWcHHvjVtRbAilacYZF2mjnVe7tuKWQpUry+ydkgvW+aEG5AC9QcGUcpPzo+SrYPrMYQNDS3kU8DIaYM1qZC7L0GXOqOUsHxmUI7DLIvXw3IEq8a+5am47nlkCXIgehjdXjPLvk8FickyxUZ3

Run: func(cmd *cobra.Command, args []string) {
t := tarmak.New(globalFlags)
defer t.Cleanup()
t.SSHPassThrough(args)
t.Perform(t.SSHPassThrough(args[0], args[1:]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still ssh CLI powered instead of golang native SSH package.

The following seems to break (Which is really useful for debugging):

tarmak-0.5.2 cluster ssh bastion -vvv
OpenSSH_7.6p1 Ubuntu-4ubuntu0.1, OpenSSL 1.0.2n  7 Dec 2017
debug1: Reading configuration data /home/christian/.tarmak-dev/csire-cluster/ssh_config
debug2: resolving "bastion" port 22
ssh: Could not resolve hostname bastion: No address associated with hostname
FATA[0000] exit status 255 

On master this looks like:

./tarmak_linux_amd64 cluster ssh bastion -vvv
ssh: Could not resolve hostname bastion: No address associated with hostname
FATA[0000] exit status 255

@simonswine
Copy link
Contributor

I suspect we want to store all host keys per instance in the known hosts file, instead of only a signle random one. That might also cause my issues

/assign @JoshVanL
/unassign

@jetstack-bot jetstack-bot assigned JoshVanL and unassigned simonswine Jan 28, 2019
for _, host := range hosts {
// TODO: do the strict checking settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoshVanL can you take a look at that as well

@simonswine
Copy link
Contributor

Ok I have pushed my changes now. I think we need to make Vault Create Cluster fail hard once all three vault are not reachable:

Err: vault1/2/3 not reachable through bastion after 5 tries (sorry don't have a real output right now)

And obviously try to connect more often with longer timeouts.

I am also not happy seeing a lot of expected warning and errors for unreachable bastion and failed vault tunnels. We should not show them at all as long as we get the connection at some point

@simonswine
Copy link
Contributor

Over to you now @JoshVanL

JoshVanL and others added 5 commits January 29, 2019 11:50
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
Signed-off-by: Christian Simon <simon@swine.de>
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
@jetstack-bot jetstack-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 30, 2019
@JoshVanL
Copy link
Contributor Author

/unassign
/assign @simonswine

@jetstack-bot jetstack-bot assigned simonswine and unassigned JoshVanL Jan 30, 2019
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
@simonswine
Copy link
Contributor

Thank you, this is quite stable for me

/lgtm

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 31, 2019
@jetstack-bot jetstack-bot merged commit 1e3993d into jetstack:master Jan 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/terraform Indicates a PR is affecting Terraform configuration dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/documentation Categorizes issue or PR as related to documentation. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
No open projects
release-0.6
  
Needs review
Development

Successfully merging this pull request may close these issues.

Use ssh package instead of a forked cmd exec
3 participants