Adds management of ssh_known_hosts file proposal #643
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
docs/proposals/ssh-known_hosts.rst
Outdated
| public_key | ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBPF+xkIGMUNVI0gElRaTLjfA4QMN/XGJhHswDyv59DNSOtG3KwZvDF3YkAb0PkTQAYo8N5fxoKqimGugOAaefPc= |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The population of this tag will occur during a Terraform apply in which the
Doesn't tarmak also use 'autoscaling groups' to autoprovision instances? How will these have the tag populated?
Yes, you're quite right, this is a major problem...
Originally I thought of having Wing in charge of tagging its own instance, but it's not great to leave the instance with that power. It is possible to create a CloudWatch Event that calls a Lambda function to revoke the privilege after its first call. That is a lot of moving parts though.
This is the tricky bit here. In theory the nodes could update their own tags, but I think AWS IAM policies can't be scoped tightly enough to allow only self-update. This also has some issues: once a node is compromised, other tags (e.g. tarmak_role) are no longer safe.
Another problem is that a tag value stored on an AWS instance cannot exceed 255 chars. Also, we have more than just the ECDSA host key.
I also think we have to look further than AWS. Not every provider will have support for tags. I think we should come up with a universal solution for this.
While I agree, IMO tagging instances on AWS would be the best way to couple the public keys with the instance life cycle. If we keep the SSH config with,
HostKeyCallback: ssh.tarmak.Provider().HostKeyCallBack,
then each provider can have its own specific implementation.
This assumes all targeted providers will have some kind of instance metadata we can pull from and push to.
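A minimal sketch of the per-provider idea above, for illustration only. It assumes a hypothetical `HostKeySource` interface that each provider would implement on top of its own instance metadata; `PublicKey` is declared locally and mirrors only the `Type` and `Marshal` methods of `golang.org/x/crypto/ssh.PublicKey`, so the example needs nothing beyond the standard library. None of these names come from the proposal or from Tarmak itself. A real callback would wrap `VerifyHostKey` in the `ssh.HostKeyCallback` signature, `func(hostname string, remote net.Addr, key ssh.PublicKey) error`.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// PublicKey is a local stand-in for golang.org/x/crypto/ssh.PublicKey,
// reduced to the two methods needed to compare keys.
type PublicKey interface {
	Type() string
	Marshal() []byte
}

// HostKeySource is a hypothetical interface: each provider returns the host
// keys it has recorded for an instance (e.g. from tags or other metadata).
type HostKeySource interface {
	HostKeys(hostname string) ([]PublicKey, error)
}

// rawKey is a trivial PublicKey used for the demonstration.
type rawKey struct {
	algo string
	blob []byte
}

func (k rawKey) Type() string    { return k.algo }
func (k rawKey) Marshal() []byte { return k.blob }

// staticSource is an in-memory HostKeySource standing in for a real provider.
type staticSource map[string][]PublicKey

func (s staticSource) HostKeys(hostname string) ([]PublicKey, error) {
	keys, ok := s[hostname]
	if !ok {
		return nil, fmt.Errorf("no host keys recorded for %q", hostname)
	}
	return keys, nil
}

// ErrHostKeyMismatch is returned when no recorded key matches the presented one.
var ErrHostKeyMismatch = errors.New("presented host key does not match provider metadata")

// VerifyHostKey accepts the presented key only if the provider has recorded an
// identical key (same type, same wire encoding) for that host.
func VerifyHostKey(src HostKeySource, hostname string, presented PublicKey) error {
	known, err := src.HostKeys(hostname)
	if err != nil {
		return err
	}
	for _, k := range known {
		if k.Type() == presented.Type() && bytes.Equal(k.Marshal(), presented.Marshal()) {
			return nil
		}
	}
	return ErrHostKeyMismatch
}

func main() {
	src := staticSource{"10.0.0.5": {rawKey{"ssh-ed25519", []byte("key-a")}}}
	fmt.Println(VerifyHostKey(src, "10.0.0.5", rawKey{"ssh-ed25519", []byte("key-a")}))
	fmt.Println(VerifyHostKey(src, "10.0.0.5", rawKey{"ssh-ed25519", []byte("key-b")}))
}
```

Keeping the comparison separate from the metadata lookup means a provider without tag support would only need to supply a different `HostKeySource`.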
/assign
As the comments suggest, Terraform is not the right place to update host keys per instance, because of how ASGs behave.
Furthermore, I don't think we should update the tags directly from the instance itself, as that might have other implications (e.g. retagging workers as masters).
I would suggest we investigate how we could do this setup using Lambda, and format the tags so that the keys are broken up according to the AWS limits.
/assign @JoshVanL
/unassign
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
/unassign
TBH I don't like the solution with the Lambda function. It is really AWS specific, and I know we can always have a provider-specific implementation, but IMHO this is reinventing the wheel. With an SSH certificate authority we would end up with a solution that works everywhere. Some more information: https://jameshfisher.com/2018/03/16/how-to-create-an-ssh-certificate-authority.html And after talking quickly with @JoshVanL, this will not work either, as we connect to Vault over SSH :(
docs/proposals/ssh-known_hosts.rst
Outdated
following:

+-------------------------+---------------------------------------------------------------------------+
| PublicKey_ssh-ed25519_0 | AAAAC3NzaC1lZDI1NTE5AAAAIE90XYYm6GSDlNGejM+aY5dZEe5vK4XyU++89WdGJcDc==EOF |
I don't really like the format here; maybe we could do something like this:
tarmak.io/ssh-host-ed25519-pub-0
tarmak.io/ssh-host-ecdsa-pub-0
tarmak.io/ssh-host-rsa-pub-0
I also suggest we don't worry about DSA.
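To make the suggested key format concrete, here is a rough sketch of how a client might turn such tags back into known_hosts lines. The function name, the regular expression, and the mapping from the short algorithm name to the SSH key type are assumptions made for illustration (for example, `ecdsa` is mapped to `ecdsa-sha2-nistp256` only because that is the type shown in the proposal's example). Tags that do not match the format, such as `tarmak_role`, are ignored.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// tagKeyRe matches keys of the form tarmak.io/ssh-host-<algo>-pub-<index>.
var tagKeyRe = regexp.MustCompile(`^tarmak\.io/ssh-host-([a-z0-9]+)-pub-([0-9]+)$`)

// keyTypes maps the short algorithm name in the tag key to the key type
// written in known_hosts. The mapping is an assumption for this sketch.
var keyTypes = map[string]string{
	"ed25519": "ssh-ed25519",
	"ecdsa":   "ecdsa-sha2-nistp256",
	"rsa":     "ssh-rsa",
}

type chunk struct {
	index int
	value string
}

// KnownHostsLines groups matching tags by algorithm, concatenates their
// values in index order, and returns one "host keytype base64" line per
// algorithm, sorted by algorithm name.
func KnownHostsLines(host string, tags map[string]string) []string {
	byAlgo := make(map[string][]chunk)
	for k, v := range tags {
		m := tagKeyRe.FindStringSubmatch(k)
		if m == nil {
			continue
		}
		idx, err := strconv.Atoi(m[2])
		if err != nil {
			continue
		}
		byAlgo[m[1]] = append(byAlgo[m[1]], chunk{idx, v})
	}

	algos := make([]string, 0, len(byAlgo))
	for a := range byAlgo {
		if _, ok := keyTypes[a]; ok {
			algos = append(algos, a)
		}
	}
	sort.Strings(algos)

	lines := make([]string, 0, len(algos))
	for _, a := range algos {
		chunks := byAlgo[a]
		sort.Slice(chunks, func(i, j int) bool { return chunks[i].index < chunks[j].index })
		var b strings.Builder
		for _, c := range chunks {
			b.WriteString(c.value)
		}
		lines = append(lines, fmt.Sprintf("%s %s %s", host, keyTypes[a], b.String()))
	}
	return lines
}

func main() {
	tags := map[string]string{
		"tarmak.io/ssh-host-rsa-pub-1":     "BBB",
		"tarmak.io/ssh-host-rsa-pub-0":     "AAA",
		"tarmak.io/ssh-host-ed25519-pub-0": "CCC",
		"tarmak_role":                      "worker",
	}
	for _, line := range KnownHostsLines("10.0.0.5", tags) {
		fmt.Println(line)
	}
}
```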
docs/proposals/ssh-known_hosts.rst
Outdated
file management.

In order to create a source of truth for each host's public key, each instance
will have it's public key's attached as a tags, shortly after boot time like the
We should also consider whether we can make sure Terraform does not remove those tags every time we run it.
This is especially important for aws_instance resources. Maybe https://www.terraform.io/docs/configuration/resources.html#ignore_changes is good enough.
via an Amazon Auto Scaling Group. At execution time, Wing - present on every
instance - will invoke an Amazon Lambda function for Instance Tagging. Passed to
this function will be a collection of the instances public keys, it's Amazon
identity document and matching PKCS7 document.
While the instance identity document provides some assurance that the request comes from an actual AWS instance, we should note its limitations in the proposal:
- There is no cryptographic proof that the public keys actually come from the host that is sending the request.
- The Lambda function receives everything it needs to impersonate the instance to a third party using that document (e.g. logging into Vault with the same method).
The instance document could be coming from another instance and
Signed-off-by: JoshVanL <vleeuwenjoshua@gmail.com>
/unassign
Thank you Josh. I have some concerns about its portability as well (the concerns Mattias mentioned), but I can't really see any short-term workarounds. Especially at bootstrap it's really hard to do this correctly. /approve
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: simonswine The full list of commands accepted by this bot can be found here. The pull request process is described here
What this PR does / why we need it:
As we are replacing the ssh command line tool for all programmatic use cases in favor of an in package solution, we have discovered some problems managing the ssh_known_hosts file correctly. This proposal aims to address these.