
Switch to RBAC #57

Closed
markmandel opened this issue Jan 11, 2018 · 16 comments
markmandel (Member) commented Jan 11, 2018

At some point in the next few releases of Kubernetes, the default that grants all Pods full access to the entire cluster will go away, so we should switch to RBAC in the near future.

Also, this is better for security.

@markmandel markmandel added kind/design Proposal discussing new features / fixes and how they should be implemented area/security Issues pertaining to security labels Jan 11, 2018
@markmandel markmandel added this to the 0.1 milestone Jan 24, 2018
markmandel (Member, Author) commented:
We should also set up the make minikube-test-cluster target to enable RBAC once we have this implemented:
https://github.com/kubernetes/minikube/blob/master/docs/configuring_kubernetes.md#examples

dzlier-gcp (Contributor) commented:
RBAC in essence is much like IAM in that it defines Roles and RoleBindings, which describe what actions can be taken on what resources, and who can take those actions, respectively. There are also ClusterRoles and ClusterRoleBindings, which apply across an entire cluster instead of being limited to a specific namespace.

Given that, I'm thinking about what kind of default Roles and RoleBindings we should provide, and which we should instruct users to implement.

I think a good place to start would be to define read-only and admin Roles and ClusterRoles for GameServers and GameServer Pods. We can develop more Roles/ClusterRoles based on common use cases moving forward.
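As a hedged sketch of what those two defaults might look like — the `stable.agon.io` API group and the role names here are assumptions for illustration, not the project's actual manifests:

```yaml
# Read-only ClusterRole for GameServers (names and API group are assumed).
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gameserver-reader
rules:
- apiGroups: ["stable.agon.io"]   # hypothetical CRD group
  resources: ["gameservers"]
  verbs: ["get", "list", "watch"]
---
# Admin ClusterRole: full control of GameServers, read access to their Pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gameserver-admin
rules:
- apiGroups: ["stable.agon.io"]
  resources: ["gameservers"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
- apiGroups: [""]                 # core API group, for Pods
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
```

A RoleBinding (or ClusterRoleBinding) would then attach one of these to a user, group, or ServiceAccount.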

markmandel (Member, Author) commented:
(Assuming I get RBAC right)

I feel like we would need 2 Roles:

  1. For the controller, which would likely need:
  • read: CustomResourceDefinitions
  • read/write: GameServers, Pods
  2. For the sidecar for the GameServer Pod:
  • read/write on the GameServers

I'm not sure how you apply RBAC roles to running containers however.

Does that seem right to you?

alexandrem commented:
What is the flow of information between kube api and the game server pod (sidecar)?

We should probably aim for least privilege. We don't want a compromised game server to be able to fetch information of all other game servers.

Kubernetes RBAC is resource oriented, meaning that a service account with read access to the game server resource is able to list/get all of them. So if the sidecar has a service account token to read all that data, then the game binary can send RPC commands to retrieve it as well.

If the sidecar only needs to read some data about the current game instance, then a ConfigMap mounted into the sidecar pod might be a better solution.
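A minimal sketch of that ConfigMap alternative — all names here are purely illustrative:

```yaml
# Hypothetical pod spec: per-instance data is written into a ConfigMap by
# the controller and mounted read-only, so the sidecar needs no API token.
apiVersion: v1
kind: Pod
metadata:
  name: gameserver-example
spec:
  containers:
  - name: sidecar
    image: example/sidecar        # assumed image name
    volumeMounts:
    - name: gameserver-config
      mountPath: /etc/gameserver
      readOnly: true
  volumes:
  - name: gameserver-config
    configMap:
      name: gameserver-example-config   # assumed per-instance ConfigMap
```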

dzlier-gcp (Contributor) commented Jan 28, 2018 via email

markmandel (Member, Author) commented Jan 28, 2018

So the sidecar needs to read the information of the GameServer it's with, and update its Status.State as well (it needs to read to do the update) - basically the state of the GameServer (through the master) is the communication mechanism between the controller and the sidecar.

I take it there is no way to limit access to the specific GameServer, but is there a way to enable access to the Kubernetes API only from the sidecar, and not the GameServer container?

dzlier-gcp (Contributor) commented Jan 29, 2018 via email

alexandrem commented Jan 29, 2018

AFAIK the service accounts are defined at pod level only. The same service account token will be distributed to all pod containers on their filesystem in /var/run/secrets/kubernetes.io/serviceaccount/.

Do we actually need to use a "push" model to update the game server pod metadata?

It seems that traditionally, when we compare with other Kubernetes CRD controllers, it's the other way around.
For instance, etcd-operator will loop through all the running etcd clusters defined, poll each individual cluster member, then update the cluster member list and phase accordingly.

Following that approach would mean that only control plane components, like the agon controller, need special service accounts with read/write access to the CRD.

The sidecar could expose some "/health" endpoint that the controller would periodically fetch to get the current state - something akin to how Prometheus pulls a /metrics endpoint during scraping.

We could probably even set spec.automountServiceAccountToken = false to prevent any interaction with the kube api from the game server pods.
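That last suggestion is a one-line change on the pod spec; a sketch (image name assumed):

```yaml
# With automountServiceAccountToken disabled, no container in the pod
# receives API credentials at /var/run/secrets/kubernetes.io/serviceaccount/.
apiVersion: v1
kind: Pod
metadata:
  name: gameserver-example
spec:
  automountServiceAccountToken: false
  containers:
  - name: gameserver
    image: example/gameserver   # assumed image name
```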

alexandrem commented:
Example of the etcd-operator run loop, which runs in a separate goroutine for each cluster resource:

https://github.com/coreos/etcd-operator/blob/534a00a970975a66b15e2ea3cd90eb6d5104c71b/pkg/cluster/cluster.go#L189-L283

markmandel (Member, Author) commented:
@alexandrem there is a centralised controller - you can see it here:
https://github.com/googleprivate/agon/tree/master/gameservers/controller
This controls Pods allocation and the majority of the life cycle of the GameServer.

However, we needed a simple way to communicate from the sidecar back to the controller because of the SDK. The sidecar does this by updating the Status.State value on the GameServer. To be 100% clear - the sidecar does not do anything related to allocating Pods or managing the life cycle of the GameServer. You can see the code here:
https://github.com/googleprivate/agon/tree/master/gameservers/sidecar

Doing the communication this way is nice, because we don't have to build and test our own scaling mechanism for sidecar <--> controller communication - there is already a lot of battle-tested work on scaling up the K8s master as the size of your cluster and the workload increases.

That being said - @dzlier-gcp and I are discussing whether it's possible for the controller to dynamically create a Role for each GameServer, so each Pod can only access and edit the information of the GameServer it belongs to - we don't want a GameServer to be able to access, or edit, information about all GameServers.
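A hedged sketch of what such a dynamically created per-GameServer Role could look like, using RBAC's `resourceNames` field (names and API group are assumptions). One caveat worth noting: `resourceNames` can restrict get/update/delete, but it cannot restrict list or watch.

```yaml
# Hypothetical Role the controller would create per GameServer instance.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gameserver-xyz-editor     # assumed naming scheme, one per GameServer
  namespace: default
rules:
- apiGroups: ["stable.agon.io"]   # hypothetical CRD group
  resources: ["gameservers"]
  resourceNames: ["gameserver-xyz"]   # only this one GameServer
  verbs: ["get", "update"]
```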

markmandel (Member, Author) commented Jan 30, 2018

Chatting with @dzlier-gcp right now. We're going to look at doing this in a phased approach.

To start - make it a little more open than ideal, but it gets us moving forward:

  1. The controller has access to basically do all the things to GameServers and Pods (get, update, delete, list). It may also need read access to CRDs, as it waits for the GameServer CRD to move to established state before doing any work. This may not be necessary long term.
  2. The sidecar only has access to get and update the GameServer (it has no reason to list)
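A minimal sketch of that phase-one split - the role names and `stable.agon.io` group are assumptions, not the project's actual manifests:

```yaml
# Controller: broad access to GameServers and Pods, read access to CRDs.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: agon-controller
rules:
- apiGroups: ["stable.agon.io"]   # hypothetical CRD group
  resources: ["gameservers"]
  verbs: ["get", "list", "watch", "update", "delete"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: ["get", "list", "watch"]
---
# Sidecar: get and update only - deliberately no list.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: agon-sidecar
rules:
- apiGroups: ["stable.agon.io"]
  resources: ["gameservers"]
  verbs: ["get", "update"]
```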

The second phase, depending on how that goes (and how fast subresources come out), would be to remove the sidecar's update access on the GameServer, and create a separate CRD (GameServerState?) that the sidecar can create with the name of the GameServer and the requested state update. The controller would watch for the creation of this CRD, which would be used to communicate to the controller that a state change has been requested - which would also limit the damage a compromised gameserver could potentially do to a much smaller surface area.
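The phase-two idea could be sketched as a create-only Role - `gameserverstates` and the group name are hypothetical:

```yaml
# In phase two the sidecar may only create GameServerState objects,
# never read or modify GameServers directly: a minimal attack surface
# if the game server container is compromised.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gameserver-sidecar-state
  namespace: default
rules:
- apiGroups: ["stable.agon.io"]   # hypothetical CRD group
  resources: ["gameserverstates"] # hypothetical new CRD
  verbs: ["create"]
```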

@dzlier-gcp did I miss anything?

ianlewis commented Feb 2, 2018

Did you give any thought to built-in objects? Does the controller or sidecar need access to Secrets or ConfigMaps? Pods?

markmandel (Member, Author) commented:
@ianlewis Neither the sidecar nor the controller needs to directly create Secrets or ConfigMaps. The controller definitely needs access to Pods (see my comment above), but beyond that, I don't think there are other built-in objects that will require direct access.

I currently don't see any reason why this would change going forward, but if it comes up, we can cross that bridge when we come to it.

markmandel (Member, Author) commented:
@dzlier-gcp Just a heads up, I just did a big refactor, so you may want to delete the old branch and start anew, just to avoid merge conflicts.

dzlier-gcp (Contributor) commented Feb 6, 2018 via email

dzlier-gcp (Contributor) commented:
Woo!
