Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a security checklist for clusters #33992

Merged
merged 16 commits into from
Sep 1, 2022

Conversation

mtardy
Copy link
Member

@mtardy mtardy commented May 27, 2022

Addresses issue Create a security checklist for deploying a cluster #28 in k/sig-security repository. [preview]

Draft for the security checklist guide from the collaborative document with @savitharaghunathan, @Skybound1, and @p4ck3t0. Many people participated in this list via this PR, thanks everyone ☺️!

This is a document in the form of a checklist with paragraphs that further describe some items. The goal is to centralize many security concerns and form a central place to redirect to other documentation or guide.

This checklist is meant to evolve in the future, as the participation on the PR proved, people provided really diverse ideas and this is essential to try to cover as much as possible.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 27, 2022
@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels May 27, 2022
@mtardy
Copy link
Member Author

mtardy commented May 27, 2022

/sig security

@k8s-ci-robot k8s-ci-robot added the sig/security Categorizes an issue or PR as relevant to SIG Security. label May 27, 2022
@netlify
Copy link

netlify bot commented May 27, 2022

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 9f5a359
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/63107f40742aaf00098fc4e7
😎 Deploy Preview https://deploy-preview-33992--kubernetes-io-main-staging.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@rschosser
Copy link
Contributor

rschosser commented May 30, 2022

Great security checklist already!

But it could make sense to add the following additional information:

Network security

  • Restrict Internet access to the Kubernetes API server
  • Restrict access from pods to the Cloud metadata API (169.254.196.254), if a Cloud provider is used for hosting Kubernetes.

Memory and CPU limits

Pod security

Also in regard to the Secrets section, I think secrets should not be mounted as environment variables, because in case of an error, those could be written into a log and therefore be exposed.

@mtardy
Copy link
Member Author

mtardy commented May 30, 2022

Thanks a lot for the great inputs! Could you consider suggesting changes directly in the checklist? Otherwise, I'll add your ideas if you prefer 😊

@rschosser
Copy link
Contributor

I added my feedback directly. - Please feel free to change my suggestions in any way you want.

@p4ck3t0
Copy link
Contributor

p4ck3t0 commented May 31, 2022

Should we also add a section about certain admission controlers that can enhance security?

@mtardy
Copy link
Member Author

mtardy commented May 31, 2022

Should we also add a section about certain admission controlers that can enhance security?

You can totally propose another section if you have any ideas and it's relevant, we just have to keep third party links out of this checklist. So you have to keep generic or find some resources or list of stuff already available by the CNCF for example.

@p4ck3t0
Copy link
Contributor

p4ck3t0 commented May 31, 2022

Should we also add a section about certain admission controlers that can enhance security?

You can totally propose another section if you have any ideas and it's relevant, we just have to keep third party links out of this checklist. So you have to keep generic or find some resources or list of stuff already available by the CNCF for example.

I have written my suggestion as comment. Its a short section 😃

@cailynse
Copy link
Contributor

cailynse commented Jun 1, 2022

This looks really good, and is quite informative. One minor thought is that doesn't read so much like a checklist to me, more as a set of recommendations/considerations.

What are folks thoughts on trying to add a condensed list of actionable items? Would it be valuable?

@sftim
Copy link
Contributor

sftim commented Jun 2, 2022

condensed list of actionable items

Provided folks are happy to maintain it, we can also publish a PDF version of same. Or YAML, etc.

Copy link

@NissesSenap NissesSenap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the issue says that we should use third party links but things about CSI benchmark is probably a nice thing to add.

@mtardy
Copy link
Member Author

mtardy commented Sep 1, 2022

For production clusters deployed using kubeadm, do we think that /etc/kubernetes/admin.conf is a risk or not?

I don't think I'm familiar enough with bootstrapping clusters to answer that question, unfortunately :(!

@savitharaghunathan
Copy link
Member

Hi Folks, can we merge this PR if there are no big concerns from the reviewers and capture the nice to haves or minor changes in another issue/PR?

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes since #33992 (comment) are minor. https://deploy-preview-33992--kubernetes-io-main-staging.netlify.app/docs/concepts/security/security-checklist/ looks good.

/lgtm
/approve

Many thanks to the folks who have moved this PR forward.


I've added some comments. None of this feedback in any way blocks us merging this PR.

- [ ] Intermediate and leaf certificates have an expiry date no more than 3
years in the future.
- [ ] A process exists for periodic access review, and reviews occur no more
than 24 months apart.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's OK for the checklist to recommend that you've thought about those expiry dates - there's no specific right answer for how long those intervals might be.

Comment on lines +44 to +48
After bootstrapping, neither users nor components should authenticate to the
Kubernetes API as `system:masters`. Similarly, running all of
kube-controller-manager as `system:masters` should be avoided. In fact,
`system:masters` should only be used as a break-glass mechanism, as opposed to
an admin user.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could make a similar comment about /etc/kubernetes/admin.conf


## Network security

- [ ] CNI plugins in-use supports network policies.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [ ] CNI plugins in-use supports network policies.
- [ ] CNI plugins in-use supports [network policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/).

Comment on lines +72 to +73
feature, an alternative solution could be to use a service mesh to provide that
functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optionally:

Suggested change
feature, an alternative solution could be to use a service mesh to provide that
functionality.
feature, an alternative solution could be to use a [service mesh](https://glossary.cncf.io/service-mesh/)
to provide that functionality.

Comment on lines +90 to +92
If a cloud provider is used for hosting Kubernetes, the access from pods to the cloud
metadata API `169.254.169.254` should also be restricted or blocked if not needed
because it may leak information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider blocking egress to 169.254.0.0/16 and making exceptions based on business need.

(For IPv6, link-local addresses have an attached interface scope which means that even node-local routes don't get set up unless somebody explicitly makes this happen, so I think we don't need to mention IPv6).


- [ ] RBAC rights to `create`, `update`, `patch`, `delete` workloads is only granted if necessary.
- [ ] Appropriate Pod Security Standards policy is applied for all namespaces and enforced.
- [ ] Memory limit is set for the workloads with a limit equal or inferior to the request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't set a resource limit that is less than the request.

Suggested change
- [ ] Memory limit is set for the workloads with a limit equal or inferior to the request.
- [ ] Memory limit is set for the workloads with a request equal to or less
than the limit.

Comment on lines +162 to +163
For historical context, please note that Docker has been using
[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For historical context, please note that Docker has been using
[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/)
For historical context, note that Docker Engine has been using
[default seccomp profile](https://docs.docker.com/engine/security/seccomp/)

[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/)
to only allow a restricted set of syscalls since 2016 from
[Docker Engine 1.10](https://www.docker.com/blog/docker-engine-1-10-security/),
but Kubernetes is still not confining workloads by default. The default seccomp
Copy link
Contributor

@sftim sftim Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
but Kubernetes is still not confining workloads by default. The default seccomp
but Kubernetes does not confine workloads by default. The default seccomp

Comment on lines +168 to +170
as well. Fortunately, [Seccomp Default](/blog/2021/08/25/seccomp-default/), a
new alpha feature to use a default seccomp profile for all workloads can now be
enabled and tested.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
as well. Fortunately, [Seccomp Default](/blog/2021/08/25/seccomp-default/), a
new alpha feature to use a default seccomp profile for all workloads can now be
enabled and tested.
as well.

and then in the What's Next section, link to https://k8s.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads

Comment on lines +283 to +288
Container image should contain the bare minimum to run the program they
package. Preferably, only the program and its dependencies, building the image
from the minimal possible base. In particular, image used in production should not
contain shells or debugging utilities, as an
[ephemeral debug container](/docs/tasks/debug/debug-application/debug-running-pod/#ephemeral-container)
can be used for troubleshooting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should explain and recommend multistage builds as one technique that helps minimize what gets included.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2022
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: e60390e52d7e7f5218be49460fd8d7bf739c8f63

@sftim
Copy link
Contributor

sftim commented Sep 1, 2022

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Sep 1, 2022
@sftim sftim removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Sep 1, 2022
@sftim
Copy link
Contributor

sftim commented Sep 1, 2022

Actually, I think a normal merge commit will be OK.

@PushkarJ
Copy link
Member

PushkarJ commented Sep 1, 2022

/lgtm

@sftim
Copy link
Contributor

sftim commented Sep 1, 2022

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sftim

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2022
@k8s-ci-robot k8s-ci-robot merged commit a5e96bf into kubernetes:main Sep 1, 2022
sig-security-tracker automation moved this from Ready for Co-Chair Review to Done Sep 1, 2022
@mtardy
Copy link
Member Author

mtardy commented Sep 1, 2022

Hehe, it feels almost a bit weird that this is finally merged! 😵‍💫 Thanks so much everybody, we were talking about that with sig security doc just a few minutes before!

Actually, I think a normal merge commit will be OK.

Can I ask why you did not squash merged finally @sftim?

Your comments were pretty nice btw, should we create a PR for those?

@sftim
Copy link
Contributor

sftim commented Sep 2, 2022

When GitHub does a squash merge it produces a single commit into the target branch, and doesn't let you track what the merge base of the work was.

This way, we're tracking where the branches diverged before merging back in.

@sftim
Copy link
Contributor

sftim commented Sep 2, 2022

Your comments were pretty nice btw, should we create a PR for those?

How about creating some issues? Where appropriate, mark those as good-first-issue so that folks with less experience can pick these up.

@sftim
Copy link
Contributor

sftim commented Sep 2, 2022

(Anyone can create those issues - doesn't need to be @mtardy)

If creating issues, mention #33992 so that we can see the provenance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/security Categorizes an issue or PR as relevant to SIG Security. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

None yet