Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network Policy Recommendations #1344

Merged
merged 1 commit into from
Nov 15, 2023

Conversation

wallrj
Copy link
Member

@wallrj wallrj commented Nov 9, 2023

Preview: https://deploy-preview-1344--cert-manager-website.netlify.app/docs/installation/best-practice/#network-policy

I've tried to document all the network interactions of a typical cert-manager installation.
I've also added some (incomplete) Calico examples, but I'm not sure whether those make the document rather unreadable.
I'm inclined to remove the Calico examples for brevity.
I removed the Calico examples.

Fixes: #2334

xref:

@jetstack-bot jetstack-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 9, 2023
Copy link

netlify bot commented Nov 9, 2023

Deploy Preview for cert-manager-website ready!

Name Link
🔨 Latest commit a6fb75d
🔍 Latest deploy log https://app.netlify.com/sites/cert-manager-website/deploys/6554969494033f0008455762
😎 Deploy Preview https://deploy-preview-1344--cert-manager-website.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@wallrj wallrj force-pushed the network-policy branch 5 times, most recently from 48fec25 to b9efda8 Compare November 10, 2023 15:30
@jetstack-bot jetstack-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 10, 2023
@wallrj wallrj force-pushed the network-policy branch 3 times, most recently from debf35b to 983061f Compare November 10, 2023 17:50
@wallrj wallrj changed the title WIP: Network Policy Recommendations Network Policy Recommendations Nov 10, 2023
@jetstack-bot jetstack-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2023
@wallrj wallrj requested a review from hawksight November 10, 2023 17:54
@tspearconquest
Copy link
Contributor

tspearconquest commented Nov 10, 2023

This is great, however given that kind: NetworkPolicy exists in different API groups with different specs depending on which CNI you use, it would be great if the documentation could also include examples for multiple common CNIs like Azure and EKS (in addition to Calico).

Copy link
Member

@hawksight hawksight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor comments but would rather have this live and refine it.

Main one is that perhaps we should stick to default NetworkPolicy with networking.k8s.io? If we use one specific implementation (Calico), then we should provide others too?

I appreciate the list of communications. I wonder if there is an easier way to define them to be very clear? Possibly just having a YAML extract from the policy under each bullet?

content/docs/installation/best-practice.md Show resolved Hide resolved
The cert-manager controller has a metrics server which listens for HTTP connections on TCP port 9402.
Create a network policy which allows access to this service from your chosen metrics collector.

### Calico Example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use the standard networkpolicies.networking.k8s.io/v1 as the example, rather than a provider specific version?

Or we could have 3 example implementations?:

  • Standard NetworkPolicy
  • Calico
  • Cillium

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use the standard networkpolicies.networking.k8s.io/v1 as the example, rather than a provider specific version?

The standard Kubernetes NetworkPolicy doesn't allow you to refer to a Service, which is what I think is needed to reliably allow cert-manager Pods to connect to the Kubernetes API server, so I used Calico.

Alternative would be to document how to find the IP address(es) of your K8S API server replicas and hard code those in to a Kubernetes NetworkPolicy....but that seems likely to break if the addresses ever change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to reply but I see what you mean, you would have to have a combination of:

spec:
  egress:
    to:
      namespaceSelector: ..
      podSelector: ..

But looking in GKE, you can't see the kuber api-server pods to select?
Whereas in a kind cluster you can:

> k get po -n kube-system -l component=kube-apiserver
NAME                                       READY   STATUS    RESTARTS   AGE
kube-apiserver-argocd-demo-control-plane   1/1     Running   0          6h44m

I'll see if there is a "workaround" as this must be a fairly common problem, in that, I need to lock down access but enable app to connect to the k8s api.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no quick fix. It would seem the common answer is exactly what you said @wallrj.
Ref: https://stackoverflow.com/questions/58790124/whitelist-kube-apiserver-with-network-policy

content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
@wallrj
Copy link
Member Author

wallrj commented Nov 14, 2023

it would be great if the documentation could also include examples for multiple common CNIs like Azure and EKS (in addition to Calico).

I'm dropping the Calico examples for now. There will only be an overview of network requirements and I will leave it to the reader to figure out the configuration for their chosen CNI.

@jetstack-bot jetstack-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 14, 2023
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
Comment on lines 54 to 55
1. **UDP: cert-manager (all) -> Kubernetes DNS**:
All cert-manager components perform UDP DNS queries for both cluster and external domain names.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w/ another hat, I work w/ a dns software project and UDP DNS queries sometimes need to be resent over TCP, I can't tell if that's listed here or if there's some way that cert-manager could avoid that requirement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had the same thought.
But then I saw the default Calico policy which only allows UDP traffic to kube-dns,
and second guessed myself.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-app-policy
spec:
  namespaceSelector: has(projectcalico.org/name) && projectcalico.org/name not in {"kube-system", "calico-system", "calico-apiserver"}
  types:
  - Ingress
  - Egress
  egress:
  # allow all namespaces to communicate to DNS pods
  - action: Allow
    protocol: UDP
    destination:
      selector: 'k8s-app == "kube-dns"'
      ports:
      - 53

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked and the kube-dns does indeed accept TCP connections, so the Calico policy is wrong,
unless they've got some reason to block TCP queries.

$ kubectl get svc -n kube-system kube-dns
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   6m59s
$ kubectl run dig --attach --rm --restart=Never --image nixery.dev/dnsutils -- dig @kube-dns.kube-system.svc.cluster.local kubernetes.default.svc.cluster.local. A +tcp

; <<>> DiG 9.18.19 <<>> @kube-dns.kube-system.svc.cluster.local kubernetes.default.svc.cluster.local. A +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53329
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: bb4d0b9fc7279ae9 (echoed)
;; QUESTION SECTION:
;kubernetes.default.svc.cluster.local. IN A

;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 4 IN A    10.96.0.1

;; Query time: 10 msec
;; SERVER: 10.96.0.10#53(kube-dns.kube-system.svc.cluster.local) (TCP)
;; WHEN: Wed Nov 15 09:57:05 UTC 2023
;; MSG SIZE  rcvd: 129

pod "dig" deleted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the docs to include TCP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI tigera/docs#1096 (comment)

I think the discussion will be more productive if you move it to the main project repo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
content/docs/installation/best-practice.md Outdated Show resolved Hide resolved
Co-authored-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Richard Wall <wallrj@users.noreply.github.com>
wallrj added a commit to wallrj/docs that referenced this pull request Nov 15, 2023
In cert-manager/website#1344 (comment) we're discussing whether the default Calico deny policy should also allow TCP DNS queries.

Is there a reason to block them?
@hawksight
Copy link
Member

/lgtm

@jetstack-bot
Copy link
Contributor

@hawksight: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hawksight
Copy link
Member

it would be great if the documentation could also include examples for multiple common CNIs like Azure and EKS (in addition to Calico).

I'm dropping the Calico examples for now. There will only be an overview of network requirements and I will leave it to the reader to figure out the configuration for their chosen CNI.

I think this is a good decision for now allowing users to implement their own best practice NetworkPolicy within their current setups. The docs should help them to work out what is needed. I must admit I don't enjoy working with NetworkPolicy and there seems to be a couple of issues that are maybe outside the scope of just cert-manager.

  • default deny across cluster (does it include DNS?)
  • Policy to contact k8s api-server... but what if the api-server is outside of the cluster (GKE)

Perhaps the community has some good examples or comments on these that we ca use to improve later on.

@inteon
Copy link
Member

inteon commented Nov 15, 2023

/approve
/lgtm
Based on feedback from @wallrj and @hawksight

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2023
@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hawksight, inteon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jetstack-bot jetstack-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2023
@jetstack-bot jetstack-bot merged commit 6860b03 into cert-manager:master Nov 15, 2023
8 checks passed
@wallrj wallrj deleted the network-policy branch November 15, 2023 14:26
wallrj added a commit to wallrj/website that referenced this pull request Nov 15, 2023
Thanks for all your documentation reviews Peter, for example:
 * cert-manager#1344
 * cert-manager#1338
 * cert-manager#1331

We'd like you to be able to `lgtm` future PRs,
so we're adding you to the reviewers list.

Signed-off-by: Richard Wall <richard.wall@venafi.com>
ctauchen pushed a commit to tigera/docs that referenced this pull request Nov 21, 2023
In cert-manager/website#1344 (comment) we're discussing whether the default Calico deny policy should also allow TCP DNS queries.

Is there a reason to block them?
ctauchen pushed a commit to tigera/docs that referenced this pull request Dec 1, 2023
In cert-manager/website#1344 (comment) we're discussing whether the default Calico deny policy should also allow TCP DNS queries.

Is there a reason to block them?
SgtCoDFish pushed a commit to SgtCoDFish/cert-manager-website that referenced this pull request Jan 18, 2024
Thanks for all your documentation reviews Peter, for example:
 * cert-manager#1344
 * cert-manager#1338
 * cert-manager#1331

We'd like you to be able to `lgtm` future PRs,
so we're adding you to the reviewers list.

Signed-off-by: Richard Wall <richard.wall@venafi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants