Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding CAA record #1849

Merged
merged 1 commit into from
Sep 30, 2021
Merged

adding CAA record #1849

merged 1 commit into from
Sep 30, 2021

Conversation

jimangel
Copy link
Member

@jimangel jimangel commented Mar 31, 2021

Adding a CAA record for kubernetes.io to prevent non-explicit CAs from issuing certificates for k/website.

A couple things to look out for:

  • Are there any other CAs issuing certs for kubernetes.io?
  • If this works, it will probably need to be added to k8s.io zone as well
  • I'm not sure if the quotes need to be escaped
  • I'm not sure if the CAA type: will take a map of values or how it interfaces with Google DNS

Any other concerns?

/cc @celestehorgan

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/dns DNS records for k8s.io, kubernetes.io, k8s.dev, etc., code in dns/ size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 31, 2021
@ameukam
Copy link
Member

ameukam commented Mar 31, 2021

/assign @thockin @spiffxp

@celestehorgan
Copy link
Contributor

@jimangel – took a peep, lgtm. We'll need to test it once this goes live.

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we test this? Maybe add CAA for one specific low-risk name, verify it, and then expand?

@@ -18,6 +18,10 @@
preference: 10
- exchange: alt4.aspmx.l.google.com.
preference: 10
- type: CAA # Which CA's can issue certs for the domain
values:
- 0 issue \"pki.goog\"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do we get these string values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string came from a generated CAA record using https://sslmate.com/caa/, I used the option to check for currently used CA's via https://certificate.transparency.dev/ (logs). I know we're primarily concerned with LE (via netlify) but from that generation, it pulled in pki.goog. I think it's safe to assume we use(d) Google's CA for managed certs too.

Interestingly, k8s.io CA's also include amazon:
image

dns/zone-configs/kubernetes.io._0_base.yaml Outdated Show resolved Hide resolved
dns/zone-configs/kubernetes.io._0_base.yaml Outdated Show resolved Hide resolved
@celestehorgan
Copy link
Contributor

@thockin this needs to go live before we can test it on k8s.io I think? I took a peek at the Netlify docs and I'm not sure there's a way we can test for certificate authority without it going live.

@jimangel
Copy link
Member Author

jimangel commented Mar 31, 2021

this needs to go live before we can test it on k8s.io I think?

I can do a smoke test with a personal domain through netlify (a breaking test and a working test). It wouldn't account for all edge cases but it would validate k/website won't fall over. Any other low-risk domains owned by Kubernetes come to mind @thockin?

Also, given the potential impact of unknowns we could send out an email to kubernetes-dev announcing 24h before merging. It will raise awareness and give folks the opportunity for "whatabouts."

Edit: We can also test with a subdomain first if we feel that's safer 🤷

@jimangel
Copy link
Member Author

jimangel commented Apr 5, 2021

I have a domain mydns.dog. managed by Google DNS (as is kubernetes.io) and thought it would be a good test via Netlify using the domain broken.mydns.dog. To test, I'll create a breaking CAA record and then fix it.

First I created an octoDNS container using the following Dockerfile (this is before I found the repo's Dockerfile):

FROM python:3-alpine

RUN pip install --no-cache-dir \
  octodns==0.9.11 \
  google-cloud-dns==0.32.2

# docker build -t octodns:alpine . --no-cache

Next I created the following production.yaml config in the root directory based on the octoDNS readme and the octodns-config.yaml in this repo.

providers:
  config:
    class: octodns.provider.yaml.YamlProvider
    directory: /tmp/config/ # DIRECTORY I USE IN THE CONTAINER
    default_ttl: 3600
    enforce_order: True
  gcp:
    class: octodns.provider.googlecloud.GoogleCloudProvider
    project: out-of-pocket-cloudlab

zones:
  mydns.dog.:
    sources:
      - config
    targets:
      - gcp

I captured the current mydns.dog. zone as-is using the following command:

# export GCP service account creds
export GOOGLE_APPLICATION_CREDENTIALS=/Users/jimangel/Downloads/credentials.json

# dump the current zone
docker run \
  -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/credentials.json \
  -v $GOOGLE_APPLICATION_CREDENTIALS:/tmp/keys/credentials.json:ro \
  -v $(pwd):/tmp/ \
  octodns:alpine /bin/sh -c \
  "octodns-dump --config-file=/tmp/production.yaml --output-dir=/tmp/config/ mydns.dog. gcp"

Then I added a line for a CAA record for Google PKI (knowing that I want Let's Encrypt, and it should break):

# note that I am applying this to the whole domain, not a single subdomain.
'':
  - type: CAA # The CAA record set for a domain also applies to all subdomains. If a sub-domain has its own CAA record set, it takes precedence.
    values:
    - flags: 0
      tag: issue
      value: pki.goog

I applied the above config with the following command:

docker run \
  -e GOOGLE_APPLICATION_CREDENTIALS=/keys/credentials.json \
  -v $GOOGLE_APPLICATION_CREDENTIALS:/keys/credentials.json:ro \
  -v $(pwd):/tmp/ \
  octodns:alpine /bin/sh -c \
  "octodns-sync --config-file=/tmp/production.yaml --doit"

Next I created a bootstrap basic site in Netlify using a custom domain broken.mydns.dog.

image

Given the output, I also needed to create a CNAME, so why not use octoDNS 🤷:

broken:
  - type: CNAME
    value: gallant-heyrovsky-73c320.netlify.app.

I applied with the same --doit command as above and worked with Netlify support to get the public Let's Encrypt auth URL (https://acme-v02.api.letsencrypt.org/acme/authz-v3/12087973505):

...
        "detail": "CAA record for broken.mydns.dog prevents issuance",
...

image

I updated the production.yaml and included Let's Encrypt and applied with the same --doit command:

'':
  - type: CAA # The CAA record set for a domain also applies to all sub-domains. If a sub-domain has its own CAA record set, it takes precedence.
    values:
    - flags: 0
      tag: issue
      value: pki.goog
    - flags: 0
      tag: issue
      value: letsencrypt.org

Once again working with Netlify support to get the public Let's Encrypt auth URL (https://acme-v02.api.letsencrypt.org/acme/authz-v3/12088962179):

...
      "status": "valid",
...

I can also validate by navigating to the site:

image

Lastly, we can test this outside of Netlify using certbot. I won't include all the details but I did the same test using the following commands:

sudo apt install certbot -y
export DOMAIN="broken.mydns.dog"
sudo certbot certonly --manual --agree-tos --manual-public-ip-logging-ok --register-unsafely-without-email --preferred-challenges=dns -d ${DOMAIN}
# add DNS TXT records and if needed test with `dig _acme-challenge.${DOMAIN} TXT`
# view the output

At this point, I feel comfortable with this merging from a technical standpoint. It's still worth raising visibility on this change in case there's a strange edge case out there issuing certs.

Once ready, I would suggest merging this PR and afterwards open a PR for k8s.io including the Amazon CA that we discovered.

Note: I learned while I was doing this that octoDNS (and Google DNS) subdomains do not support CNAME + CAA record. If it has a CNAME, it can't have any other records.

@thockin
Copy link
Member

thockin commented Apr 5, 2021

SGTM. Want to send email?

@jimangel
Copy link
Member Author

jimangel commented Apr 5, 2021

SGTM. Want to send email?

Sure! With 1.21 getting cut on Thursday, I think we should hold off until Friday / Monday afterwards. Unless anyone feels strongly about getting it in ASAP.

@spiffxp
Copy link
Member

spiffxp commented May 19, 2021

/priority important-soon
/milestone v1.22
/cc @dims @ameukam
Is there an open issue to track this @jimangel ? I want to make sure this is on our radar but we only track issues in https://github.com/orgs/kubernetes/projects/6

@k8s-ci-robot k8s-ci-robot requested review from ameukam and dims May 19, 2021 21:30
@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label May 19, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone May 19, 2021
@jimangel
Copy link
Member Author

Sorry @spiffxp, this fell off my radar. I've opened up the issue here: #2216

Should we send anything out or merge and iterate?

It seems easy enough to revert too...

@ameukam
Copy link
Member

ameukam commented Jun 15, 2021

/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 15, 2021
@sftim
Copy link
Contributor

sftim commented Jun 15, 2021

Seems OK. I'd publicize this to infra folks and maybe others too before merging.

@spiffxp
Copy link
Member

spiffxp commented Aug 4, 2021

/milestone v1.23

@k8s-ci-robot k8s-ci-robot removed this from the v1.22 milestone Aug 4, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 4, 2021
@jimangel
Copy link
Member Author

👋 Hey all, what needs to be done to merge this? Happy to drive it forward, unclear about the next steps. Thanks!

@spiffxp
Copy link
Member

spiffxp commented Sep 3, 2021

Should we send anything out or merge and iterate?

I'm vaguely unclear on the consequences of merging this right now, let's try to get this merged next week when we have more folks around to handle any repercussions

@jimangel
Copy link
Member Author

jimangel commented Sep 5, 2021

I'm vaguely unclear on the consequences of merging this right now

After this PR is merged, only Google's CA and LetsEncrypt's CA can issue TLS/SSL certificates for only the domain "*.kubernetes.io". Any other CAs (DigitCert, Comodo, Entrust, GlobalSign, GoDaddy, etc) would fail. As far as we can tell, based on public record, Google's CA and LetsEncrypt's CA are the only two CAs used by kubernetes.io.

As of 2017, Certificate Authority Authorization (CAA) record checking is mandatory for all Certificate Authorities.

In a "worst case" scenario, someone using an alternative CA couldn't issue a new certificate for *.kubernetes.io and would have to add their CA to the accepted list (modeled in the current PR here). The certificate rejection response would indicate it's a CAA issue leading them hopefully back to here / Slack. I also hope that renewing critical PKI infrastructure isn't done "just in time" and there's a renewal before expiry to remediate any issues.

let's try to get this merged next week when we have more folks around to handle any repercussions.

Works for me! After we get this one in, I'll open one for k8s.io (which has one additional CA for AWS).

@sftim
Copy link
Contributor

sftim commented Sep 7, 2021

I suggest also using https://github.com/kubernetes-sigs/contributor-tweets/ to publicize the announcement within the contributor community. Would that be useful?

@jimangel
Copy link
Member Author

jimangel commented Sep 29, 2021

Capturing some findings from our meeting today (SIG-k8s-infra | Sept 29th 2021):

  • kubernetes.io (this PR)
    • LetsEncrypt was used by infra / cert manager before moving to Google managed certs.
    • LetsEncrypt is still used today by Netlify (k/website)
  • k8s.io (a future PR)
    • LetsEncrypt is used by some services / subdomains like cs.k8s.io
    • Some infra stuff uses Google managed certs (assuming)
    • There's AWS issued certs for kops e2e testing

TL;DR: The known knowns are accounted for and should not be removed from the list now or in the future without validation. We can still send out an FYI before / after merging to the k-dev mailing list if anything comes up. A revert is just a PR away if anything goes really wacky :)

@spiffxp
Copy link
Member

spiffxp commented Sep 29, 2021

I got a little shy about slamming approve right away since I'm about to go AFK, I'll do this first thing tomorrow

Copy link
Member

@spiffxp spiffxp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jimangel, spiffxp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 30, 2021
@spiffxp
Copy link
Member

spiffxp commented Sep 30, 2021

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 30, 2021
@k8s-ci-robot k8s-ci-robot merged commit bc7c67b into kubernetes:main Sep 30, 2021
@spiffxp
Copy link
Member

spiffxp commented Sep 30, 2021

Notice sent to kubernetes-dev@: https://groups.google.com/g/kubernetes-dev/c/JCaVGcV_PH0

@spiffxp
Copy link
Member

spiffxp commented Sep 30, 2021

I suggest also using https://github.com/kubernetes-sigs/contributor-tweets/ to publicize the announcement within the contributor community. Would that be useful?

Personally I'd rather wait to raise signal beyond the audience on kubernetes-dev, but I'm willing to be wrong on this if someone would like to take the lead here.

@jimangel jimangel deleted the caa-dns-le branch December 13, 2021 06:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/dns DNS records for k8s.io, kubernetes.io, k8s.dev, etc., code in dns/ cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants