Suggestion/Recommendation for CVE Announcement enhancements #170

Closed
zparnold opened this issue May 22, 2019 · 17 comments
Labels
inactive: No activity on issue/PR
suggestion: New suggestion for the CNCF sig-security group that don't fall into an existing category

Comments

@zparnold
Contributor

Hello there! I help lead the sig-docs-security working group in the Kubernetes project. We focus on surfacing and formatting security-related info as well as helping to amplify the CVE management process. During our meetings, a proposal arose to handle the CVE process in a more proactive way for our cluster operators. I asked about this during KubeCon EU's session on sig-security in Barcelona and was asked to open an issue here. The proposal is as follows. We would like to provide a very simple API for cluster operators to register to receive alerts based on what version of a CNCF project they are using:

POST /v1/register

{
  "project": "kubernetes",
  "version": "1.10.13",
  "alert_group": [
    {
      "method": "phone",
      "number":"+1123567890"
    },
    {
      "method": "email",
      "number":"me@email.com"
    },
    {
      "method": "http",
      "number":"https://someurl.com/url"
    }
  ]
}

This API would then return (if validation succeeds) a unique token (one with a low collision rate and a large search space):

{
  "token":"kajshfdasjhdviurvubilrsbvaiurhvliauwhrvliuahrvliaurhiurhv"
}

This token can then be used to deregister the user who signed up:

POST /v1/deregister

{
  "token":"kajshfdasjhdviurvubilrsbvaiurhvliauwhrvliuahrvliaurhiurhv"
}

If the operation succeeds, the user is deregistered from updates.
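
A minimal sketch of the register/deregister flow described above, assuming an in-memory dictionary in place of real storage and Python's `secrets` module for token generation. The `Registry` class, its method names, and the validation rules are hypothetical; the proposal itself only specifies the request and response bodies.

```python
import secrets

# Delivery methods taken from the example request body in the proposal.
ALLOWED_METHODS = {"phone", "email", "http"}


class Registry:
    """In-memory stand-in for the proposed /v1/register and /v1/deregister endpoints."""

    def __init__(self):
        self._subscriptions = {}  # token -> validated registration

    def register(self, body):
        """Validate a registration body and return {"token": ...}."""
        project = body.get("project")
        version = body.get("version")
        alert_group = body.get("alert_group")
        if not project or not version or not alert_group:
            raise ValueError("project, version and alert_group are required")
        for target in alert_group:
            if target.get("method") not in ALLOWED_METHODS or not target.get("number"):
                raise ValueError("each alert target needs a known method and a number")
        # 48 random bytes (384 bits): collisions are negligible and guessing is infeasible.
        token = secrets.token_urlsafe(48)
        self._subscriptions[token] = {
            "project": project,
            "version": version,
            "alert_group": list(alert_group),
        }
        return {"token": token}

    def deregister(self, body):
        """Remove the subscription for a token; return True if one existed."""
        return self._subscriptions.pop(body.get("token"), None) is not None
```

Because the token is the only credential for deregistering, it would need to be treated like a password by whoever stores it.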

Upon a verified CVE being published, this system could be used to trigger an alert to only those affected by the specific project/version combination. Operators would then receive only alerts that apply to them, which keeps the signal-to-noise ratio as high as possible for cluster operation and CVE management.
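
The selection step could look roughly like the sketch below. It assumes an advisory lists affected versions as half-open `(introduced, fixed)` ranges and that versions are plain dotted integers; both are assumptions, and all function names are hypothetical.

```python
def parse_version(text):
    """Turn "1.10.13" into (1, 10, 13) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version, affected_ranges):
    """True if version falls inside any half-open [introduced, fixed) range."""
    parsed = parse_version(version)
    return any(
        parse_version(introduced) <= parsed < parse_version(fixed)
        for introduced, fixed in affected_ranges
    )


def affected_subscriptions(subscriptions, project, affected_ranges):
    """Keep only subscriptions registered for an affected project/version."""
    return [
        sub
        for sub in subscriptions
        if sub["project"] == project and is_affected(sub["version"], affected_ranges)
    ]
```

Comparing tuples rather than strings matters here: as strings, "1.10.13" sorts before "1.9.0", which would alert the wrong operators.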

Pros:

  • It would be easy to construct
  • It helps to give our downstream consumers a sense of ease because we are there to get them the information they need just in time to make infrastructure decisions
  • This improves upon RSS and allows us to push information to operators much quicker than before
  • Since the IPv4 address space can now be probed in under 1 hour on commodity hardware (https://zmap.io/), this gets information to operators as quickly as is feasible

Cons:

  • A human will have to maintain this and be on call 24/7 for CVE announcements to operate this system
  • "Who watches the watchmen" syndrome can occur and alerts could be dropped
  • This system itself could be subject to malicious use since attackers could have the same information that cluster operators do in the same time period (security through obscurity)

Thoughts?

@sftim

sftim commented May 22, 2019

I'd be delighted to learn that there's already a good de facto standard for doing this, and that Kubernetes plans to follow that.

How about an ATOM / RSS feed where you can register a webhook for notifications, with a proof-of-work / CAPTCHA required to validate the registration?

Then, end users can strap anything they like onto the end of the webhook, and they can poll in case the webhook didn't arrive.

(extra credit: send the webhook in CloudEvent format, maybe with a digital signature).
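
A rough sketch of that "extra credit" idea. It builds a payload with the required CloudEvents attributes (`specversion`, `id`, `source`, `type`) and authenticates it with an HMAC over a shared secret, which is a simpler stand-in for a true digital signature. The event type string and the function names are hypothetical.

```python
import hashlib
import hmac
import json
import uuid


def build_event(project, cve_id, feed_url):
    """Build a CloudEvents-style JSON payload for a published advisory."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": feed_url,
        "type": "io.cncf.security.advisory.published",  # hypothetical type name
        "datacontenttype": "application/json",
        "data": {"project": project, "cve": cve_id},
    }


def sign(event, secret):
    """Serialize the event deterministically and return (body, hex signature)."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature


def verify(body, signature, secret):
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The sender would deliver `body` as the webhook request body and the signature in a header of its choosing; a receiver that distrusts the webhook can still poll the feed named in `source`.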

@jimangel

I really like this idea. I would like to see more outlined for how the actual app / FaaS would be managed long term in an open way.

As far as the data being collected / token exchange, where would that data be stored? How do we ensure privacy?

Lastly, it would be great to have some way to easily update your Kubernetes version (in the event you patch / major upgrade). I could also see it being beneficial to receive an event "Hey, you haven't updated your version in X months/years, do you want to update or delete this alert?"

@sftim

sftim commented May 22, 2019

@jimangel it sounds like you think it'd be useful for the same ATOM / RSS feed to publish all releases, whether relevant to a CVE ID or not. (?)

@zparnold
Contributor Author

zparnold commented May 23, 2019

I could also see it being beneficial to receive an event "Hey, you haven't updated your version in X months/years, do you want to update or delete this alert?"

I agree, but it might be slightly out of scope, because then it kinda becomes a marketing tool.

As far as how it will be managed, the actual components I'm envisioning are twofold.

  • The API is just a simple API deployed to any cloud provider (I would choose AWS because I have the most experience there), and the info would be stored in a GDPR-compliant (even though it's anonymous) DynamoDB table encrypted ten ways till Sunday (meaning that the table is encrypted at rest, the data itself is encrypted via AWS KMS, and all comms are via TLS). That information is then used to subscribe users to an AWS SNS topic that is specific to the version of the service they care about.
  • A worker function crawls these RSS feeds every 60 or so seconds (or we work in conjunction with the core security teams of these projects). When something appears that is deemed actionable (how that determination is made can be decided later), one or more people are paged via OpsGenie (or another provider). They then make the final determination of whether a group needs to receive a message. If so, they manually (from the AWS console) issue an alert to the appropriate topic.
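
The crawling half of the worker might be sketched as below, using only the standard library. It assumes an RSS 2.0 feed whose items carry a `guid` or `link`, and it uses a crude "CVE-" keyword match as a placeholder for the "actionable" check that is left open above. Fetching, scheduling, paging, and the SNS publish step are all omitted, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def new_items(rss_xml, seen_ids):
    """Return feed items not seen before and record their ids in seen_ids."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer the guid; fall back to the link when a feed omits it.
        item_id = item.findtext("guid") or item.findtext("link")
        if item_id and item_id not in seen_ids:
            seen_ids.add(item_id)
            fresh.append({"id": item_id, "title": item.findtext("title", default="")})
    return fresh


def needs_review(items, keyword="CVE-"):
    """Placeholder filter: items a human on call should look at."""
    return [item for item in items if keyword in item["title"]]
```

Each polling cycle would call `new_items` with the latest feed body and a persisted `seen_ids` set, then page someone only if `needs_review` returns anything.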

Does that make sense?

@zparnold
Contributor Author

@sftim The reasons I'm not advocating for an RSS approach are:

  1. RSS is fundamentally a pull model, which offers cluster operators or OSS project users no additional benefit over any previous system.
  2. The Kubernetes CVE team pushes out releases to the Kubernetes Announce mailing list, which also carries non-CVE-related info. That only adds to the problem of having to filter out noise to distill the signal (CVEs).

My thought process is that we could open a new mailing list for Kubernetes if we needed to go the RSS/ATOM route, but I was hoping not to.

@sftim

sftim commented May 29, 2019

What I'm suggesting is RSS-plus: a feed combined with a promise that you'll get a webhook when it changes.

The recipient of the webhook can verify the information by fetching the RSS / ATOM / whatever, if they want to.

  • Don't fully trust the webhook to arrive? You can still poll the feed.
  • Kubernetes project is worried about abuse? A proof-of-work / proof-of-pulse check before your webhook subscription gets approved.
  • Don't like CloudEvent payloads? The subscriber can strap something onto the receiving end of the webhook to transmute it into what they do want: SMS message, buzzer, Prometheus scrape, carrier pigeon release, etc
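
For the proof-of-work check, a hashcash-style scheme is one option; it is an assumption here, since the thread names no specific scheme. The server hands out a random challenge, the subscriber searches for a nonce whose SHA-256 hash has enough leading zero bits, and the server verifies with a single hash. All function names are hypothetical.

```python
import hashlib
import secrets


def new_challenge():
    """Server side: a random challenge issued per subscription request."""
    return secrets.token_hex(16)


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check(challenge, nonce, difficulty):
    """Server side: one hash is enough to verify a submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Subscriber side: brute-force search, about 2**difficulty hashes on average."""
    nonce = 0
    while not check(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

The asymmetry is the point: approving a webhook costs the server one hash, while mass registration costs an abuser real CPU time that scales with `difficulty`.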

@sftim

sftim commented May 29, 2019

Toy implementation: run https://github.com/skx/rss2hook on CNCF infrastructure with appropriate configuration. Put the configuration in a git repo, take pull requests, link that to the app with a ConfigMap.

@zparnold
Contributor Author

zparnold commented May 29, 2019

@sftim This makes more sense. Thanks for this, by the way! I suppose I don't understand CloudEvents well. Is it possible for us to manage the effort around alerting in some common forms (PagerDuty, OpsGenie, SMS, phone call, email)?

I guess I'm aiming to have the CNCF/us be responsible for alerting people to the threat in a few commonly supported formats. So I love that they can still poll the RSS endpoint and I'm on board with that, but I don't understand the next step you're talking about (PoW or CE payloads).

@sftim

sftim commented May 29, 2019

My concern: does CNCF want to promise alerting by SMS & phone call for cluster operators worldwide? The foundation aims to be inclusive; offering a service only in some countries might look like the opposite. Offering the service globally could be inclusive but costly - I've seen this with international phone calls as part of multifactor authentication.

Offering a webhook and nothing else is a lowest common denominator: it's safe to assume that cluster operators have internet and fair to assume that they have something that can run a web server.

PS. The notification payload doesn't have to be a CloudEvent; any JSON format would be fine. CloudEvents does seem like a nice standard, though.

@zparnold
Contributor Author

zparnold commented May 29, 2019

@sftim It's sadly not necessarily safe to assume all clusters have access to the internet (I know of at least one really big cluster that is completely airgapped). I assume the cluster operator would have access to the internet, but then we're adding additional infrastructure for them to run.

As for CloudEvents I'm all for standards, so sure!

@lumjjb lumjjb added the suggestion New suggestion for the CNCF sig-security group that don't fall into an existing category label May 30, 2019
@lizrice
Contributor

lizrice commented Jun 5, 2019

I like the broad idea of this. cc @caniszczyk

@caniszczyk
Contributor

FYI GitHub now has a security advisory feature in beta you may want to take advantage of, happy to enable it for any CNCF project if you don't have it: https://help.github.com/en/articles/creating-a-maintainer-security-advisory

@stale

stale bot commented Mar 17, 2020

This issue has been automatically marked as inactive because it has not had recent activity.

@stale stale bot added the inactive No activity on issue/PR label Mar 17, 2020
@lumjjb
Collaborator

lumjjb commented Jul 14, 2021

@PushkarJ do you think that this is of relevance still today? Any comments on this from k8s security perspective? If not we will probably close due to scope.

@stale stale bot removed the inactive No activity on issue/PR label Jul 14, 2021
@PushkarJ
Collaborator

@lumjjb Agree with you that the scope is too wide at the moment. We are experimenting with some ideas on triage, vulnerability resolution and transparency in Kubernetes sig-security. Perhaps once we have a working model and process, we could do a presentation about our approach in a CNCF TAG-Security meeting and then explore how this can be adopted across all the CNCF projects (maybe as graduation criteria) ;-)

So, with that said, we can close this and when the time is right, I will open a new issue and link it to this one at that time for completeness. Hope that works for all!!

@stale

stale bot commented Sep 14, 2021

This issue has been automatically marked as inactive because it has not had recent activity.

@lumjjb
Collaborator

lumjjb commented Sep 15, 2021

Closing this issue for now; it is linked to the tracking k8s sig-security issue, and we will re-open it when ready to present to the TAG.

@lumjjb lumjjb closed this as completed Sep 15, 2021
7 participants