-
-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Promote and enhance vulnerability schema #38
Comments
Thanks for the feedback @garethr, it is very much appreciated! |
Thanks for the critique @garethr. Very much appreciated. CycloneDX extensions are designed to be iterated on, much faster than traditional standards process. So conceivably, we could use your critique when drafting v1.1 or 2.0 of the vulnerability extension. Some of the above bullets above are intentional, others are shortcomings, so there's a lot of possibility for improvement.
There are some sources of vulnerability intelligence that have very minimal data. Friends of PHP being a prime example of this. For example: https://github.com/FriendsOfPHP/security-advisories/blob/master/adodb/adodb-php/2018-03-06.yaml. Requiring an ID or description would prevent this vulnerability from being represented in a BOM.
True. CVSS is less than ideal but also (mostly) universally used. Does Snyk use something different or is there a different spec you'd like to see here?
That is correct and by design. A one-to-one relation between vuln and component means that the vulnerability can contain scoring specific to the component that is affected. For example, if a vuln has a CVE score defined, a human auditor can then apply an OWASP (or some other) score that provides additional risk-based analysis for a specific occurrence of a vulnerability. If multiple components could be affected by the same vulnerability, then all implementations of CycloneDX would need to do field-level comparison to ensure per-instance decisions were not made prior to using a reference. This is a lot of additional work implementations would need to do. However, with that said, I think there's room for improvement here and welcome suggesitons.
That is correct and by design. A vulnerability is currently defined as having a single source. So the source for CVE-2009-5155 would be the NVD, whereas the source for SNYK-DEBIAN9-GLIBC-338103 would be Snyk. What would be interesting is to expand this definition so that a vulnerability could have multiple ID (and thus multiple sources). That would be a really great enhancement IMO.
I agree. This needs improvement.
No, VulnDB and Sonatype are classic examples of this. Each service will provide the NVD score but in many cases will also provide their own CVSS score. What could be added is identity on who actually performed the score and then limit based on that. So prohibiting multiple scores from the NVD for example.
Yes its purposeful. XML has namespaces and CycloneDX allows (in many cases) arbitrary elements anywhere in the spec as long as they're in a different namespace. The CycloneDX JSON schema itself allows arbitrary properties anywhere as well. The proposal for the vuln extension in JSON is to be consistent with everything else. We find that many organizations use this capability to add their own flavor to CycloneDX without having to violate the spec.
I wish there were rules. The NVD for example is all strings. NPM Advisories can either be strings or markdown and they don't tell you. It's up to the consumer of their APIs to identify it prior to use. I think there's room for improvement here and welcome suggestions.
Currently strings as you pointed out, but I think the spec needs to be more prescriptive in what gets put there. So yes, I think extending the advisories to include perhaps a title, description, and url instead of a single description field would be ideal. Are there any blockers you see that would prevent Snyk from implementing CycloneDX with or without the vulnerability extension? Are there any improvements you'd like to see made to the vulnerability extension prior to adoption? Again, thanks for the feedback. Looking forward to iterating on this with you. |
Some flexibility and generic approach here might be good for supporting different scoring methods. But that could make automation harder for consumers. Maybe a decent approach is to have vulnerability scoring extensions? Then we could get really specific with different scoring methods, especially industry/vendor specific ones, without adding complexity to the base spec.
I think, in the context of an SBOM, a vulnerability needs to be thought of as component vulnerability information.
I sort of disagree on this. I think component vulnerability information here should have a clear identifiable single source. But support multiple reference IDs for cross referencing.
And perhaps just as problematic, it makes automation hard and won't scale. Perhaps this should be deprecated and replaced with vulnerability scoring extensions?
I think if we are going to include description/content it should use the attachedTextType which allows specifying a content type. But I think advisories should probably just be an externalReference for efficiency and consistency. |
Thanks for the responses.
My concern here was mainly about the potential for duplication and extra parsing, and larger SBOM file sizes.
My concern here is that as a producer you either need to choose to include only one ID (thereby limiting the information available to the consumer) or include several entries. This would further exacerbate the issue above with regards duplication information. Worse, any counts of instances of vulnerabilities would be incorrect. Having a canonical ID makes sense, but I think having a field for additional Identifiers would be useful.
Agreed. I think this solves for both 5 & 6. Method is parallel to who provides the score. eg. "ratings": [
{
"score": {
"base": 9,
"impact": 5.9,
"exploitability": 3.0
},
"severity": "Medium",
"source": "NVD",
"method": "CVSSv3",
"vector": "CVSS:3.0/AV:L/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N"
},
{
"severity": "None",
"source": "Debian Security Team",
"method": "Other"
},
{
"severity": "Low",
"Source": "Snyk",
"method": "Other"
}
],
+1 |
I think we also need to create some guidance on how to represent vulnerability information, and in particular recommendations. A component might have a vulnerability that is resolved in the next version of that component. But at the assembled software top level it is resolved by upgrading to the next version of that software. Or perhaps that vulnerability is only exploitable under certain conditions, or not at all. |
A suggestion. Maybe we should differentiate between raw vulnerability information and vulnerability assessment information? They are, in my opinion, quite different things. One has no context whatsoever about the software in question. Where as the other can have much more value when triaging an issue and remediation. |
@coderpatros I think that distinction is useful here. The SARIF rules/results is a similar idea:
It allows you to answer both:
You can also reason about things like:
At the very small end it's a bit more complex, but I think in real-world cases it's invariably going to be a better fit. |
I'll add, that IMHO, a real problem that might arise from a real world use of this, by a reader of the report – regards how to select which version to upgrade to, affecting only part of the vulnerabilities. So enforcing a schema for version specification in rules might also work well. Another idea I feel could be relevant regards separating raw vulnerability information and vulnerability assessment information. This is something we currently implement in our product, and find very informative. We currently focus on the following questions:
|
I really want to avoid concluding whether or not something is/is not exploitable in the spec. CycloneDX strives to be "facts first" so any solution that aligns with that ideal is great. @garethr do you have ideas for proposals to support the good ideas you've pointed out in #38 (comment)? |
@talz CycloneDX supports vulnerability remediation, including backports, patches, commits, etc. This is part of a components pedigree. Refer to https://cyclonedx.org/use-cases/#pedigree
This is an area with a high probability of false negatives and a topic of the NTIA VEX subgroup. False negatives are common when there's a lack of rules that can accurately describe what's possible, and is especially problematic in non-standard (unknown libraries, frameworks, and design patterns) and ployglot environments. This is extremely difficult to communicate accurately.
Under normal operation, this may be able to be determined depending on the capabilities of the dataflow engine, rules, monoglot vs polyglot, but ignores pivot points within a vulnerable application and entry points outside the scope of what's being analyzed. This is extremely difficult to communicate accurately. Both of the above questions also are limited in scope as they do not take into consideration the full stack. SBOMs are typically provided with distributables such as end-user software, IoT devices, medical devices, automobile assemblies, and consumer electronics, all of which have (in many cases) multiple software stacks and each stack comprising of hardware, firmware, possibly an OS layer, applications and libraries. The reachability of vulnerable code is most accurate through a combination of automated and manual analysis that will eventually end with a weighted analysis decision, along with all the evidence (and confidence of each piece of evidence) which led to the analysis decision. IMO, I think this is valuable but its outside of the scope of CycloneDX Core and is currently not planned for the Vulnerability extension. CycloneDX supports remediation (natively via component pedigree) and disclosure (via the vulnerability extension). There are currently no plans to support the opinion of auditors and/or tools to determine the exploitability of vulnerabilities or the effectiveness of patches or mitigating controls. The NTIA VEX subgroup asked the Core team to present during one of their recent meetings. Attached is the presentation from that. I think it may highlight what the project does with regard to vulnerability information and what it does not. That said, any organization or industry can create schema extensions that are applicable to their use case. This allow others to use the CycloneDX Core specification and built on top of it to meet their needs. |
@talz Can you elaborate on this?
I think I understand what you mean, but if you could elaborate, that would be great. |
Hey, in broad strokes - I'm talking about a enforcing a versioning schema, that'll allow a user to have control on comparing versions of different BOM entries. That is, breaking down version strings to their respective components: I think one can also break version part itself to something more canonical: |
Versioning is a nightmare at scale. The number of different versioning schemes is huge, and lots of them aren't distinguishable from just reading the version string. I think recommending, or providing guidance on how to use purl is worthwhile https://github.com/package-url/purl-spec (also from @stevespringett). Which solves your formatting problem you describe. But purl right punts on specifying the version in any rigid way
That makes version comparison a downstream problem unfortunately, or you limit the utility of the spec. It's not enough to say "CycloneDX versions should be semver" either. Because you can have versions that look like semver that aren't semver, so you can validate the output in a meaningful way that would mean a consumer could assume semver comparisons. |
@stevespringett when I get a moment I'm going to try and take the conversations in this thread and post a super-early proposal that we can kick around. Timing might not be for a week or two, in case someone else starts off. |
Finally found a moment. Draft PR in #44. Still WIP but wanted to start gathering feedback. |
I've created a gist to test some of these ideas out as well https://gist.github.com/garethr/b069d9bf84ca80635cd506e74c8e2247 The main observation at the moment is the overlap between the concepts of sources and ratings, when both become arrays. I think merging these might be useful. ie. in many (but not all) cases sources of vulnerabilities (NVD, Debian, security vendors) are also likely sources of ratings. Thoughts? So from something like: "sources": [
{
"name": "NVD",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2010-0928"
}
],
"ratings": [
{
"score": {
"base": 9.8,
"impact": 5.9,
"exploitability": 3.0
},
"severity": "Medium",
"source": "NVD",
"method": "CVSSv3",
"vector": "CVSS:3.0/AV:L/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N"
}
] To something like: "sources": [
{
"score": {
"base": 9.8,
"impact": 5.9,
"exploitability": 3.0
},
"severity": "Medium",
"source": "NVD",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2010-0928",
"method": "CVSSv3",
"vector": "CVSS:3.0/AV:L/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N"
}
] |
I like how you've tied the score to the source in that second example. |
A few additional comments:
|
@garethr An explanation of why the scoring was done the way it was. Red Hat does this occasionally in their security advisories. For example https://access.redhat.com/security/cve/CVE-2020-15862, you'll see a box with the title "Why did Red Hat score this CVE differently?" Right above that they highlight the differences in the access vector between the default scoring and their scoring, but that explanation helps provide additional information and context. That's all I'm looking for. |
The existing vulnerability extension has undergone real-world tests and has proven to provide value for audit use cases, machine-to-machine transport of BOM+vuln info, as well as output from various SCA and container tools used in production environments. Based on what has been learned over the past two years, I propose to bring the vulnerability extension into the core spec along with several improvements that will provide richer data. By doing so, vulnerability info will be available across all serialization formats. |
@garethr I'm going to use the standalone JSON schema extension you created earlier as a starting point and start moving it into the core spec along with some changes. From there, I plan to create a draft PR that we can discuss and once we get buy-in, we can port it to the XML and protobuf schemas. At that point, it would go through our documented standardization process just like all the other tickets we're working on for v1.4. |
Closing - as this is being released in CycloneDX v1.4 next week. |
Thanks for creating the JSON Schema variant of the vulnerability extension. This prompted me to take a run at describing some real world data using the format. Here's the full example https://gist.github.com/garethr/7dcc9d6ef4e7cc497e018dc279c00123 (bias warning, I work for Snyk, so I used Snyk in the comparison. I don't think any of the following is Snyk specific though, more about the general complexity of the domain and general usage of the output.)
Here are my observations from that experiment. Happy to discuss these here in one long thread or break out elsewhere if easier. Posting partly to discover the best way to discuss and appetite for addressing issues agreed after discussion.
id
,description
?)ie. Should advisories be defined as:
Hopefully the above is useful feedback, rather than being seen as too much criticism. Some might be me being new to the spec and missing things.
I'd be happy to help, and to discuss further if useful.
The text was updated successfully, but these errors were encountered: