Change Proposal: Expand Options for Data License in SPDX Documents #8
Comments
In support of this change proposal, I picked up feedback at the LF Open Source Summit NA conference that having this as a required field has negatively impacted adoption for security use cases. I would also like to suggest that we make this property optional - that would remove all objections. From my discussions, even having a required license field incurs extra overhead for some of the SBOM producers. |
Are there not some Fedora/Cygwin?/RedHat?/IBM? concerns with licensing some things under CC0 raised by @richardfontana or is this considered allowed content under their rules? |
@Pizza-Ria Thanks for re-opening this discussion, and for the detailed issue statement and proposed solution. Speaking for myself (and not for the SPDX legal team generally), I'm in favor of implementing this change proposal, with a few comments / caveats. CC0-1.0 is 100% sufficient for my own SPDX document-making needs, so I don't personally have a need for a change here, but I understand the desire for a broader range of options here. In particular, I share your concern about the potential confusion arising from requiring the SPDX Document to be labeled as I think my bigger concern, though, is generally with mandating that an SPDX Document MUST be licensed under a particular license, whether CC0-1.0 or something else. I am not aware of any other data format specification which mandates a particular license for data expressed using that specification. (There may be some! I just don't know any.) I guess my concern with mandating CC0-1.0 is really about, between the All that said, I do think that the community (both in the specification itself, and in ancillary materials) should continue to recommend that users SHOULD use CC0-1.0. I continue to think that, even for the newer use cases, the greatest utility for SBOMs and for software transparency will come from an absence of restrictions on their usage by recipients. I'm a bit skeptical of how useful an SBOM will be if a recipient / consumer has to figure out "OK, I can do X but not Y with it". And I know that a lot of bog-standard NDAs have terms such as "thou shalt not copy any Confidential Information" or other such terms which will be kind of nonsensical for SBOMs. So I think it's absolutely worth the SPDX community continuing to recommend that SBOM producers SHOULD use CC0-1.0, but I'm +1 on removing it as a mandatory obligation. As far as what to change to, I tend to agree with the proposal that any SPDX License Expression should be permitted, including NONE and NOASSERTION. 
I also agree with @goneall's request that |
Just going to get the point in spdx/spdx-spec#850 (comment) here too.
|
The product list data in an SBoM may not be copyrightable in the US, but it may be elsewhere in the world, as are the published documents in which it appears and each product's licence expressions by the packagers, as this is typically not data included in the project source, although a licence instance may be (e.g. GPL2), or it may just be named (e.g. as GPL2+) or linked in the project docs, to a mailing list post, or quoted from a personal (e-)mail. |
The linked Fedora issue is for code. SBOM is clearly not code but "content". While the US is but one jurisdiction, it does show that there need to be caveats spelt out clearly to the user if one wants to use a "copyright license" such as (well, most of the Annex A) GFDL. These licenses revolve around having some right to license out; there's no copyright without reaching a threshold of originality, and DB rights are not widespread. And an SBOM is exactly the type of thing that is likely to fail this bar: it's most commonly spit out by a program and never edited. |
The Fedora policy is not applicable to this issue for two reasons: (1) an SBOM would be treated as "content" rather than "code" or "documentation" and CC0 is permitted for "content"; (2) there is pretty much zero likelihood that Fedora would ship an SPDX Document as far as I can foresee. Fedora's interest in SPDX is purely around the expression language and the license/exception list as a basis for license approval classifications and package license metadata. |
As @Pizza-Ria noted above, it would help if people commenting on this proposal familiarize themselves with the background on this field which is discussed in https://wiki.spdx.org/images/SPDX-TR-2014-1.v1.1.pdf and https://wiki.spdx.org/view/Legal_Team/Decisions/SPDX_Metadata_License:_Preamble_and_CC0_1.0_Universal When this issue came up previously, I wondered about simply removing this field altogether... not sure how I feel about that or if that would make a bigger mess. In any case, having some flexibility, as @Pizza-Ria suggests, is probably the best option. I like @goneall's suggestion of also making the field optional and agree with @swinslow's idea of still encouraging CC0-1.0 |
TR-2014-1.v1.1 also does not actually consider CC0 incompatible with confidentiality. The only downside given was implying there could be any copyright on this data in the first place. The final decision text avoids that issue by saying "CC0 if copyrightable at all; a no warranty disclaimer otherwise". What I think is needed is a short writeup from the legal team about how exactly CC0 is compatible with confidentiality. It will be made easier if we can use an We could still go down the route of full license flexibility, but again we need some "copyrightable at all" language similar to the one we have in the decision. Adding restrictions based on rights you probably don't have is not very useful. The last thing we want is a bunch of computer-generated SBOMs floating around claiming they are CC-BY-NC-ND. That just wastes everyone's time. |
@Artoria2e5 I don't think I agree that a further analysis or writeup is needed here on "is an SBOM copyrightable or not." Ultimately, broadening the DataLicense field to be both optional and also open to use any valid SPDX license expression essentially leaves it up to the SPDX document producer. Some folks will make SBOMs without using that field at all; some will continue to put CC0-1.0 on them (I will!); and some might pick other licenses from the License List and/or custom terms. That's fine; if someone produces an SBOM with terms that no one else wants to use, then it won't get used. Or, if someone produces an SBOM with aggressive license terms, recipients are free to make their own determination that copyright doesn't apply and that they don't have to comply with those terms. I guess what I'm saying is that changing DataLicense to be optional and also open to any expression gets the SPDX community out of being in the middle of specifying how SPDX documents need to be licensed, which I think is the goal here (at least, it's a goal I'm in favor of). |
Amazon would prefer that this field be optional. And optional or not, that it allow any license tags, including “none”, “no assertion”, and “license-ref-*”. |
Simply to confirm, this whole change proposal is only about the DataLicense property of SPDXDocuments in SPDXv2, correct? More specifically it is not about SPDXv3, where Having different licenses in SPDXv3 context will be detrimental to adoption and use of SPDX in general. As an example, I might receive the information that there is The presence of different Or, in a more likely scenario, SPDX consumers will decide that such information is not copyrightable and proceed to ignore Since I think having data compatibility between SPDXv2 and SPDXv3 is a worthwhile goal, in my opinion, I am strongly in favor of only allowing CC0-1.0 in every case. |
Ah, I've just noticed that, while the proposal is titled "... in SPDX Documents" and text keeps referring to "SPDX documents" and "SBOMs", the "scope of impact" also includes:
I would like a clarification whether this is a proposal about the data license of "SPDX Documents" or of every piece of information expressed in SPDX format. Please keep in mind that in SPDXv3, "SBOMs" and "SPDXDocuments" are not the only/main way of structuring the information. Having different licenses in every single piece of information (e.g., the validity of a CVE for a software package) will result in unusable data... As I wrote above, I am (now even more) strongly in favor of only allowing CC0. |
Although this can be quite inconvenient for the consumer to have to deal with the possibility of different data licenses for each element, I can see a use case where this is a benefit. Consider the case where you are aggregating data from public and private sources and you want to make sure no private SBOM data is exposed. You can use the data license to filter appropriately and implement compliance procedures for what SBOM elements are provided publicly. A couple possible compromises:
|
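The filtering use case described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Element` class, its `data_license` attribute, and the `PUBLIC_DATA_LICENSES` allow-list are hypothetical names invented for this sketch and are not part of any SPDX tooling or of the SPDX 3 model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for an SPDX element carrying its own data license."""
    spdx_id: str
    data_license: str

# Assumption: only elements marked CC0-1.0 are cleared for public release.
PUBLIC_DATA_LICENSES = frozenset({"CC0-1.0"})

def publishable(elements, allowed=PUBLIC_DATA_LICENSES):
    """Keep only the elements whose data license is on the public allow-list."""
    return [e for e in elements if e.data_license in allowed]

aggregated = [
    Element("SPDXRef-public-lib", "CC0-1.0"),
    Element("SPDXRef-vendor-blob", "LicenseRef-Example-NDA"),  # hypothetical NDA reference
    Element("SPDXRef-unknown", "NOASSERTION"),
]
for element in publishable(aggregated):
    print(element.spdx_id)
```

An allow-list (rather than a deny-list) is used here so that elements with an unknown or missing marking are withheld by default.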
I've been out on vacation and just catching back up on this thread. Open to discussing compromises but having a single hard requirement makes this very difficult for companies under NDA restrictions (+ other concerns) to be able to have a compliant SBOM under the current standard. |
Since an SBOM may carry sensitive vulnerability disclosure information, I would argue that I'd like to use an arbitrary license expression. Compatibility concerns must not block us here. |
still pondering this, but what if the field had prescribed options of: CC0-1.0 or NOASSERTION? I'm not so sure about using NONE, as that kind of makes a statement in and of itself, but NOASSERTION is like leaving it blank without doing so and is consistent with the idea that there may not be copyright to begin with (under US law). I still am having trouble with the conflation of confidentiality and license - if something is CC0-1.0 (or any other permissive license, for that matter), I can still wrap confidentiality terms around that without a "license compatibility" issue. (I'm using quotes b/c technically legally, a public domain dedication is not a license and neither is an NDA) |
as to @zvr's comment as to v3.0 - perhaps whatever is chosen should be consistent across all the different profiles or data? It would be quite inconvenient and add friction to have it vary within one SBOM/SPDX Document. |
As discussed in the joint legal/tech meeting minutes at https://github.com/spdx/meetings/blob/main/legal/2023-07-27.md, there was broad consensus from attendees that CC0-1.0 should no longer be mandated as the DataLicense for future versions of the SPDX specification. There was not a clear consensus about the best alternative path forward. Two options were proposed and attendees' preferences were split between these two:
The minutes linked above contain more details about the discussions of these two options. Since there was not a clear consensus, I'd encourage folks with an opinion to weigh in here. |
Given people are merging/splitting SBOM documents, as well as starting to use them in databases, there is likely the case where a 2.3 Doc will be merged with a 3.0 Doc at some point. With this in mind, I would prefer to see us keep the DataLicense field but make it optional, rather than get rid of it all together. I am +1 on allowing having the field accept any valid SPDX license expression (including use of NONE & NOASSERTION). |
+1 to permit any valid SPDX license expression |
Kate raises a good point. The question is whether it is acceptable to lose the data license information if we take an SPDX 2.3 document, translate it to 3.0, and potentially translate it back to a 2.3 format (not sure if the tools will support the latter). I'm OK with either proposal - my vote would be for whichever gets us to a decision fastest so we can include it in the SPDX 3.0 RC2 release ;) |
I still lean toward the original but I wouldn't object to the "get rid of it" option if the EU legal list is OK with that option. |
For anyone commenting here who wasn't at the joint call, please do review the minutes here: https://github.com/spdx/meetings/blob/main/legal/2023-07-27.md To add to that, on today's general call, @goneall clarified the state of the dataLicense field for 3.0 which may be slightly different than some of us understood (including me) on the July 27th discussion. The field is filled out once and then replicated across the elements; it is not required to be filled out at each element individually. This is better than what I had imagined, but if we keep this field, we should somehow ensure that someone can't change the field for some elements - it should be consistent. For those who are in favor of 'keep field/allow any SPDX license id' - what would the outcome be if someone used GPL-2.0 or other copyleft license on the downstream distributions of the SPDX data? |
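The consistency requirement raised in the comment above (one value filled out once, never changed for individual elements) could be checked along these lines. This is a minimal sketch, assuming the element data has already been parsed into a plain mapping of element id to data license; the function name and the input shape are hypothetical and do not correspond to any existing SPDX library.

```python
def inconsistent_elements(document_license, element_licenses):
    """Return the ids of elements whose data license differs from the document-level value.

    element_licenses is assumed to be a dict of {element id: data license string}.
    """
    return sorted(
        spdx_id
        for spdx_id, license_id in element_licenses.items()
        if license_id != document_license
    )

parsed = {
    "SPDXRef-pkg-a": "CC0-1.0",
    "SPDXRef-pkg-b": "CC0-1.0",
    "SPDXRef-pkg-c": "GPL-2.0-only",  # an element that was changed after replication
}
offenders = inconsistent_elements("CC0-1.0", parsed)
if offenders:
    print("Data license differs from the document value for:", ", ".join(offenders))
```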
My preference is "get rid of it", followed by "make it optional", followed by "any license". |
I repeat my grave concerns about having different licenses for every piece of information inside SPDX data (think of it as a different license for each line in an SPDXv2 Document). I believe usability will suffer greatly. In the future upstream projects may publish their own SPDX information; if it is allowed, the default might be to use the project license, so we might get the OpenSSL SPDX data under To avoid this messy situation, and since it seems mandating To @kestewart's point, I do not see an issue with a conversion of SPDX data v2->v3 dropping the CC0 initial marking from inside the data. If there is a strong common desire to retain Last, but not least, another important note for those who do not follow the development of SPDXv3: licensing information is no longer properties, it's expressed via relationships between artifacts and license expressions. It makes no sense to have a [This un-threaded way of sequential comments is not really convenient for discussion] |
@zvr:
Yup :) For future Change Proposals, perhaps we can either use the GitHub Discussions functionality at https://github.com/spdx/change-proposal/discussions, or else use the mailing lists as the main place for discussions to occur. I lean a bit towards the former mostly out of not wanting to pollute folks' inboxes for lengthy conversations. = = = = = For the choices here, I'm now leaning pretty strongly in favor of "get rid of it altogether," for the following reasons: Having the To the extent that SPDX document creators want to impose confidentiality terms on a bilateral basis with a recipient, that's fine; they can do so using an NDA between themselves just like any other confidential data they want to exchange. And if so, they can determine what markings (inside or outside the SPDX document) they want to use to communicate that they are treating the document as confidential. None of that necessitates having a specific field in each data element to communicate that information. Finally, if there is a It would honestly be weird to have a license property in the element which is the license for the metadata about the element, but is not the license itself. I think you'll have a very difficult time explaining to the broader community how to do this correctly. Given all of the above, I'm in favor of dropping the DataLicense property. |
In reviewing the above comments, I came up with a possible solution to @zvr's issue. I created a separate issue in the SPDX 3 model with the proposal so that we don't add yet another thread to this issue. If you have any thoughts on the proposal, please add comments to the proposal itself. |
We certainly need the ability to assert we are not releasing the SBOM data to the public with no restrictions. Many Organizations' SBOMs will contain SBOMs from other [upstream] Organizations who have strict NDAs in place, so it may not be within your Organization's ability to choose to release SBOM data without restrictions. Allowing a "CUSTOM NDA" value is useful here. |
meant to add notes from call where we discussed this, please review if you were not in attendance! |
@hibbardc - to be clear, nothing in the current configuration (using CC0) prevents people from using an NDA. |
We’re concerned that one member of an organization having signed an NDA to obtain our SBOMs may not inform every other member of their organization of the NDA. Afterward, other members of that org (without knowledge of the signed NDA) may view the SBOM, see that the data contained therein is released under CC0-1.0, and share the data outside their Org. This is one of those genies that tend not to go back into the bottle so easily. We wish to follow the SPDX specification for SBOMs without risk of the recipients believing our data is released into public domain or free to share with others. |
We can't assume everyone reading the SBOM will understand the legal difference between a public domain dedication, a valid license, and an NDA. This is about preventing accidental open distribution of data under NDA, not the least verbose form of legal accuracy. |
understood and I would note that that is not an (internal) problem related to confidential information that is limited to SBOMs! The NDA most likely does not follow around the information which it is intended to protect in most cases. still wondering if there is a better way to address this issue... |
from email to tech team re: element v. document question from Oct 10th-ish (putting here for thoroughness): Regarding whether individual elements need a license - the short answer is no. It "may" make sense on a collection or SPDX Document as a whole, i.e., XCollection/SPDXDocument/SerializableCollection. The original rationale for this field (see 6.2.1 and 6.2.2) considered copyright and database rights (which are not the same thing, so we should be careful to think and speak of them distinctly). Under US law, facts are not copyrightable, so having a license on each element is largely unnecessary and would only provide complexity and (further) confusion. Where database rights may be applicable, the determination for that usually involves looking at the effort involved to create the database, which would not apply to individual elements of data. We need to develop some more thorough explanations of why we have CC0-1.0 to begin with and then whatever we end up doing for that field going forward (for any datalicense field for the SPDX Document). |
Hello all, my apologies for the delayed follow-up here. Following further discussions among the legal and tech team leadership, here's the consensus we've identified based on the discussions surrounding the DataLicense field:
@goneall @zvr @jlovejoy @Pizza-Ria Please feel free to weigh in if you have different thoughts here; otherwise, I believe the team will proceed forward on this basis for SPDX 3.0. |
Owners
• Ria Farrell Schalnat (Pizza-Ria) – Open Source Program Manager for Hewlett Packard Enterprise and Chris Hibbard – Open Source Security Architect
Issue Statement
• SPDX Version 2.3 Section 6.2 contains a Data License Field (see also, associated file under the spdx-3-model) as part of the SPDX document creation information section which requires that any SPDX-Metadata be subject to the terms of the Creative Commons CC0 1.0 Universal license. The stated intent of this license choice is to alleviate concerns “that content (the data or database) in an SPDX document is subject to any form [emphasis added] of intellectual property right that could restrict the re-use of the information or the creation of another SPDX document for the same project(s).” See also, Legal Team/Decisions/SPDX-Metadata-License - (January 18, 2012). The explanation continues that “individuals can still contract with each other to restrict release of specific collections of SPDX documents (which map to software bill of materials) and the identification of the supplier of SPDX documents.”
• The decision surrounding the license for the SPDX metadata, while positively intended to foster communication and collaboration, was made prior to the advent of SBOM requirements in various jurisdictions, including those in the United States stemming from Executive Order 14028 (May 12, 2021). These requirements will eventually cover the entire software supply chain for a software product sold to the federal government. This may involve many tiers of both open source and proprietary software, including code distributions subject to confidentiality or other contractual restrictions. Sharing such SBOMs under a license that effectively waives any copyright and database rights carries the potential for, at least, confusion and misinterpretation regarding the ability to share such documents, especially if contradictory provisions are attached in other fields such as comments.
• While Hewlett Packard Enterprise (HPE) recognizes that the original choice of license satisfies many use cases for SPDX documents that may be freely shared throughout the supply chain, we are concerned that these clauses contradict one another where information outlined in an SPDX document (e.g., SBOM) may be subject to confidentiality or trade secrecy (either inherently or via additional contractual restrictions) especially since the CC0 license may be considered “public domain” (see, “CC0 enables scientists, educators, artists and other creators and owners of copyright- or database-protected content to waive those interests in their works and thereby place them as completely as possible in the public domain. [emphasis added]”).
• HPE is particularly concerned that utilizers of the SPDX document may not be aware of such confidentiality/trade-secrecy/contractual obligations and, either directly or via machine automation, not recognize the obligations associated with such documents. The consequence is that a recipient may inadvertently share such SBOMs under the assumption that the CC0 license permits such sharing. Adding commentary or additional metadata to the SPDX document runs the risk of further confusing the issue or having such additional restrictions overlooked.
• Entering an alternative identifier for the Data License field results in SPDX validators failing which raises questions from recipients as to the efficacy of the SPDX document.
• Thus, use of the SPDX standard for communicating SBOMs either runs a risk of disclosure or creates friction/manual work in explaining why our SBOMs do not pass validators if a different license is chosen. Furthermore, we do not believe this situation is unique to HPE and could have a prohibitive effect regarding adoption of SPDX as the chosen SBOM standard in the industry.
Proposed Solution
• Update the documentation and validators to accept alternative entries to the Data License field in the documentation metadata, including:
• licenses already recognized by SPDX (https://spdx.org/licenses/),
• “none”/“no assertion” as defined in https://spdx.github.io/spdx-spec/v2.3/file-information/#85-concluded-license-field, and
• other licensing information (e.g., EULAs or NDAs) as recognized by “license-ref-[idstring]” in https://spdx.github.io/spdx-spec/v2.3/other-licensing-information-detected/.
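The relaxed validation rule proposed in the list above might look roughly like the following Python sketch. It is not taken from any existing SPDX validator: `KNOWN_LICENSE_IDS` is a tiny hypothetical stand-in for the full SPDX License List, `LicenseRef-Example-NDA` is an invented identifier, and the check handles a single identifier only, not compound license expressions with `AND`/`OR`/`WITH`.

```python
import re

# Assumption: a small illustrative subset of the SPDX License List, not the real list.
KNOWN_LICENSE_IDS = {"CC0-1.0", "CC-BY-4.0", "MIT", "Apache-2.0"}

# The two special values the proposal refers to as "none" and "no assertion".
SPECIAL_VALUES = {"NONE", "NOASSERTION"}

# User-defined references: "LicenseRef-" followed by letters, digits, "." or "-".
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")

def is_acceptable_data_license(value: str) -> bool:
    """Accept a listed license id, NONE/NOASSERTION, or a LicenseRef-* reference."""
    return (
        value in KNOWN_LICENSE_IDS
        or value in SPECIAL_VALUES
        or LICENSE_REF.fullmatch(value) is not None
    )

for candidate in ("CC0-1.0", "NOASSERTION", "LicenseRef-Example-NDA", "CC0"):
    print(candidate, is_acceptable_data_license(candidate))
```

A validator following the current rule would instead compare the value against `CC0-1.0` alone, which is why any other entry fails today.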
Describe the benefit to SPDX and its ecosystem
• Requirements for SBOMs are a feature in US EO14028, NTIA's SBOM Minimum Data Elements, CISA's standard form for SBOMs (Mar 14 2023), OMB M-22-18, SSDFv1.1/SP800-218, SBOM requirements in other jurisdictions and other private/public measures. These requirements go beyond the simple standards of attribution documents practiced in the open source world by requiring identification of vendors, component relationships and other metadata associated with both open source and proprietary software components in products. Some of this information will be subject to contractual restrictions (e.g. non-disclosure agreements, EULAs, etc.) to protect information that needs to remain confidential/secret outside the mere identification of the open source package/license/copyrights.
• The proposed change will allow adopters of the SPDX format to prepare SPDX documents that correctly categorize such restrictions where they apply (SPDX recognized such special cases may exist in its 6.2 document) and honor obligations/restrictions received from their upstream vendors while clearly communicating such restrictions downstream to avoid accidental leaks/confusion.
• While HPE supports SPDX’s mission to promote open standards for communicating software bill of material information, there are certain circumstances where this is not desirable and the inability to designate an alternative license will damage credibility/trust in the provision of SPDX documents downstream. Inability for an SPDX validator to accept alternative license designations will also create friction in automation for organizations as they will not be able to seamlessly ingest such documents and will need to engage in additional tooling to automate inquiries or even manual effort to resolve such issues.
Scope of impact
This would require an update to https://github.com/spdx/spdx-3-model, for the section aligning with #62-data-license-field in SPDXv2.3, as well as SPDX validation mechanisms which currently fail if an identifier other than CC0 is provided for the field “datalicense”.
Additional information
• This change proposal reopens spdx/spdx-spec#159 for discussion and acceptance.
• This change proposal also reiterates the concerns raised by Mark Atwood in Issue #850 (“I would like to reopen this issue. Amazon has severe reservations about being required to tag the SBOMs of our internal services and delivered products as CC0, even if there is also an NDA in place. We especially don't want to have "you put a CC0 on it" when someone else publishes something that was provided to them by someone breaking their NDA. The other SBOM standards do not require a CC0 or other license tag.”).