Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValidateDocument() fails for official SPDX Spec 2.2/2.3 examples #231

Open
wallrat opened this issue Nov 10, 2023 · 2 comments
Open

ValidateDocument() fails for official SPDX Spec 2.2/2.3 examples #231

wallrat opened this issue Nov 10, 2023 · 2 comments

Comments

@wallrat
Copy link

wallrat commented Nov 10, 2023

The following files from the spdx-spec repository fails to validate

https://github.com/spdx/spdx-spec/blob/development/v2.3.1/examples/SPDXJSONExample-v2.3.spdx.json
https://github.com/spdx/spdx-spec/blob/development/v2.3/examples/SPDXJSONExample-v2.2.spdx.json

(only tested the JSON versions sofar)

ValidateDocument() returns err ToolsElement used in relationship but no such package exists

https://tools.spdx.org/app/validate/ (based on the Java lib?) validates both fine.

From the 2.3 BOM:

...
  "externalDocumentRefs": [
    {
      "externalDocumentId": "DocumentRef-spdx-tool-1.2",
      "checksum": {
        "algorithm": "SHA1",
        "checksumValue": "d6a770ba38583ed4bb4525bd96e50461655d2759"
      },
      "spdxDocument": "http://spdx.org/spdxdocs/spdx-tools-v1.2-3F2504E0-4F89-41D3-9A0C-0305E82C3301"
    }
  ],

...

  "relationships": [
... 
    {
      "spdxElementId": "SPDXRef-DOCUMENT",
      "relationshipType": "COPY_OF",
      "relatedSpdxElement": "DocumentRef-spdx-tool-1.2:SPDXRef-ToolsElement"
    },
   ... 
]

Not enough of an expert on the spec to determine who is in the wrong here. Shouldn't ValidateDocument() take external refs into account? It currently only checks for refs to packages, and files.

@swinslow
Copy link
Member

Hi @wallrat,

You may have seen that I asked about this in the main SPDX 2.x spec repo, at spdx/spdx-spec#870.

Based on the discussion there, it sounds like "validation" for purposes of SPDX tooling generally has looked only at the current SPDX Document, not at e.g. any other Documents that are referenced via an External Document Reference.

So here, the SPDX 2.2/2.3 example documents reference an external DocumentRef-spdx-tool-1.2 Document, which is presumed to exist but is not available (and doesn't in fact actually exist). It also has a Relationship that points to an SPDX Element in that external Document, DocumentRef-spdx-tool-1.2:SPDXRef-ToolsElement.

Based on spdx/spdx-spec#870, it sounds like the presence of that external document and/or the SPDXRef-ToolsElement in it do not need to be checked for the example Document to be considered "valid." All that is checked is that the syntax of that identifier is validly formatted.

For the Golang tools purposes, the developers could consider whether to e.g. allow different levels of validation -- perhaps a "strict" validation that actually checks every referenced Document, vs. a "default" validation that doesn't look into nested external Documents.

@wallrat
Copy link
Author

wallrat commented Nov 28, 2023

As a tooling vendor1 building on top of SPDX, and multiple other formats, I'm mostly concerned with end-user expectations - which usually means to be very forgiving with what we accept and give sane feedback on bad input.

With that in mind, we usually consider a few levels of 'validity':

  1. Is the document 'well-formed' ? This most often means that the document is parsable (syntactically correct) and some set of minimal of information can be meaningfully extracted. For us, failing this level the document is rejected as 'bad input'.
  2. Is the document 'valid' according to the documents declared spec version and format? Here we expect the document to contain all required fields and ids, document internal references to resolve correctly etc. My experience is that this is usually what end-users means with 'valid' and the level we expect all tools to produce (or they are considered broken). For us, failing this level usually means that we try our best to process the document, but let the end-user now about any errors found.
  3. Do the document contents make sense within a larger context? This means checking if external references resolve to existing resources etc. Note this is not a pure function of the input - if the document was valid at one point in time doesn't mean it's necessarily valid now.

What I would expect from a library is to cover 1 and 2. Looking at external resources is way too use-case dependent.

From a developer perspective I would definitely expect ValidateDocument() to accept all examples from the spec repo to pass (and be part of the library's test suite).

(as a side note the Go library also fails to parse the XML/RDF example but at the XML layer. I will open a separate issue for that)

Footnotes

  1. https://sbom.observer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants