Add standard compliant default identifier #21
Merged
To create standard-compliant PDFs with an identifier, the current identifier option is not enough. The standard reads:
(emphasis mine)
The second value of the ID array is always set to a hash of the document's objects. This is fine. But it's currently impossible to set the first value accordingly, because that hash is not known until the document has been created.
This PR creates an identifier by default, when `identifier=None`. It uses the same hash for the first component as for the second, as mandated by the spec. To create a new revision of the same document, the user can then take this ID from the original revision and pass it to the `identifier` argument. This will then create a new revision with proper IDs.

This will always create documents with identifiers, even though the ID is optional in general according to the spec. I don't think that's a bad thing though. It is in line with what is asked for here: Kozea/WeasyPrint#1661 - PDF/A compliance by default.
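To illustrate the scheme described above, here is a minimal, self-contained sketch of how the two ID values relate across revisions. It is not the code in this PR: `make_id_pair` is a hypothetical helper, and SHA-256 is only a stand-in for whatever hash the library actually uses.

```python
import hashlib
from typing import Optional, Tuple


def make_id_pair(objects_data: bytes,
                 identifier: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (first, second) values for a PDF ID array.

    Hypothetical helper for illustration only.
    - second: always a hash of the current document data.
    - first: the caller-supplied identifier, or the same hash when
      no identifier is given (the first revision of a document).
    """
    # Assumption: SHA-256 stands in for the library's real hash.
    data_hash = hashlib.sha256(objects_data).hexdigest().encode("ascii")
    first = data_hash if identifier is None else identifier
    return first, data_hash


# First revision: no identifier passed, so both values are the same hash.
first_id, second_id = make_id_pair(b"original document objects")

# New revision: reuse the first value of the original revision.
new_first, new_second = make_id_pair(
    b"modified document objects", identifier=first_id
)
```

With this arrangement the first value stays constant across revisions of a document, while the second value changes whenever the content does.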