PDF: Committee document missing "Reference number" #1246

Open
ronaldtse opened this issue Nov 15, 2024 · 10 comments

@ronaldtse
Contributor

ronaldtse commented Nov 15, 2024

The Reference number is missing in PDF output:
image

The header:

= Description of Terminology Data Model v3
:docidentifier: ISO/IEC JTC 1/WG 15 N0255
:edition: 3
:revdate: 2023-07-21
:copyright-year: 2023
:language: en
:publisher: ISO;IEC
:title-main-en: Description of Terminology Data Model version 3
:title-main-fr: Description du modèle de données terminologiques version 3
:doctype: committee-document

The reference number is also missing from all the page headers, etc.

I tried the following variations, but none of them fixes the problem:

Only docidentifier:

= Description of Terminology Data Model v3
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

With tc-docnumber:

= Description of Terminology Data Model v3
:tc-docnumber: N0255
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

With tc-docnumber and docnumber:

= Description of Terminology Data Model v3
:docnumber: N0255
:tc-docnumber: N0255
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

They all give the same output.

Is this a Metanorma XML problem?

From this document:

The desired reference number is:
Screenshot 2024-11-15 at 11 41 37 AM

@ronaldtse ronaldtse added the bug label Nov 15, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Metanorma Nov 15, 2024
@ronaldtse ronaldtse moved this from 🆕 New to 🌋 Urgent in Metanorma Nov 15, 2024
@Intelligent2013
Contributor

For https://github.com/metanorma/iso-iec-jtc1-wg15-tdm/, branch main, the PDF does have the reference number:

image

I'll try the corrections branch.

@Intelligent2013
Contributor

Confirmed, the PDF generated from the branch corrections doesn't have the reference number.
XSLT reads the reference number from iso-standard/bibdata/docidentifier[@type = 'iso-with-lang'].

The differences (left pane: branch corrections; right pane: branch main):

  • document.adoc:
    image

  • document.presentation.xml:
    image

So most of the docidentifiers are missing from the Presentation XML after the changes.

ping @opoudjis.

Also, there are a lot of differences in the terms section:
image

@ronaldtse
Contributor Author

ronaldtse commented Nov 15, 2024

@Intelligent2013 Ah, now that you say this, I realize what the difference is.

There are 2 numbers on the cover page:

Screenshot 2024-11-15 at 4 52 56 PM
  • 1 is the cover page PubID
  • 2 is the "reference number". It is also used on all the headers.

Only docnumber

= Title
:docnumber: 0255
...

=> 1 and 2 both present

Only tc-docnumber

= Title
:tc-docnumber: 0255

=> 1, 2 both missing

Only docidentifier

This is technically the preferred way as described in documentation, because we don't know what N number patterns people decide to use.

= Title
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

=> 1 present, 2 missing

docnumber and docidentifier

= Title
:docnumber: 0255
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

=> 1 present, 2 missing

tc-docnumber and docidentifier

= Title
:tc-docnumber: 0255
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

=> 1 present, 2 missing

all 3 present

= Title
:docnumber: 0255
:tc-docnumber: 0255
:docidentifier: ISO/IEC JTC 1/WG 15 N0255

=> 1 present, 2 missing

Conclusion

This means that when docidentifier is present, 2 (the reference number) is always missing.

@Intelligent2013
Contributor

As

Also, there are a lot of differences in the terms section:

it seems some part of the code metanorma-iso or something isn't working properly when :docidentifier: is present.

@opoudjis
Contributor

I'm having difficulty understanding what is being asked here.

:docidentifier: and :docnumber: are treated as mutually exclusive.

  • If :docnumber: is supplied, Metanorma tries to construct the identifier, using pubid; tc-docnumber is treated as a distinct identifier
  • If :docidentifier: is supplied, Metanorma takes it as the pre-formatted complete identifier, with an opaque structure. This is meant to be used when the document identifier cannot be reasonably generated based on docnumber and other attributes. Because Metanorma treats :docidentifier: as a pre-formatted opaque identifier, it does not understand its structure, and it therefore cannot construct the other identifiers.
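
To make the precedence rule above concrete, here is a minimal sketch in Python. It is not Metanorma's actual code (Metanorma is written in Ruby and builds identifiers through pubid); the function name `plan_identifiers` and the returned dictionary shape are hypothetical, and the derived type names are simply the ones mentioned in this thread.

```python
from typing import Optional

# Identifier variants mentioned in this thread as being read by the ISO XSLT.
DERIVED_TYPES = ("iso-undated", "iso-with-lang", "iso-reference")


def plan_identifiers(attrs: dict) -> dict:
    """Hypothetical model of the rule: docidentifier overrides composition."""
    docidentifier: Optional[str] = attrs.get("docidentifier")
    docnumber: Optional[str] = attrs.get("docnumber")

    if docidentifier:
        # Opaque, pre-formatted identifier: used verbatim and never parsed,
        # so no structured variants can be derived from it.
        return {"source": "docidentifier",
                "primary": docidentifier,
                "derived_types": []}
    if docnumber:
        # Structured path: the real identifier would be composed elsewhere
        # (by pubid in Metanorma), so "primary" is left unset in this sketch.
        return {"source": "docnumber",
                "primary": None,
                "derived_types": list(DERIVED_TYPES)}
    return {"source": None, "primary": None, "derived_types": []}


both = plan_identifiers({"docnumber": "0255",
                         "docidentifier": "ISO/IEC JTC 1/WG 15 N0255"})
print(both["derived_types"])  # prints []
```

Under this model, supplying both attributes behaves the same as supplying :docidentifier: alone, which matches the test results reported earlier in the thread.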

So the behaviour you are reporting, @ronaldtse, is intentional, and desirable, and it is already documented in metanorma.org:

https://www.metanorma.org/author/ref/document-attributes/

:docidentifier:
As an alternative to docnumber and other attributes (such as doctype and docstage), which form the full identifier by combining multiple attributes, this attribute contains a full specification of the document identifier and overrides the composition of the document identifier [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v2.3.9]. This value is used for document identifiers that do not follow normal SDO conventions, including for documents that are adoptions from other SDOs.

docidentifier should not be used, because you normally don't want the document identifier to be opaque; if you can't get the identifier you want based on document attributes, that means you need to change pubid, so that you can.

it seems some part of the code metanorma-iso or something isn't working properly when :docidentifier: is present.

It's much simpler than that: :docidentifier: is semantically opaque, and Metanorma accordingly right now does not attempt to parse it into components. I'm a bit surprised by that, I thought I would have tried to parse it anyway, but I can see why I didn't. Because it isn't parsed into components, the citation of the document's identifier in the termnote (breaking it down into publisher, number, part number, etc) doesn't happen.

I don't think the right answer is to try to infer the internal structure of :docidentifier:. I think the right answer is to use docnumber and tc-docnumber wheresoever possible, so that it can do the breakdown. I don't see how the ISO rationale

This is technically the preferred way as described in documentation, because we don't know what N number patterns people decide to use.

makes sense: tc-docnumber is already preformatted, so if you have the N-number, you already know what N number pattern has been used.

@Intelligent2013
Contributor

@opoudjis thank you for the clarification.

  • If :docidentifier: is supplied, .... and it therefore cannot construct the other identifiers.

These docidentifiers are used widely in the ISO XSLT:

  • docidentifier[@type = 'iso-undated']
  • docidentifier[@type = 'iso-with-lang']
  • docidentifier[@type = 'iso-reference']

It means I have to fall back to <docidentifier type="ISO"> if the value is missing.
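
A rough illustration of that fallback, written in Python rather than XSLT and not taken from the actual stylesheet: the helper name `reference_number` is hypothetical, and the sample XML is a stripped-down assumption (real Metanorma XML is namespaced and carries many more elements).

```python
import xml.etree.ElementTree as ET
from typing import Optional


def reference_number(bibdata: ET.Element,
                     preferred: str = "iso-with-lang") -> Optional[str]:
    """Return the preferred docidentifier, else fall back to type="ISO"."""
    for id_type in (preferred, "ISO"):
        node = bibdata.find(f"docidentifier[@type='{id_type}']")
        # Treat an absent or empty element as missing and try the next type.
        if node is not None and node.text and node.text.strip():
            return node.text.strip()
    return None


# Assumed minimal input: only the opaque identifier is present, as happens
# when :docidentifier: is supplied in the document header.
SAMPLE = """<bibdata>
  <docidentifier type="ISO">ISO/IEC JTC 1/WG 15 N0255</docidentifier>
</bibdata>"""

print(reference_number(ET.fromstring(SAMPLE)))
# prints: ISO/IEC JTC 1/WG 15 N0255
```

The same lookup order would apply to iso-undated and iso-reference by passing a different `preferred` value.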

@Intelligent2013
Contributor

It means I have to fall back to <docidentifier type="ISO"> if the value is missing.

ISO XSLT updated.

@ronaldtse
Contributor Author

:docidentifier: and :docnumber: are treated as mutually exclusive.

I don’t see this being documented anywhere. I would copy what you wrote and provide that on the said page. The information about the N-number is also undocumented.

@ronaldtse
Contributor Author

What you are saying is that these two options are mutually exclusive:

  • option 1: provide either docnumber or tc-docnumber
  • option 2: use docidentifier

And the XML describing the identifier(s) should always be identical in both cases, except where there is no parsing.

@opoudjis
Contributor

:docidentifier: and :docnumber: are treated as mutually exclusive.

I don’t see this being documented anywhere. I would copy what you wrote and provide that on the said page. The information about the N-number is also undocumented.

I LITERALLY just copy-pasted the documentation for this:

:docidentifier:
As an alternative to docnumber and other attributes (such as doctype and docstage), which form the full identifier by combining multiple attributes, this attribute contains a full specification of the document identifier and overrides the composition of the document identifier [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v2.3.9]. This value is used for document identifiers that do not follow normal SDO conventions, including for documents that are adoptions from other SDOs.

overrides the composition of the document identifier

I will spell it out even more.

The information about the N-number is also undocumented.

No it is not:

https://www.metanorma.org/author/iso/ref/document-attributes/

:tc-docnumber:
The document number assigned by a distribution group (also called the “N-document number” or the “N-number”), typically a Technical Committee, a Subcommittee or a Working Group. Must include the short reference of the distribution group, since documents may circulate widely;

Example 18. Setting the N-document number for a distribution group
For a document circulated in ISO/TC 154 as "N 1218" (instead of "N 1218"):

:tc-docnumber: ISO/TC 154 N 1218

I assume that after I update the documentation, there is nothing more for me to do here, and that this ticket can be removed from my urgent list and closed.

opoudjis added a commit to metanorma/metanorma.org that referenced this issue Dec 13, 2024
@opoudjis opoudjis moved this from 🌋 Urgent to 👀 In review in Metanorma Dec 13, 2024