Give warnings for content files with language codes not matching dc:language in package.opf #503

Open
martinpub opened this issue Oct 13, 2021 · 2 comments
Labels
enhancement validator-revision EPUB 3 / HTML Validator revision: 2020-1

Comments

@martinpub
Collaborator

Even though an entire content file may legitimately use a different language code than the main language specified in package.opf, it would be convenient if the validator flagged any mismatch between a content file's language and dc:language.

In more technical terms: give a warning if the xml:lang and lang attributes on the content file's root element do not match the value in dc:language.

This would catch mistakes where, for example, all content files are in one language but dc:language declares a different one.
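To make the proposed check concrete, here is a minimal Python sketch of the comparison, not the validator's actual implementation. The function names (`package_languages`, `root_languages`, `language_warnings`) are hypothetical, and the case-insensitive, exact-tag comparison is an assumption; a real rule might also want to treat `sv` and `sv-SE` as compatible.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def package_languages(opf_xml: str) -> list[str]:
    """Return every dc:language value declared in the package document."""
    root = ET.fromstring(opf_xml)
    return [el.text.strip() for el in root.iter(f"{{{DC_NS}}}language") if el.text]


def root_languages(xhtml: str) -> set[str]:
    """Return the xml:lang and lang values found on the root element."""
    root = ET.fromstring(xhtml)
    return {value for value in (root.get(XML_LANG), root.get("lang")) if value}


def language_warnings(opf_xml: str, xhtml: str, name: str) -> list[str]:
    """Warn for each root language value that is not declared in dc:language.

    Assumption: tags are compared case-insensitively and must match exactly.
    """
    declared = {lang.lower() for lang in package_languages(opf_xml)}
    warnings = []
    for lang in sorted(root_languages(xhtml)):
        if lang.lower() not in declared:
            warnings.append(
                f"{name}: root language {lang!r} is not among "
                f"dc:language values {sorted(declared)}"
            )
    return warnings
```

Because the result is a list of warnings rather than an error, a content file that intentionally uses another language would still pass validation, which matches the intent of flagging rather than rejecting.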

@martinpub martinpub added enhancement validator-revision EPUB 3 / HTML Validator revision: 2020-1 labels Oct 13, 2021
@josteinaj
Member

josteinaj commented Oct 17, 2021

How do we handle multi-language books? In our library system (and presumably most library systems), you can mark the book as having multiple languages with the ISO 639-2 code "mul", and then list all languages in a separate field. In EPUB 3 terms, you can (as far as I know) have multiple dc:language elements, but you can only have one xml:lang/lang value per element. So should we use mul for xml:lang/lang in these cases? And if so, should we validate that there are at least two dc:language elements?

I suppose whether or not to allow the mul code in xml:lang/lang could be an issue of its own…
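Purely to illustrate the question, here is one way such a rule could look in Python. This is a sketch of one possible answer, not a decision: the function name `check_root_language` is hypothetical, and the choice to ignore a `mul` entry in dc:language when counting languages is an assumption.

```python
def check_root_language(root_lang: str, dc_languages: list[str]) -> list[str]:
    """Check a root xml:lang/lang value against the declared dc:language values.

    Assumed rule: "mul" on the root element is accepted only when at least
    two distinct languages (other than "mul" itself) are declared.
    """
    declared = {lang.strip().lower() for lang in dc_languages if lang.strip()}
    lang = root_lang.strip().lower()

    if lang == "mul":
        if len(declared - {"mul"}) < 2:
            return ["root language is 'mul' but fewer than two dc:language values are declared"]
        return []

    if lang not in declared:
        return [f"root language {root_lang!r} is not among dc:language values {sorted(declared)}"]
    return []
```

For example, `check_root_language("mul", ["sv", "en"])` returns no warnings under this rule, while `check_root_language("mul", ["sv"])` returns one.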

@martinpub
Collaborator Author

martinpub commented Oct 18, 2021

Good point, @josteinaj. I thought that cataloguing rules prescribed using one primary language only; I need to verify this with our cataloguing expert.

Update: I checked with our cataloguing expert. The current limitation is not in the cataloguing rules but in our internal production system, which does not allow multiple values for language. The cataloguing rules we use in Libris (the National Library's catalogue), RDA, prescribe recording one to six languages individually. If there are more than six languages, using "mul" is suggested, but I think the latter is national practice only, not part of RDA.

I need to get back to this one with more thoughts. To be continued.
