Information content of the abstract manifest #12

dauwhe · 2017-06-27T14:33:07Z

What information is required for an abstract manifest? [edited to add items from comments]

1. An identifier for the web publication, which should be a URL
2. Some way of saying that this URL represents a web publication.
3. Some way of identifying the constituent resources of the web publication.
4. Some way of providing a preferred order of (some of) the constituent resources in case there is more than one
5. Some way of being able to add more complex metadata to a publication. (Not clear to my mind whether we would define a minimally required set of metadata, but the slot should be there.)
6. Locating table of contents or other navigation structure

What else? I think we should distinguish required information from "nice to have" information.

GarthConboy · 2017-06-27T14:56:04Z

I'd also throw in:

-- Reading order
-- Basic metadata (yes, a can of worms we'll need to open)

Re the #1 and #2 just above in Dave's original issue, it seems they may want to be pre-manifest -- defined before the manifest is found, or be the actual path to the manifest (or to a "first file" that can be rendered, but also somehow points to the manifest).

iherman · 2017-06-27T14:56:26Z

Some way of providing a preferred order of (some of) the constituent resources in case there is more than one
Some way of being able to add more complex metadata to a publication. (Not clear to my mind whether we would define a minimally required set of metadata, but the slot should be there.)

iherman · 2017-06-27T14:56:58Z

(Wow. I just said the same thing as Garth just in other words. I swear we did not conspire...)

mattgarrish · 2017-06-27T15:54:30Z

What is meant by required here? Must always be present or must be accounted for in the design? This is why I wasn't sure at the f2f if navigation constituted a top-level or lower-level consideration.

A standardized means of locating the table of contents seems critical to me, even if it's optional to define and there are no epub-like rules on its construction.

GarthConboy · 2017-06-28T16:02:00Z

The updated #6 in the first panel says "Locating table of contents or other navigation structure". We should also consider:

-- Do we need such a Nav file (likely yes for A11Y)
-- Should it be in the Manifest or pointed-to by the Manifest (I could see an argument for all eggs in one basket -- though the machine readable or renderable discussion will arise)

dauwhe · 2017-06-28T16:35:55Z

> Do we need such a Nav file (likely yes for A11Y)

See #14

> Should it be in the Manifest or pointed-to by the Manifest

Interesting question. I know Hadrien has proposed including section titles in a JSON manifest, but I have major concerns about possible reader-facing text in JSON (especially given that there's a standard html way to do this stuff).

HadrienGardeur · 2017-07-02T20:27:05Z

> I know Hadrien has proposed including section titles in a JSON manifest, but I have major concerns about possible reader-facing text in JSON (especially given that there's a standard html way to do this stuff).

IMO the Navigation Document in EPUB 3 is a failed experiment. Most EPUB 3 documents that I've seen end up including at least two HTML table of contents:

-- a nice-looking one, included in the spine and not marked as being a Navigation Document
-- a basic one, used as the Navigation Document

Most EPUB 3 reading systems do not render these Navigation Documents either; they simply parse them, extract the info and display things using their own UI.

This is a typical example of "spec purity" (the beauty of the Navigation Document) vs real world usage (no one is rendering these documents and we end up with more redundancy instead of less).

Readium (1, JS and 2) ended up parsing the info in the Navigation Document and providing a JSON output instead, which is much easier for developers to work with.

In the Readium Web Publication Manifest:

-- there is absolutely zero requirement for a table of contents (I strongly believe that we shouldn't force a ToC on single resource publications that won't need one)
-- all the different ToC types that exist in EPUB are parsed (NCX, landmarks in OPF and Navigation Document) and exposed in a consistent way (collections) in the manifest
-- we also keep links to the Navigation Document in spine or resources and identify them as such using a rel value

HadrienGardeur · 2017-07-02T20:35:06Z

To go back to the initial question, in Readium we clearly separate the abstract model from the minimal requirements for a manifest.

The abstract model has three core concepts:

-- metadata (based on JSON-LD)
-- links
-- collections (identified by a role, can aggregate metadata, links and other collections)

For each core concept, we make sure that:

-- the requirements are very basic
-- the model is flexible and powerful enough to allow the expression of complex use cases
-- a number of extensibility points are available and clearly identified

The basic requirements for a manifest are then based on that model:

-- a manifest should at least contain a title in its metadata
-- it should at least contain a link to itself, identified by the self relation
-- it should contain at least one resource in its spine collection, which contains the key resources for a publication in reading order
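
The three basic requirements listed above can be sketched as a small check. This is only an illustration, not Readium code: the function name `check_minimal_manifest` is hypothetical, and it assumes the manifest has already been parsed into a Python dict shaped like the JSON examples in this thread (with `rel` given as a single string).

```python
from typing import Any, Dict, List


def check_minimal_manifest(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means all three checks pass."""
    problems: List[str] = []

    # Requirement 1: the metadata contains a title.
    metadata = manifest.get("metadata", {})
    if not metadata.get("title"):
        problems.append("metadata has no title")

    # Requirement 2: there is a link to the manifest itself (rel == "self").
    # Assumption: "rel" is a single string, as in the thread's example.
    links = manifest.get("links", [])
    if not any(link.get("rel") == "self" and link.get("href") for link in links):
        problems.append("no link with rel 'self'")

    # Requirement 3: the spine lists at least one resource.
    spine = manifest.get("spine", [])
    if not any(item.get("href") for item in spine):
        problems.append("spine is empty")

    return problems
```

Returning a list of problems rather than a boolean lets a caller report every missing piece at once.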

llemeurfr · 2017-07-03T12:43:14Z

> An identifier for the web publication, which should be a URL

Better: an IRI, because a) it may be a URN (up to the publisher to choose, the Web doesn't care) and b) i18n is important. A URL to the origin is also important but should be another property.
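
To make the distinction concrete, here is a rough sketch of how the identifier forms mentioned here differ when inspected. The function name `describe_identifier` and the returned keys are hypothetical; the only assumption about standards is that a URI is limited to ASCII characters, so an identifier containing non-ASCII characters has to be treated as an IRI.

```python
from urllib.parse import urlsplit


def describe_identifier(identifier: str) -> dict:
    """Classify an identifier by scheme and by whether it needs IRI handling."""
    scheme = urlsplit(identifier).scheme.lower()
    return {
        "scheme": scheme,
        # A URN names the publication without saying where it lives.
        "is_urn": scheme == "urn",
        # An http(s) identifier can also be dereferenced on the Web.
        "is_web_address": scheme in ("http", "https"),
        # Non-ASCII characters are not allowed in a plain URI.
        "needs_iri": not identifier.isascii(),
    }


print(describe_identifier("urn:isbn:9780141180144")["is_urn"])            # True
print(describe_identifier("http://example.com/manifest.json")["is_urn"])  # False
```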

WSchindler · 2017-07-03T14:47:55Z

I would like to add:
7. language(s) used in the WP - the plural is due to the fact that we will have publications such as parallel texts (original + one or more translations) and bilingual dictionaries, which contain 1-n languages. The language used also has implications for rendering (e.g. "ltr" vs "rtl", vertical layout).

HadrienGardeur · 2017-07-03T14:50:29Z

Language and direction (ltr vs rtl) should be two separate metadata items. I agree that we need to allow more than one language.

lrosenthol · 2017-07-03T22:05:51Z

If we plan to use anything other than a URL (as defined by the HTML spec - https://www.w3.org/TR/WD-html40-970917/htmlweb.html), then we are going to need to be willing to jump into the current battle between the W3C and the IETF on the definition of URL/URI/IRI etc. Here is an old blog entry about it - http://intertwingly.net/blog/2014/10/02/WHATWG-URL-vs-IETF-URI


llemeurfr · 2017-07-05T10:22:14Z

Re. URL vs IRI, after reading https://www.w3.org/International/wiki/IRIStatus, I must admit that this seems like a can of worms. Apart from trying to allow for extended i18n of publication identifiers, there is still the question of whether URNs are allowed as global identifiers. For instance, I spotted that most of @HadrienGardeur's Manifest samples use ISBN URNs as identifiers.

HadrienGardeur · 2017-07-05T12:47:08Z

@llemeurfr you're mixing up two different concepts regarding the Readium Web Publication Manifest.

Keep in mind that we started this work in the context of BFF and that for Readium-2 we mostly ingest EPUB files.

The only requirement in the draft document for the Readium WebPub Manifest is to always provide a self link. In the context of a Web Publication it makes perfect sense: if a publication lives on the Web, we need a URL that can point to its manifest.

Here's a basic example using the Readium WebPub Manifest model:

{
  "@context": "http://readium.org/webpub/default.jsonld",
  "metadata": {
    "title": "The Master and Margarita"
  },
  "links": [
    {"rel": "self", "href": "http://example.com/manifest.json", "type": "application/webpub+json"}
  ],
  "spine": [
    {"href": "http://example.com/chapter1", "type": "text/html"}
  ]
}

If the publication has an additional identifier, this can be provided in its metadata:

"metadata": {
  "title": "The Master and Margarita",
  "identifier": "urn:isbn:9780141180144"
}

That second identifier is not a requirement in the Readium model, and we can't expect all Web Publications to have such an identifier either.
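
As a rough illustration of the difference between the required self link and the optional identifier, the sketch below reads both from a parsed manifest. The helper name `manifest_identifiers` is hypothetical, and the JSON is the example from this comment with the ISBN added; it is not taken from any Readium implementation.

```python
import json
from typing import Optional, Tuple

# The example manifest from this comment, with the optional identifier added.
MANIFEST_JSON = """
{
  "@context": "http://readium.org/webpub/default.jsonld",
  "metadata": {
    "title": "The Master and Margarita",
    "identifier": "urn:isbn:9780141180144"
  },
  "links": [
    {"rel": "self", "href": "http://example.com/manifest.json", "type": "application/webpub+json"}
  ],
  "spine": [
    {"href": "http://example.com/chapter1", "type": "text/html"}
  ]
}
"""


def manifest_identifiers(manifest: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (URL of the self link, optional metadata identifier)."""
    # The self link says where the manifest lives on the Web.
    self_href = next(
        (link.get("href") for link in manifest.get("links", []) if link.get("rel") == "self"),
        None,
    )
    # The metadata identifier is optional, so this may be None.
    identifier = manifest.get("metadata", {}).get("identifier")
    return self_href, identifier


self_href, identifier = manifest_identifiers(json.loads(MANIFEST_JSON))
print(self_href)   # http://example.com/manifest.json
print(identifier)  # urn:isbn:9780141180144
```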

The reason why most of our current samples have URNs (mostly for ISBNs or UUIDs) is because we ingest EPUB files or provide samples for books where ISBNs are very common.

dauwhe · 2017-07-05T13:01:48Z

> I would like to add:
> 7. language(s) used in the WP - the plural is due to the fact that we will have publications such as parallel texts (original + one or more translations) and bilingual dictionaries, which contain 1-n languages. The language used also has implications for rendering (e.g. "ltr" vs "rtl", vertical layout).

My only concern is that HTML already has mechanisms for describing the language(s) of content. What happens when a user agent opens an HTML page declared with language A, finds a rel=manifest link, follows it, and sees language B declared?

HadrienGardeur · 2017-07-05T13:11:37Z

> My only concern is that HTML already has mechanisms for describing the language(s) of content. What happens when a user agent opens an HTML page declared with language A, finds a rel=manifest link, follows it, and sees language B declared?

The manifest declares the language for the publication, while HTML is meant to declare the language for that resource.
The UA would simply set the default to language B but override that option with language A as it displays or interacts with that HTML page.
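
The precedence described here can be sketched in a few lines. This only illustrates the rule as stated in this comment, not specified UA behavior; the function name `effective_language` and its parameters are hypothetical.

```python
from typing import Optional


def effective_language(
    manifest_language: Optional[str],
    resource_language: Optional[str],
) -> Optional[str]:
    """Pick the language to use while displaying one resource."""
    # The resource's own declaration (for example the HTML lang attribute)
    # overrides the publication-level default.
    if resource_language:
        return resource_language
    # Otherwise fall back to the language declared in the manifest.
    return manifest_language


# Manifest declares language B ("fr"), the HTML page declares language A ("en").
print(effective_language("fr", "en"))  # en
# A page without its own declaration inherits the manifest default.
print(effective_language("fr", None))  # fr
```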

llemeurfr · 2017-07-05T14:04:49Z

> you're mixing up two different concepts regarding the Readium Web Publication Manifest.

That's right. If a Web publication is copied to another website, this value will not be modified. Therefore a possible definition of the self link is "The original location of the Web Publication", which can be aligned with Requirement 8 for Web Publications: "There should be a way to uniquely identify a Web Publication."

HadrienGardeur · 2017-07-05T14:10:56Z

From RFC5988:

o Relation Name: self
o Description: Conveys an identifier for the link's context.
o Reference: [RFC4287]

WSchindler · 2017-07-05T15:36:10Z

It's of course true that via @lang or @xml:lang, you may define the language(s) used in your HTML. I still think that the point of entry for a UA consuming a WP would be the manifest, where it would be helpful to find information on the languages used in the WP. If you have a Chinese-English dictionary, it is IMO no trivial task to prepare the rendering.

lrosenthol · 2017-07-05T16:15:30Z

Actually, I would expect the UA to completely ignore the language settings (A, in this case) in the manifest - and only concern itself with the actual resource being processed/rendered (B, in this case). The language (or languages) in the manifest have no bearing on the actual content - they are (IMO) informational only.


lrosenthol · 2017-07-05T16:16:20Z

> If a Web publication is copied to another website, this value will not be modified

That's not necessarily true. The new site may well change the link(s) in the manifest. There is nothing about it that is "off limits" - certainly not in a WP, and possibly not even in a PWP.


HadrienGardeur · 2017-07-05T16:21:20Z

> Actually, I would expect the UA to completely ignore the language settings (A, in this case) in the manifest - and only concern itself with the actual resource being processed/rendered (B, in this case). The language (or languages) in the manifest have no bearing on the actual content - they are (IMO) informational only.

While rendering content, sure I fully agree. But a UA can provide additional services on top of it, for example dictionaries or search. The publication metadata can be useful in that regard.

mattgarrish · 2017-07-05T16:21:38Z

> I would expect the UA to completely ignore the language settings (A, in this case) in the manifest

I agree it's informative and must not be used for rendering content (or metadata), but the same question about value has been raised in epub revisions and the case has been made that it does have uses (e.g., pre-loading tts languages, offering access to dictionaries, etc.).

lrosenthol · 2017-07-05T16:23:26Z

> But a UA can provide additional services on top of it, for example dictionaries or search. The publication metadata can be useful in that regard.

It could indeed be useful - and whether a UA chooses to use it for that or not is (IMO) out of scope for our work.

HadrienGardeur · 2017-07-05T16:24:54Z

> It could indeed be useful - and whether a UA chooses to use it for that or not is (IMO) out of scope for our work.

Defining the UA behavior is out of scope, but making sure that it has relevant info needed is definitely within scope.

dauwhe · 2017-07-05T17:03:05Z

This issue was moved to w3c/wpub#6

dauwhe added Manifest labels Jun 27, 2017

dauwhe mentioned this issue Jul 5, 2017

Information content of the abstract manifest w3c/wpub#6

Closed

dauwhe closed this as completed Jul 5, 2017
