Minimum Viable Manifest #15
In Readium Web Publication Manifest, we consider that a minimum viable manifest has:
Its identification as a WP manifest is handled by a dedicated media type ( I know that @lrosenthol has challenged the requirements for both a title and a reading order, but I still feel that this is a good starting point. For the other things that you've mentioned, I'd say:
IMO none of these other elements are in a minimum viable manifest. |
From @TzviyaSiegman
replace identifier by URL and I agree. A manifest, de facto, has a URL. From @HadrienGardeur / Readium2
at some point I suppose we'll need to discuss Also, is the manifest's URL could be interpreted as the canonical locator by default if
Right. This needs to be discussed, probably in a separate thread. |
As a comparison, note that the manifest object in Web App Manifest does not have an identifier member. But it does have a URL of course:
|
Is this issue intended to provide fine details of the manifest? Or is it |
"Default" reading order should be essential. |
@avneeshsingh wrote
Even though I'd consider reading order information important to a publication, I don't think it should be required in a manifest; e.g., it seems sufficient if user agents render the first (primary) resource when they can't find a reading order (whatever it may be). |
To answer your questions/points @rdeltour
But the same author (Mark Nottingham) provides an updated definition in Web Linking (RFC 5988, the spec that introduces the Link header to HTTP):
On the other hand,
In our case I would argue that from the manifest itself,
It could, but since the manifest will definitely be distributed in various ways where you won't be dealing with HTTP (for example when the manifest is in a package), you must have another way to provide that URL. |
@pkra wrote:
IMO the list of primary resources and the reading order should be the same thing. |
“list of primary resources and the reading order should be the same thing.”
So, the order of listing the primary resources becomes the reading order. This is worth considering.
However, there are complexities. E.g., if 10 HTML pages are primary resources and 5 of these pages have an audio player in them, pointing to an MP3 file, would we then not list this MP3 file as a primary resource?
|
@HadrienGardeur wrote
Interesting thought. I guess I'm not clear on the distinction of primary and secondary. But that's a separate issue. |
I agree with @HadrienGardeur <https://github.com/hadriengardeur> that the
primary resources and the reading order are one and the same. I've been
trying to come up with an example of a primary resource (as we've been
thinking about it) that wouldn't be in the default reading order, and I
haven't found one as yet.
I am not clear what the difference is between "navigation" and the "default
reading order" - I see them as exactly the same thing. So given the DRO
and primary resource alignment above, then we can also merge nav into that
as well and kill three birds.
I would group all metadata together (including title and lang) and make it
optional. So far, no metadata has been identified as being required for a
WP. However, I do believe that as we move to PWP and EPUB4, there will be
fields that we will want/need required.
Secondary resource listing gets into some significant complexities (esp.
with relative URL resolution) when you consider the publication
editing/updating. I would like to avoid this at all costs.
…On Fri, Aug 4, 2017 at 6:42 AM, Peter Krautzberger ***@***.*** > wrote:
@HadrienGardeur <https://github.com/hadriengardeur> wrote
IMO the list of primary resources and the reading order should be the same
thing.
Interesting thought. I guess I'm not clear on the distinction of primary
and secondary. But that's a separate issue.
|
There's more to navigation than just the reading order: navigation is also about ToC (deep links to the content), or page lists, or landmarks, etc. |
In general I agree, but where does that place "non-linear" resources? If a resource opens in a new window, is it primary or ... ? I'm not fully comfortable with the definitions we have, even if they'll do for now. But this is another issue. |
On Fri, Aug 4, 2017 at 8:56 AM, Matt Garrish ***@***.***> wrote:
IMO the list of primary resources and the reading order should be the same
thing.
In general I agree, but where does that place "non-linear" resources?
What's a non-linear, primary, resource?
If a resource opens in a new window, is it primary or ... ?
For the case of WP, I don't think it matters or that we care. That's up
to the author and out of scope.
For PWP, there are significant security concerns with this which we'll
address when we get there.
|
On Fri, Aug 4, 2017 at 8:19 AM, Romain Deltour ***@***.***> wrote:
I am not clear what the difference is between "navigation" and the
"default reading order" - I see them as exactly the same thing.
There's more to navigation than just the reading order: navigation is also
about ToC (deep links to the content), or page lists, or landmarks, etc.
Thanks. Those are definitely non MVP then, since many publications won't
have them.
But the details s/b discussed in a separate issue, as it may not belong to
the minimum viable manifest.
Yes, sounds like we need to have discussions about this navigation
stuff...and also how much is at the WP level...
|
Navigation and default reading order are surely different. And default reading order cannot even satisfy the essential part, hierarchy, unless there are strict conditions applied on content documents. |
@murata0204
This issue is intended to create a working version of the manifest for FPWD. We will (necessarily) refine it as the details of WP emerge. |
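I think the minimum viable manifest would include:
1. A list of the primary document resources, in default order
2. a title
3. some way of declaring that this is a web publication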
Lots of other things are nice to have, but it's not impossible to imagine life without them. A web manifest would have a URL, which could serve to identify the web publication. For some cases, perhaps nothing more is needed. |
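To make the three items above concrete, here is a minimal sketch in Python of what such a manifest and a completeness check might look like. The member names `type`, `title` and `readingOrder` are assumptions made up for illustration; the thread has not settled on a serialization, and whether a title is required at all is still disputed.

```python
from typing import List

# Hypothetical member names; nothing in this thread fixes them.
REQUIRED_MEMBERS = ("type", "title", "readingOrder")


def missing_members(manifest: dict) -> List[str]:
    """Return the required members that are absent or empty."""
    missing = []
    for name in REQUIRED_MEMBERS:
        value = manifest.get(name)
        if value is None or value == "" or value == []:
            missing.append(name)
    return missing


minimal_manifest = {
    "type": "WebPublication",   # 3. some way of declaring a web publication
    "title": "Example title",   # 2. a title
    "readingOrder": [           # 1. primary resources, in default order
        "chapter-1.html",
        "chapter-2.html",
    ],
}

print(missing_members(minimal_manifest))
```

If the declaration ends up being carried by a dedicated media type instead, the `type` member would simply drop out of the required list.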
I still continue to object to 2 - titles are optional.
Otherwise, I agree that 1 & 3 are MVP.
…On Mon, Aug 7, 2017 at 11:40 AM, Dave Cramer ***@***.***> wrote:
I think the minimum viable manifest would include:
1. A list of the primary document resources, in default order
2. a title
3. some way of declaring that this is a web publication
Lots of other things are nice to have, but it's not impossible to imagine
life without them. A web manifest would have a URL, which could serve to
identify the web publication. For some cases, perhaps nothing more is
needed.
|
I still feel that 3 (some way of declaring that this is a web publication) is implicit as long as we have a dedicated media type. |
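The minimal file that passes HTML5 validation is <!doctype html><title>Hello world</title>. EPUB requires a title. WCAG single A requires titles for web pages. Docbook requires a title. Saving something to the homescreen (a la web app manifest) requires a name. For something so fundamental, and so intrinsic to how humans talk about the world, I would want a fairly compelling reason why a web publication shouldn't need a title. Or, to put it another way, it's easy to imagine untitled documents, but an untitled publication is not yet ready to publish. |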
Accessibility also requires a title. (WCAG 2.4.2) |
I'm wondering about our use cases here:
Since having a locator is a requirement, if we consider that these are valid use cases, then we also need a locator for the manifest in the minimum viable manifest. |
On Mon, Aug 7, 2017 at 12:21 PM, Dave Cramer ***@***.***> wrote:
I still continue to object to 2 - titles are optional
The minimal file that passes HTML5 validation is <!doctype
html><title>Hello world</title>. EPUB requires a title. WCAG single A
requires titles for web pages. Docbook requires a title. Saving something
to the homescreen (a la web app manifest) requires a name.
Yes, but the title could be empty and still be valid. So what's the point
of having it?
I say this from experience with PDF/X where that was exactly what folks did
- put in blank titles. So we removed that requirement in later versions of
the standard.
Or, to put it another way, it's easy to imagine untitled documents, but an
untitled publication is not yet ready to publish.
I disagree. Most documents are actually untitled.
|
@lrosenthol actually, HTML5 requires a title not be blank. Try the validator with <!doctype html><title> </title> (or any other variation of "empty"). Without a meaningful title, it's not an HTML document. Also, @dauwhe's point was that documents are often untitled, but publications only lack titles when unpublished--which I think is a safe assertion. Just clarifying. 👓 |
That's actually agreeing with @dauwhe, who distinguished between "document" and "publication" as a prescriptive definition. |
Regarding the list of primary resources (thank you, @murata0204 for noting that, I had missed it), I think we need to list all the resources that could be used by the publication IF we want to allow packaging of arbitrary WPs. Otherwise it is not just difficult to package, it appears to be impossible in the presence of scripts. |
On Mon, Aug 7, 2017 at 12:48 PM, BigBlueHat ***@***.***> wrote:
@lrosenthol <https://github.com/lrosenthol> actually, HTML5 *requires* a
title not be blank. Try the validator
<https://validator.w3.org/nu/#textarea> with <!doctype html><title>
</title> (or any other variation of "empty"). Without a meaningful title,
it's not an HTML document.
That's fine - but that title is on a single content element. It has
nothing to do with the "publication" itself.
Also, as I noted, validation isn't useful for *ad-hoc* publications since
users won't validate things...
Also, @dauwhe <https://github.com/dauwhe>'s point was that *documents*
are often untitled, but publications only lack titles when
unpublished--which I think is a safe assertion.
For formal publications - I agree. For *ad-hoc* publications, no it is
not.
Which is why I think that something like title can be a requirement for
something like EPUB4 (which is more targeted to formal publications - at
least if it keeps as EPUB3 is) but not for the generic WP.
|
That's not going to be a MUST in practice no matter how hard we insist on it in the spec. Authors are going to have resources they consider to be a part of the publication but which they won't list, either because they can't or because they don't want them to be available offline. If we require as a 'must' listing all resources an author considers to be a part of the publication—while knowing the near certain reality that it won't be treated as such in practice—then we severely undermine the credibility of the spec as a standard. 'Web publication' then becomes something that's largely defined by the undocumented specifics of each implementation because the implementations certainly won't be following the actual spec. |
Saying 'all the resources the author wants to be made available offline MUST be listed in the manifest' is a much more realistic statement than 'all resources the author considers part of the publication MUST be listed in the manifest'. |
We're defining a minimum viable manifest, and based on that sentence you agree that secondary resources are not required. Great, we can move on to the next point. |
@HadrienGardeur Cool! I am totally happy with: All primary and secondary resources MUST be listed in the manifest. I could also go for: All publication resources MUST be listed in the manifest. I thought there was disagreement about that, but happy there isn't! Let's move on. |
WCAG specifies the following:
2.4.2 Page Titled: Web pages have descriptive titles
2.4.5 Multiple Ways: More than one way is available to locate content within a set of Web pages
Without title and navigation we are heading towards inaccessible publications. |
I may have a different thing in mind for what a minimal (abstract) manifest is. And if I am alone with this, I will shut up:-) My mental model is akin to the XML Infoset standard (much as XML is out of fashion these days): that standard defines what types of information items are available to an XML processor in the abstract sense, regardless of how they are defined, serialized, etc., in practice. The same applies here: what we have to decide is which information items a WP user agent must have and should have to properly handle a WP. How that information is provided, and what is encoded directly in JSON or, God forbid, in XML, is a different matter, but having this abstract set as a guiding principle is important. It is interesting to see that the resolution the "title" issue is converging towards makes my point:
What this means in my thinking is that
And I am not sure we do agree about the information items that must be available. Take the language. My point is: for a publication, language information MUST be available. (Think of the fact that a proper rendering of a text, if we want a publication to be rendered according to the typesetting traditions of a local culture, relies on that information.) That being said, I do not think a concrete manifest MUST include a language tag, recognizing the fact that the language tag may be deduced from the primary resources (just like the title) and it even has a well-defined default. But, again: my view of what a minimal viable manifest is may be different from others'. |
@iherman I was told during the call that we're not discussing an abstract manifest anymore, so I'm a little lost here... But going back to @TzviyaSiegman initial list, here's my list of requirements based on what you've said:
The following points are IMO just SHOULD and not requirements:
A WP SHOULD contain a title, which defaults to the title of the first HTML primary resource
Since WP should IMO support audiobooks and comics, we can't expect to always have an HTML primary resource.
A WP SHOULD contain navigation, which defaults to extracting navigation from HTML documents listed as primary resources
For the same reason, while navigation can default to extracting an outline from HTML documents, we can't always expect the presence of HTML which means that navigation cannot be a requirement.
A WP SHOULD contain a language
For the language, HTML documents already have their own default and we don't really need a language for audio/video/images that might be listed as primary resources. |
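The title fallback described above could be sketched roughly as follows, using only Python's standard `html.parser`. The function names and the `title` manifest member are hypothetical, and the sketch assumes the caller has already fetched the markup of the first primary resource, or passes `None` when that resource is not HTML (an audiobook or a comic, say).

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element it encounters."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def html_title(markup: str) -> Optional[str]:
    """Return the non-blank <title> text of an HTML document, if any."""
    parser = _TitleParser()
    parser.feed(markup)
    parser.close()
    title = "".join(parser.title_parts).strip()
    return title or None


def effective_title(manifest: dict, first_resource_markup: Optional[str]) -> Optional[str]:
    """Prefer an explicit manifest title; otherwise fall back to the HTML title."""
    explicit = manifest.get("title")  # hypothetical member name
    if explicit:
        return explicit
    if first_resource_markup is None:
        return None  # no HTML primary resource to fall back on
    return html_title(first_resource_markup)
```

The `None` result is the case this comment is worried about: with no explicit title and no HTML primary resource, a user agent has nothing to display.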
As I said, if this is the group consensus, I shut up. (I do not remember this from yesterday, but that may be my fault).
I am fine saying it MUST but isn't it so that if it is on the Web it has an HTTP(S) locator? What other ones do you have in mind?
Disagree. A locator is not necessarily an identifier; many locators are used in a way that if the WP is moved, the locator of course changes (i.e., no redirection is necessarily set up). One of the important aspects of an identifier is its immutability. I think we should stick with the SHOULD as described yesterday.
Must identify itself: yes. But whether we will use media types or not, not sure. (One of the problems I foresee is that it is not possible to define a three level hierarchy in media types. Ie, we can say json+manifest but we cannot say json+ld+manifest if we go down that line. I would leave this open at this point.)
I am personally fine with this, although I do not believe there is a consensus in the group.
Your argument in #15 (comment) on audiobooks and comics is compelling, but I am not sure there is a consensus. I am still more in favour, instinctively, on making a title a must, but I won't lie down the road over this:-) B.t.w., I think the consensus was in direction of the title in the first primary resource. If that is an SVG file, it may have a title...
I see your point for comics; I cannot really decide. (Although I am not sure I see how comics could be enjoyed without an order but, I must admit, I never read comics...)
We must be careful about that one. The language is for the content, but not only. It may be important for any type of metadata that we add. But, thinking it further: I may be fine with a SHOULD for the reason you cite. (Although I would say it is a very strong SHOULD: if there is any kind of textual metadata, including a title, then it is MUST, with a default fallback probably on "en".) |
About the title, I'd like to add that if we want a WP to be accessible by spec, it is a MUST. |
What is the necessity of a title in the manifest for accessibility? I haven't seen that answered yet, and it's an important question. There are things it can help with, like search discovery, that I think will push people to add them, but we haven't defined what a user agent is supposed to do with a manifest. Until then, it's hard to give an objective opinion about the imperative for a title. For example, a title is necessary for an HTML page because it's what is announced when the page is loaded, and what is displayed as a title for the tab/browser. It's also important for a packaged EPUB to have one in the package document because it's the primary source of information for the user. And I'd argue it's necessary for a PWP for the same reason. But who has access to the manifest on the open web? When would an AT ever interact directly with the manifest? It will only have access through the user agent. Is the user agent expected to expose the name of the publication? If it does expose a title, when and how does that happen? Doesn't the publication title need to be available regardless of support for manifests? (i.e., is the manifest more like a hidden enhancement layer?) If we have an answer that makes a title essential, then making titles required is a no-brainer. Otherwise, we're straying into the territory of mandating metadata for the sake of metadata, or because we think it might be useful.
That boat sailed when we didn't put any restrictions on the content of the web publication. |
I disagree with the 2nd MUST. If there are legitimate use cases that are not based on HTTP (I'm not sure about that btw, mostly because of the Web security model, but it's another debate), the locator could be defined explicitly, but I don't see why it MUST be by default, when the implicit definition is fine in the vast majority of cases. |
I have created #22 to discuss which resources should be listed in manifest with regard to offlining Web Publications |
We need such a locator to be explicit for the same reasons that Atom has an explicit "self" link. The manifest will be cached/stored/shared in contexts where this information is not available implicitly. This has nothing to do with the Web security model and doesn't affect it either. PWP itself could be a good example of that. If we adopt a ZIP based container for EPUB 4, an explicit locator for the manifest will be the only way to get that info. |
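As a rough illustration of the two positions, the sketch below resolves a manifest's locator by preferring an explicit Atom-style "self" link and falling back to the URL the manifest was retrieved from, when there is one. The `links`, `rel` and `href` member names are assumptions borrowed from Atom's vocabulary, not anything agreed in this thread.

```python
from typing import Optional


def manifest_locator(manifest: dict, retrieved_from: Optional[str] = None) -> Optional[str]:
    """Return the manifest's locator.

    An explicit "self" link wins; otherwise use the retrieval URL, which is
    unavailable when the manifest comes out of a package or a local store.
    """
    for link in manifest.get("links", []):  # hypothetical member name
        if link.get("rel") == "self" and link.get("href"):
            return link["href"]
    return retrieved_from


# A manifest read from inside a package: no HTTP context, so only the
# explicit link can supply the locator.
packaged = {"links": [{"rel": "self", "href": "https://example.org/book/manifest.json"}]}
print(manifest_locator(packaged))
```

In the packaged case without a "self" link the function returns `None`, which is the situation the explicit link is meant to avoid.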
In reply to @iherman:
Without a default identifier, UAs will define their own in an inconsistent way. I know that a locator (URL) is not necessarily immutable, but it's still better than having no default at all and having references that are not valid across UAs. |
I am not really sure what you mean. Without an identifier we have… what we have today with many documents on the Web. Things are indeed messy… so we have to convince publishers in the general sense (remember the best practices document proposal) to use immutable and stable identifiers. If an identifier is not immutable, it is not an identifier as far as I am concerned. |
@iherman without an identifier, how do you expect that UAs will reference a specific WP? (FYI, I'm not arguing for making an explicit identifier a requirement, I don't think that this would be reasonable) |
… which, if I understand correctly, is only a SHOULD in Atom.
Nothing would prevent a UA from adding this info to the manifest at caching/storing/sharing time. To clarify: I'm just saying that I still don't see a compelling reason why the link must be explicit. |
@rdeltour it is a SHOULD in Atom but we have more use cases (offline, packaged) where this is relevant.
Yikes. I know that a proxy can change content, but this shouldn't be a requirement. |
There is a locator, which is a MUST (well, it is there in any case, because the WP is on the Web!). Just like the Web today. And in a sloppy case we know that these locators change because, say, publishers move things around on their website. Users hit this problem all the time... |
@iherman so we almost agree. We both believe that:
The source of our disagreement is that I believe that we should explicitly indicate the locator as a fallback while you'd rather not. I think that we can keep that question open for now and revisit later. |
I wrote this on the title issue (#20) but cross-posting as it seems relevant to the manifest issue in general:
|
Correct.
We can have a mechanism to indicate that the locator is a fallback of some sort if we need it; my major objection is that this fallback must not be considered as an identifier (at least not automatically). @deborahgu had the comment yesterday on the call that a locator is like a URL and the identifier is like an ISBN. She was absolutely right with this comparison. Assigning an identifier to a publication is not only a technical step; it is some sort of a social commitment. It says "this value will not be changed, it will always identify this publication no matter what". It can be used in the formal catalogues of national libraries and archives, in bookmarks that you intend to use 10 years down the line, etc. In some cases, a locator is also an identifier. https://www.w3.org/TR/annotation-model/ is an identifier for the Annotation Model standard, but only because W3C (and MIT) has a public pledge that the "/TR space" of W3C will always be there unchanged, with all its content. (If W3C goes down the drain, the pledge of MIT is to keep it up nevertheless. If MIT goes down the drain… well, that would be the end of the World as we know it:-). But I think we can both agree that there will be many WP-s out there without a similar pledge, and we should not give the impression that it is there. |
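One way to keep the fallback without presenting it as an identifier is to have the user agent record which kind of value it is holding. The sketch below is only an illustration of that idea; the `Reference` type, the `kind` labels and the `identifier` member name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    value: str
    kind: str  # "identifier": declared as stable; "locator": may change


def reference_for(manifest: dict, locator: Optional[str]) -> Optional[Reference]:
    """Pick the value a user agent would use to refer to a publication.

    A declared identifier is used as such. A locator is accepted as a
    fallback, but it stays labelled as a locator so that nothing downstream
    treats it as a stability commitment.
    """
    identifier = manifest.get("identifier")  # hypothetical member name
    if identifier:
        return Reference(identifier, "identifier")
    if locator:
        return Reference(locator, "locator")
    return None
```

A bookmark or catalogue entry built on a `Reference` of kind "locator" can then warn that the value may stop resolving if the publication moves.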
In reply to @iherman
Your proposal definitely works for me, I think it's a good compromise. In reply to @baldurbjarnason
Overall I think that we have a perfectly reasonable list of requirements so far, but I agree that in general we should be careful. So far, the consensus is a single explicit requirement in the manifest:
There are other requirements but TBD or implicit at this point:
|
Addressed by #51 |
See telco discussion on closure. |
Ignoring issues such as location, serialization, etc. What is the minimum viable manifest?
I have extracted requirements from #6
For more detail see
#6 (comment)
#6 (comment)
#6 (comment)
#6 (comment)