Calculating the Publication.positionList #151
Replies: 21 comments
-
After today's call, the group consensus is:
-
I implemented the solution for Swift: This adds two properties to
Regarding the EPUB implementation, I used a size of 3500 bytes for splitting a resource into a number of positions. This amounts to about one page of text on an iPad. Of course, this value needs to be the same on every platform, so we might need to experiment a bit to find the sweet spot.
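To make the byte-based split concrete, here is a minimal sketch in Python (not the Swift code mentioned above). It assumes each reading order resource is described by a hypothetical (href, byte_length) pair, that every resource gets at least one position, and that positions are numbered from 1 across the publication. None of these choices is confirmed by the thread.

```python
import math

# Bytes per position: the value quoted in the comment above, still open to tuning.
POSITION_SIZE = 3500


def position_count(resource_length: int, position_size: int = POSITION_SIZE) -> int:
    """Number of positions for a resource of the given byte length.

    Assumption: even an empty resource counts as one position.
    """
    return max(1, math.ceil(resource_length / position_size))


def build_positions(reading_order):
    """Build a flat position list from (href, byte_length) pairs (hypothetical input shape)."""
    positions = []
    for href, length in reading_order:
        count = position_count(length)
        for i in range(count):
            positions.append({
                "href": href,
                # Assumption: 1-based index across the whole publication.
                "position": len(positions) + 1,
                # Start of this slice, as a fraction of the resource.
                "progression": i / count,
            })
    return positions
```

Because the result depends only on byte lengths and the constant, two platforms using the same constant would produce the same list for the same file.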
-
Note that the size of a "page" for this calculation should preferably be compatible with the notion of "printed page" in LCP. The LCP spec contains: "For the print right, a page is defined as follows: ... 1024 Unicode characters (not bytes)." Two solutions:
Note also that if a page-list is provided with an EPUB file, Publication.positionList should IMO reflect it, with no need for computed values.
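The difference between counting characters and counting bytes can be seen with a small Python sketch. The function names are made up for illustration, and using 1024 as the byte size is only there for comparison. Python's len() on a string counts Unicode code points, which is taken here as the meaning of "Unicode characters".

```python
import math

# Characters per page in the LCP definition quoted above.
LCP_PAGE_CHARS = 1024


def page_count_by_characters(text: str, page_size: int = LCP_PAGE_CHARS) -> int:
    """Pages when the unit is Unicode code points."""
    return math.ceil(len(text) / page_size)


def page_count_by_bytes(text: str, page_size: int = LCP_PAGE_CHARS) -> int:
    """Pages when the unit is UTF-8 encoded bytes (same size, for comparison only)."""
    return math.ceil(len(text.encode("utf-8")) / page_size)


latin = "a" * 2048   # 1 byte per character in UTF-8
cjk = "漢" * 2048    # 3 bytes per character in UTF-8

print(page_count_by_characters(latin), page_count_by_bytes(latin))  # 2 2
print(page_count_by_characters(cjk), page_count_by_bytes(cjk))      # 2 6
```

For mostly-ASCII text the two counts are close, but for CJK text a byte-based split yields about three times as many "pages" as the LCP definition would.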
-
Since we already agreed that the
Retrieving the current position in the web view is particularly difficult for dynamic books, because the rendered DOM might not be the same as the one in the
I agree that it would be better to reflect the actual
There are some technical difficulties in retrieving the current position in the web view if we would use
-
Is there any discussion or specification on how to calculate the
-
I don't think that it makes sense to calculate positions for audiobooks. We can use the temporal media fragments and
-
If we agree on 1024 bytes as the distance between two
Please add a thumbs up if you agree with 1024 bytes.
-
It seems we all agree that positions are defined as an approximation of the notion of page, when this notion is not expressed clearly in an ebook. Please comment if you disagree.
I didn't think about this one. But do we need to compute a current
-
We've already agreed on this during a call, but I've added a 👍 anyway.
I disagree with that statement: I think that we absolutely need to compute a position list for every EPUB. A page-list is not the equivalent of a position list:
-
IMO not providing a
While I agree that it makes less sense for audiobooks, I think it's still worthwhile to generate a
-
That is right.
Reading this part, I'm wondering why publishers would explicitly add page lists if they don't allow users to share and reference them in a proper way. If there is a notion of position list on one side and a notion of page list on the other side, I don't see how we can design a good UX in reading apps. We are told that exposing page lists like we expose a ToC is bad, and that they should be accessed via "go to" actions ... like position lists.
-
With strings, the only way we can expose a page-list is like a ToC. You can't build an affordance with a text field; that would be a usability nightmare.
-
Here are the PRs adding
-
A best practice should be discussed among app developers. In Thorium, we have planned a "go to page x" affordance and need to map it to the proper locator. If there is a page-list in the publication, this seems to be the proper list to use. If there is none, the position-list seems to be the proper fallback. Having a "go to" wired to positions plus an additional ugly screen of page numbers is a bad solution.
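As an illustration of that fallback rule only (this is not Thorium's code), here is a small Python sketch. The input shapes are hypothetical: page_list is a list of (title, href) pairs, positions is a list of hrefs with one entry per computed position, and matching page labels case-insensitively is an assumption.

```python
def resolve_go_to(label, page_list, positions):
    """Resolve a "go to" request to an (href, source) pair, or None if nothing matches.

    page_list: (title, href) pairs from the publication's page-list, possibly empty.
    positions: hrefs, one per computed position, in reading order.
    """
    wanted = label.strip()
    if page_list:
        # A page-list exists: page labels are arbitrary strings ('45', 'IX', ...).
        for title, href in page_list:
            # Assumption: labels are compared case-insensitively.
            if title.strip().lower() == wanted.lower():
                return href, "page-list"
        return None
    # No page-list: fall back to the 1-based position index.
    if wanted.isascii() and wanted.isdigit():
        index = int(wanted)
        if 1 <= index <= len(positions):
            return positions[index - 1], "positions"
    return None
```

With this rule a single text field serves both cases, at the cost of the same input meaning different things depending on whether the publication ships a page-list.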
-
Which type of field do you plan on using for that affordance?
-
The current "go to page" affordance in Thorium is a simple text field where users type in an arbitrary single-line string of characters. This is sufficient to meet accessibility requirements for the classroom scenario: "teacher asks students to open page '45' (or 'IX' in Roman numerals) in their printed publications, or in the digital equivalent provided by the EPUB3
The notion of "position" discussed here (i.e. fragmentation of publication resources in the
-
Which affordance should be associated with such data then, if any?
-
A similar one where the field doesn't accept a string but simply an integer. You can also add +/- buttons or a SeekBar.
-
As we can see from Apple Books, it recalculates page numbers dynamically in the background on every font or font size change and on screen rotation. As I see it, we can do a background calculation too. What we need to do for the calculation:
It is not so difficult to get the total number of pages for the current display settings, but we then have to match each virtual page with the one in the book, that is, form a list of locators. I think we could do it in step 2: for each page, just get the current locator and form some kind of Dict [Locator: real_page_number], or just an Array of structs. I think the book outline uses the same mechanism, as it calculates pages in real time when I change the settings and open the Table of Contents of a large book. Any idea how else we could match virtual pages to a precise location in the Publication?
-
@emartinson the Apple Books approach is by far the worst that I've seen on the market:
Implementers can add this type of calculation with their own take on a position list, but I would not recommend going in that direction.
-
cc @llemeurfr you can read this whole thread again, but as you can see I was already warning you about position list vs page-list and the UX issues tied with page-list all the way back in February 2020.
-
We need to specify for each format:
How to create the positionList?
How to find out reliably the currently displayed position from positionList?
Related issue: Total progression in a publication for locators
CBZ and PDF
Those formats are straightforward: we can directly read the number of pages for PDF, and the number of files for CBZ, to build the positionList. To retrieve the current position, we just need the index of the page.
PDF can be a bit less efficient because we need to open the file (and potentially load it entirely in memory, eg. with Swift) to read its number of pages.
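A minimal Python sketch of the page-based case, under a few assumptions: the reading order is given as hypothetical (href, page_count) pairs, a CBZ image is treated as a resource with a single page, and positions are numbered from 1. The function names are made up for illustration.

```python
def build_paged_positions(reading_order):
    """One position per page, from (href, page_count) pairs (hypothetical input shape).

    Returns the flat list of (href, page_index) positions and, for each href,
    the 1-based position of its first page.
    """
    positions = []
    first_position_by_href = {}
    for href, page_count in reading_order:
        first_position_by_href[href] = len(positions) + 1
        for page_index in range(page_count):
            positions.append((href, page_index))
    return positions, first_position_by_href


def current_position(first_position_by_href, href, page_index):
    """Position of the visible page: the resource's first position plus the 0-based page index."""
    return first_position_by_href[href] + page_index
```

For a single PDF the lookup reduces to page_index + 1, and for a CBZ it reduces to the index of the image in the reading order plus one.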
LCPDF
LCPDF contains encrypted PDFs, so we can't really get the positionList until the license is unlocked. It might also contain several PDFs, which is not very efficient if we have to open all of them to calculate the positionList.
An alternative would be to have the number of pages as a link property for each resource in the RWPM. Building the positionList is then really efficient and doesn't require the publication's passphrase.
The positionList is built by adding the number of pages of each PDF in the readingOrder. Here's an example implementation in Swift: https://github.com/readium/r2-navigator-swift/blob/839e0c4900a84b9e337e7a3d836f0b78c7d9c28b/r2-navigator-swift/PDF/PDFNavigatorViewController.swift#L50
We can find out the current position easily by keeping a separate array of positions for each resource href, and using the page index of the currently visible resource (eg. https://github.com/readium/r2-navigator-swift/blob/839e0c4900a84b9e337e7a3d836f0b78c7d9c28b/r2-navigator-swift/PDF/PDFNavigatorViewController.swift#L221).
EPUB
The tricky part that needs to be discussed...
How to create the positionList?
Among the solutions discussed to split a resource into pages:
positionList might be different. Not a good solution IMHO.
Both the characters and bytes methods are pretty reliable to express the relative size between reading order resources and publications, as long as the chapters are not image based.
How to find out reliably the currently displayed position from positionList?
I think we agreed on a call that there's no way to accurately find the current position in an EPUB. The DOM displayed in a web view is dynamic and might not be equivalent to the one parsed from the static XHTML files. We can however approximate it:
progression has been a pretty reliable way to position a page in a web view, and it could work well across platforms here too. It's not such a problem if we don't match the exact position that was split arbitrarily (bytes or characters), as long as we are reliably imprecise across devices. We need to end up at the same page when sharing a position index between devices. On the plus side, this is easy to implement, so we can run some tests quickly.
Fixed layout vs reflowable
There's the added difficulty that an EPUB can contain both fixed layout and reflowable resources. Fixed layout is straightforward, one resource = one page. But we need to take it into account when calculating the positionList instead of only splitting by characters/bytes.
Side discussion
Calculating the positionList might be slow and memory/CPU-intensive (eg. for LCPDF we have to load all the PDFs in memory). I don't think that it's necessary to expose an asynchronous API for Publication.positionList. The caller can wrap it in a background process if it doesn't need the positionList synchronously.
However, we could benefit from having a cache in the streamer to store the calculated positionList (eg. as JSON).
positionList. For information, we don't expose the release identifier in Publication, but it can be retrieved privately directly in the streamer for EPUB.
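A JSON cache along those lines could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function name, the file naming, and cache_key, which stands for some stable, filesystem-safe identifier of the publication. How that key would be derived is not settled by the text above.

```python
import json
from pathlib import Path


def cached_position_list(cache_dir, cache_key, compute):
    """Return the position list for cache_key, computing and storing it on a cache miss.

    cache_dir: directory holding the cache files (must already exist).
    cache_key: hypothetical stable, filesystem-safe identifier for the publication.
    compute:   zero-argument function returning a JSON-serializable position list.
    """
    path = Path(cache_dir) / f"{cache_key}.positions.json"
    if path.exists():
        # Cache hit: skip the potentially slow calculation.
        return json.loads(path.read_text(encoding="utf-8"))
    positions = compute()
    path.write_text(json.dumps(positions), encoding="utf-8")
    return positions
```

The slow calculation then runs once per publication, and the call stays synchronous for the caller, which can still wrap it in a background task if needed.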