Whitespace is swallowed after italic #14

lyderic · 2017-12-30T22:11:46Z

Hi,

kepub files produced with kepubify "swallow" the whitespace after words in italic.

For example, I used it with the French epub file found here:
https://www.ebooksgratuits.com/newsendbook.php?id=2610&format=epub

In the "Préface" (foreword), whenever the italic word Belgica is used, the following word becomes glued right next to it, without whitespace, for example:

...l'exploration de la Belgicaprécède...

instead of:

...l'exploration de la Belgica précède...

I am using version 1.3.5 under Linux. I have also built the current "dev" version from source, but it didn't help.

Thanks for this fantastic tool!

Cheers,
L.

pgaskin · 2018-01-03T03:25:59Z

Thanks for reporting this issue! I have looked, and can confirm the issue.

I'll fix it sometime this week.

pgaskin · 2018-01-05T22:40:46Z

This should be fixed in release v1.3.6 now.

lyderic · 2018-01-07T09:43:53Z

Many thanks. I confirm that it is fixed for the epub I reported. It should be the same for all other epubs from this source. If not, I will let you know when a problem pops up.

Cheers,
Lyderic


- Improved robustness
  - More is implemented directly in the HTML parser and renderer (see my fork of x/net/html)
  - Better support for XHTML and HTML5 (rather than using a bunch of workarounds)
  - No more regexps for modifying HTML
- Better smart punctuation
  - More punctuation supported
  - More robust (won't apply to everything unconditionally)
  - Now off by default
- Faster and more efficient (15-30% faster, 50-70% less memory)
  - Fewer memory allocations and copies due to use of readers and writers rather than storing the entire file in memory multiple times
  - Stack-based span adding algorithm (rather than recursive, which has more runtime and memory overhead)
  - Use byte arrays or runes rather than strings where possible
  - Better parallel processing of content files
  - Eliminated memory, goroutine, and file descriptor leaks
- Cleaner and better code
  - Easier to extend
  - More stable API
  - More complete unit tests
- More accurate sentence splitting and segment numbering (checked against 3 recent free books)
  - Better match Kobo's behavior by preserving, but not wrapping (in a koboSpan), TextNodes with only whitespace. Previous versions of kepubify collapsed them to a single space, which still works, but is less efficient and slightly different from what Kobo does (although it renders the same).
  - Fixed some edge cases where the segment counter could be incorrectly incremented.
  - Also increment paragraph counter for tables (this case was missing before).
  - Don't increment paragraph counter if no spans were added (i.e. an empty or only whitespace paragraph element) (this case was missing before).
- Smaller binary size
- Also run tests on Windows

closes #47, fixes #45, fixes #35
better fix for #36, #29, #28, #26, #21, #14, #10, #5, and #2
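
The whitespace handling and the stack-based span adding described in the list above can be illustrated with a minimal sketch. This is not kepubify's actual code: the `Node` type, `AddSpans`, and `Render` are hypothetical names invented for illustration, the tree is a tiny stand-in for a real HTML parser's nodes, the paragraph number is fixed at 1, and no sentence splitting or HTML escaping is done. It only shows the idea that text nodes with visible content get wrapped in a span while whitespace-only text nodes are preserved untouched, so the space after an italic word survives.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, minimal stand-in for an HTML node:
// an element (Tag != "") with children, or a text node (Tag == "").
type Node struct {
	Tag      string
	ID       string // only set on the spans added by AddSpans
	Text     string // only used by text nodes
	Children []*Node
}

// frame identifies a node by its parent and position, so it can be replaced in place.
type frame struct {
	parent *Node
	idx    int
}

// AddSpans visits the tree in document order using an explicit stack
// (no recursion). Text nodes with visible content are wrapped in a span;
// whitespace-only text nodes are left exactly as they are.
func AddSpans(root *Node) {
	seg := 0
	var stack []frame
	// push adds the children in reverse so they pop in document order.
	push := func(n *Node) {
		for i := len(n.Children) - 1; i >= 0; i-- {
			stack = append(stack, frame{n, i})
		}
	}
	push(root)
	for len(stack) > 0 {
		f := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		n := f.parent.Children[f.idx]
		if n.Tag != "" {
			push(n) // element: descend into its children
			continue
		}
		if strings.TrimSpace(n.Text) == "" {
			continue // whitespace-only: preserve, but do not wrap
		}
		seg++
		// Assumption: paragraph number fixed at 1 for this sketch.
		f.parent.Children[f.idx] = &Node{
			Tag:      "span",
			ID:       fmt.Sprintf("kobo.1.%d", seg),
			Children: []*Node{n},
		}
	}
}

// Render serializes the tree (recursively, for brevity; no escaping).
func Render(n *Node) string {
	if n.Tag == "" {
		return n.Text
	}
	var b strings.Builder
	b.WriteString("<" + n.Tag)
	if n.ID != "" {
		fmt.Fprintf(&b, ` class="koboSpan" id=%q`, n.ID)
	}
	b.WriteString(">")
	for _, c := range n.Children {
		b.WriteString(Render(c))
	}
	b.WriteString("</" + n.Tag + ">")
	return b.String()
}

func main() {
	// Hypothetical input: <p>de la <i>Belgica</i> <b>précède</b></p>
	doc := &Node{Tag: "p", Children: []*Node{
		{Text: "de la "},
		{Tag: "i", Children: []*Node{{Text: "Belgica"}}},
		{Text: " "},
		{Tag: "b", Children: []*Node{{Text: "précède"}}},
	}}
	AddSpans(doc)
	fmt.Println(Render(doc))
}
```

In this sketch the single space between the `i` and `b` elements stays a bare text node between the two wrapped words, which is the behavior the list describes as matching Kobo's own output.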

pgaskin added the bug label Jan 3, 2018

pgaskin self-assigned this Jan 3, 2018

pgaskin closed this as completed in a4dea55 Jan 5, 2018

pgaskin mentioned this issue Mar 4, 2018

Elements only containing nbsps are removed #21

Closed
