Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize command vs. environment whitespace handling #2041

Merged
merged 5 commits into from
Jun 6, 2024

Conversation

alerque
Copy link
Member

@alerque alerque commented Jun 3, 2024

Closes #105

@alerque alerque added this to the v0.15.0 milestone Jun 3, 2024
… commands

BREAKING CHANGE: Input documents using the SIL language will now retain
whitespace more consistently. Whitespace following environment blocks is
no longer swallowed in differently than space following command syntax.
Consecutive line breaks in the input will consistently trigger new
paragraphs no matter what they follow.

Note that this change cannot be patched over via the retrograde package
settings because by the time your document could specify what packages
to load or settings to set, the input document has already been parsed.
To achieve the same rendering results where environments could be ended
leaving any amount of blank lines and still joined to the following
content as part of the same paragraphs, you will need to remove the
extranious whitespace.
@@ -44,10 +44,10 @@ local styles = {
enumerate = {
{ display = "arabic", after = "." },
{ display = "roman", after = "." },
{ display = "alpha", after = ")" },
{ display = "alpha", after = "." },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get what these various changes on the lists have to do with the issue "normalizing command vs. whitespace handling"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't, they should be broken out into another commit, but I was going to ask you about them. The sequence didn't make any sense to me. I already removed the arbitrary limit of 6 levels and set it to just repeat from the set, but the 2-1-2-1 alternation of trailing characters didn't make sense to me. Either ...))) or .).).) would make sense, but where did ..))). come from? I looked around at some word processors and such and couldn't find a precedent for that. Was this something that came out of TeX?

I'm not set on changing it, but as long as all the alignment is getting overhauled I thought it would be a better time than most to make it more logical.

Copy link
Member

@Omikhleia Omikhleia Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked quickly and actually I don't really care as far as concerned: whether using Markdown or Djot, these define the list separators (dots, parens, etc.) based on their own input syntax, so don't use the default styling. I don't really care for my "resilient" lists either, since styles can be overridden at the users convenience1. In other terms, I don't have any strong opinion here on what SILE's enumeration list should use by default.

Footnotes

  1. That is, the default styles currently reflect the same default behavior as SILE's list before this change, but since styling there is an orthogonal issue, I don't think end users are impacted.

@alerque
Copy link
Member Author

alerque commented Jun 5, 2024

Hey @Omikhleia do you have a second to sanity check me on this? Here is the TL;DR version of what's in this PR:

I set out to fix the lingering weirdness in SIL inputs where whitespace was being swallowed unexpectedly. It turned out to be a lot simpler to fix than I expected but it is a breaking change.

This was being processed as two paragraphs:

\thing{}

foo

While this was being processed as one because all environment blocks were swallowing whitespace around them:

\begin{thing}\end{thing}

foo

I just got rid of the whitespace gobbling greediness so now the input spaces matter. Environments don't need to be followed by an explicit \par if there is a blank line creating a paragraph break anyway.

Several of my book projects blew up with this change, but the new behaviour makes so much more sense to me I'm okay with fixing them. Surprisingly the only thing that broke in our test suite was the lists package.

So it turns out that package was always outputting paragraph breaks before and after the command no matter how it was called. This was 3 paragraphs:

foo\enumerate{}bar

This was also 3 paragraphs:

foo

\begin{enumerate}
\end{enumerate}

bar

The latter looked sane being 3 paragraphs, but that was only because the input environment syntax was swallowing the natural looking ones and new ones were being pushed out. After the witespace handling change it became 4 paragraphs (the leading one wasn't being duplicated but two were being output following the list).

Because the way inter-list-item spacing was being handled by overloading document.parskip this was pretty involved to fix. In ended up scrapping all the vertical space handling and re-implementing it leaving the decision for the surroundings to be new paragraphs or not up to the input document and using the lists.parskip only between list items/lists.

The new behavior means parskip is used before and after lists only when the user actually puts the list in its own paragraph, otherwise it flows following the normal leading rules for line spacing. The continuation paragraph is even not indented, while a clean new paragraph afterwards is.

The new behavior is a lot more in line with what I would expect as a consumer, but I'd like to hear your thoughts on the design.

Note also the retrograde package could be expanded to restore some of the old list characteristics, but given that the input language is one thing we can't easily shim and that directly effects how the spacing rules are applied I thought it best not to try to patch over the differences at all.

@Omikhleia
Copy link
Member

The new behavior is a lot more in line with what I would expect as a consumer, but I'd like to hear your thoughts on the design.

I don't use SIL input nowadays, but I'll try to see how these list changes affect (or not) other input languages. It would certainly be best if packages can do their logic (and inserting a "par" is not necessarily wrong!) regardless on how the inputter handles (or fails at handling) paragraphs.

@alerque
Copy link
Member Author

alerque commented Jun 5, 2024

If you are not combining it with SIL input all the relevant changes are in the list package init file. Copying just the packages/lists/init.lua file from this branch into another context should net you most of the relevant bits, but the base class tweak recently (#2043) actually does effect this, so you'll want to also grab classes/base.lua

@Omikhleia
Copy link
Member

so you'll want to also grab classes/base.lua

Wow, thanks for notifying that change to the base class.

Most of my usual workflows don't use SIL, and don't use the regular "lists" package (I'm using resilient's lists with styling); but this base class change might affect things (maybe).
I guess I can check it separately (on its own) independently in my usual workflows... Hmm... although these don't even use the regular base class either, but the "upgraded" version shimmed by silex.sile. In other terms, I am likely not affected in current "complete" workflows (i.e. only the "fair-play" workflows not using "resilient" would perhaps be affected), until I resync silex.sile that is.

@Omikhleia
Copy link
Member

To recap my own potential actions:

  • Port the base class change to silex.sile (base class overrides) and perform a full resilient non-regression testing (all other things equal) to check whether it may affect something.
  • Perform some Markdown/Djot testing without using resilient, to check if the (fallback to) the default lists package and base class affect something.

On my current prioritization schedule, a possible ETA would be early August (in the best case scenario), is it too late?

… lists

BREAKING CHANGE: Lists now respect the input document spacing and normal
settings with regard to paragraphs breaks before, after, and inside
lists. This is place of overriding the paragraph skip settings to match
the list item spacing setting and always forcing paragraph breaks before
and after lists.
@alerque
Copy link
Member Author

alerque commented Jun 5, 2024

I've got ants in my pants to get v0.15 out the door. I've had to publish 2 whole books and a couple smaller projects using develop based builds, and I have more in the works. It's way better if I can tag them as being built with some stable release tag. Also I can't release a stable version of CaSILE until that happens since it depends on some of the upstreamed fixes.

I have a small break between other projects and am trying to get things worked out so it can go out the door and I can move on. That includes other changes I need to work on in SILE (like the language code stuff) that is hard to touch without overhauling yet more stuff, and this release has grown too much already. The release notes are going no need a spiral binding already.

If you don't have the time I understand, I'm not trying to pressure you, just letting you know I'm under pressure myself and planning on shipping this sooner rather than later.

@Omikhleia
Copy link
Member

Omikhleia commented Jun 6, 2024

... just letting you know I'm under pressure myself ...

Same here, actually. Besides "real life"™ stuff, I've been fairly busy on my next SILE-based book from early December to end of May.
At last, I'm now just awaiting the proof copy to be received (likely mid-June) and checked, before announcing it in due form (well expecting there's no print issue or other last minute adjustments to be made...), and then I'll have to complete all the administrative tasks (press & contributor copies, legal deposit finalization, announcements to appropriate parties, etc.).
On the way, it required a release of all my own stuff (and thus, I released silex.sile 0.5 on 29/02, ptable.sile 3.0 on 05/03, markdown.sile 2.0 on 16/03 and resilient.sile 2.3 on 18/03, all in a rush) and considerations on which version of SILE to use and whether to upgrade my tooling or not...
It turns out the book is still made with my old "custom" build of SILE 0.14.11 + shims + PR 1792 (automated italic correction), which I deem unfortunate, and I'd have preferred using an official build too, but time was running really short.

  • I tried some SILE 0.15-develop with nix flakes, but it caused havoc which I couldn't yet investigate...
  • (I know it kinda works with SILE 0.14.17 but I even lacked time to rebuilt it with a port of PR 1792, and didn't need any of the core changes beyond that, so didn't bother further eventually).

But that's life. I'm happy, though, that I could still report a few issues here and also push a few small PRs :)

I am also committed to typesetting and distributing a bibliography which I'd ideally prefer composing with SILE (hence my recent reports and investigations on the topic)...

Anyway, I'll to see if I can advance my above ETA, but it's hard for me to commit on anything in June and July at this point.

I've had to publish 2 whole books and a couple smaller projects

Hey, that's cool news. I hope you'll consider adding a reference to them to the Wiki -- If you didn't see it yet, I also did some homework there ;)

@alerque alerque merged commit cc104ac into sile-typesetter:develop Jun 6, 2024
19 checks passed