Add SILE format support #6087

Open
7 tasks
alerque opened this issue Jan 27, 2020 · 4 comments

Comments

@alerque
Contributor

alerque commented Jan 27, 2020

Back in 2015 I cloned Pandoc and started hacking on a fork to support The SILE Typesetter. It's now 2020 and this issue is to track the overall progress on that effort with the goal of getting it upstreamed into Pandoc. I'm hesitant to drop links here because the only thing most people are going to see is that I'm a terrible Haskell hack. I'm pretty sure the commit history will reveal that I'm not much better than a bunch of circus monkeys pushing random buttons hoping something works. But let's face it, this is never going to get contributed unless I ⓐ get some help and ⓑ am motivated by the exposure.

SILE support has taken several shapes in my hacking, and long term I think it should take on one or two more. I initially started with a copy of the LaTeX writer. As time has gone on I've been systematically stripping things from it because SILE is fundamentally simpler than LaTeX. I almost wish now I'd started with a blank slate and built it one rule at a time, and maybe that's still the way to get this contributed. I could use somebody to hold my hand through the process!

I've had this working in full scale production since 2016. I now have 3 separate publishing companies using it as their exclusive book publishing workflow, keeping full length book projects in Markdown and using Pandoc to convert them for SILE to typeset to go to press. Yes it is a bit of a mess, but the proof of concept has far outlived the 'concept only' stage. The initial work started based on Pandoc 1.15, the current work is rebased onto master and works with 2.9.1.

Currently only the Writer is in any semblance of working condition. I hacked on a Reader but that's going to be much more complicated than the Writer, and I have no production use for it. As far as I'm concerned a Writer with no Reader is enough to get started. I'm sure someday a Reader will help somebody, but I don't want it to hold up getting the Writer upstreamed.

To get started:

  • Sile TeX-like Writer
  • Raw Sile support in other formats
  • Documentation
  • Tests

Later:

  • Direct PDF output workflow
  • Sile XML Writer
  • Sile Reader(s)

SILE supports two input formats, XML and a TeX-like format that is much more human writable. Eventually it would be nice to support both, but I've been concentrating on the TeX-like syntax. While bearing a lot of resemblance to actual TeX, the format is a lot more consistent and flexible. It is not a circus full of magic ponies, and it is not a Turing complete language, except that you can embed Lua code, so it has that going for it.

Unlike TeX:

  • Unicode input is expected, there are no fancy substitutions or character encoding issues to muck around with. A copyright symbol is inserted with © not \copy, --- is three hyphens not an em-dash, which would be inserted as an actual —.
  • Only 4 characters are special: \, {, }, and %. All can be escaped in input with a backslash.
  • All commands can use 'environment' or 'command' syntax interchangeably. \begin{font}foo\end{font} and \font{foo} are the same.
  • All commands use the same syntax. None of them have monkey business like extra content groupings. All of them receive arguments the same way. \command[key=val,key="val,with,commas"]{content}. Both the options block and the content are optional: \font, \font[], \font{}, and \font[]{} are always acceptable syntax variants.
  • There is no preamble. Anything can be set anywhere as long as it is set before use. Packages can be loaded at any time as long as they are loaded before any commands defined by them are used.
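The escaping and command-syntax rules in the list above are small enough to sketch in a few lines. This is only an illustration in Python, not code from the actual writer: the helper names `escape_text` and `render_command` are hypothetical, and the rule of quoting any option value that contains a comma is inferred from the `key="val,with,commas"` example rather than taken from SILE's parser.

```python
# Hypothetical helpers illustrating the SILE TeX-like syntax described above.
# They are a sketch under stated assumptions, not part of Pandoc or SILE.

SPECIALS = set("\\{}%")  # the four special characters named in the list


def escape_text(text: str) -> str:
    """Prefix each special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIALS else ch for ch in text)


def render_command(name, options=None, content=None):
    """Build \\name[key=val,...]{content}; both blocks are optional."""
    out = "\\" + name
    if options:
        pairs = []
        for key, val in options.items():
            val = str(val)
            if "," in val:
                # Assumption: values containing commas are wrapped in quotes.
                val = '"' + val + '"'
            pairs.append(key + "=" + val)
        out += "[" + ",".join(pairs) + "]"
    if content is not None:
        out += "{" + content + "}"
    return out


if __name__ == "__main__":
    print(escape_text("50% of {a\\b}"))
    print(render_command("font", {"size": "12pt"}, escape_text("foo")))
    print(render_command("HorizontalRule"))
```

Because every command shares one shape, a single function like this covers every element a writer needs to emit; nothing is special-cased per command.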

In the last 5 years I've also gotten deeply involved in SILE development (and my Lua skills are definitely better than my Haskell ones). Eventually it dawned on me that rather than trying to teach Pandoc a bunch of fancy work-arounds for data types SILE didn't know much about, it would be a lot easier to build first class support for everything Pandoc knows about into SILE. This creates a bit of a cart-horse problem in that both sides need to coordinate and compatibility needs to match. The released version of SILE has a package available to cover the current state of the Pandoc writer. I can keep iterating on that to support whatever final form gets officially upstreamed, but I will aim to have the support released in SILE before the Pandoc version comes out.

As an example, let's take HorizontalRules:

$ pandoc -t latex <<< '----'
\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

That's a bunch of presentation-specific style hard-coded into the output. SILE does have an \hrule command that could be used in a similar way. This is valid SILE markup:

\center{\hrule[height=0.5pt,width=50%lw]}

A slightly fancier version is available that raises the line to the middle of the line and has a default height of 0.5pt, but full width, so this would work too:

\center{\fullrule[width=50%lw]}

But even that includes some hard-coded presentation information, so instead I've chosen to add a command to the pandoc package. While not necessarily important, the definition looks like this:

SILE.registerCommand("HorizontalRule", function (options, _)
  SILE.call("raise", { height = options.raise or "0.8ex" }, function ()
    SILE.call("center", {}, function ()
      SILE.call("hrule", {
        height = options.height or "0.5pt",
        width = options.width or "50%lw"
      })
    end)
  end)
end)

In practice this means Pandoc's output can look a lot like its own internal AST:

$ pandoc -t sile <<< '----'
\HorizontalRule

A user could conceivably choose to style this differently (say, something other than 50% of the line width) by including their own restyled command without touching Pandoc's output.
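To make the "output mirrors the AST" idea concrete, here is a rough sketch in Python that walks a fragment of Pandoc's JSON AST and emits one SILE command per element type. Only \HorizontalRule is confirmed by this thread; the \Emph and \Strong command names are assumptions made for illustration, and the real writer is Haskell, not Python.

```python
# Sketch only: maps a few Pandoc JSON AST nodes to SILE markup.
# \HorizontalRule comes from the thread; \Emph and \Strong are
# hypothetical command names chosen to mirror the AST constructors.

def escape(text):
    """Backslash-escape the four characters special to SILE."""
    return "".join("\\" + c if c in "\\{}%" else c for c in text)


def inlines_to_sile(inlines):
    parts = []
    for node in inlines:
        kind = node["t"]
        if kind == "Str":
            parts.append(escape(node["c"]))
        elif kind == "Space":
            parts.append(" ")
        elif kind in ("Emph", "Strong"):
            # The command name is simply the AST constructor name.
            parts.append("\\" + kind + "{" + inlines_to_sile(node["c"]) + "}")
        else:
            raise NotImplementedError(kind)
    return "".join(parts)


def block_to_sile(block):
    kind = block["t"]
    if kind == "HorizontalRule":
        return "\\HorizontalRule"
    if kind == "Para":
        return inlines_to_sile(block["c"])
    raise NotImplementedError(kind)


if __name__ == "__main__":
    para = {"t": "Para", "c": [
        {"t": "Str", "c": "100%"},
        {"t": "Space"},
        {"t": "Emph", "c": [{"t": "Str", "c": "done"}]},
    ]}
    print(block_to_sile({"t": "HorizontalRule"}))
    print(block_to_sile(para))
```

The design point is that the writer carries no styling decisions at all: each constructor becomes a command, and how that command looks is left to the SILE side.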

@kiufta

kiufta commented Dec 1, 2022

@alerque May I link to the repo now?

@alerque
Contributor Author

alerque commented Dec 1, 2022

@kiufta My fork has lots of branches with my modifications rebased against various Pandoc releases.

@Pi-Cla

Pi-Cla commented Mar 26, 2024

Hi @alerque, I am interested in tackling this issue. So I am wondering what still needs to get done to add SILE to pandoc. (and if we should still be working with the code that is already there vs starting fresh)

@alerque
Contributor Author

alerque commented Mar 26, 2024

@Pi-Cla I'd be super excited about helping as best I can. I've been learning far more Rust than Haskell though, so my ability to help is limited. I still have the basics working. I've rebased my writer branch on top of almost every Pandoc release for the last 8 years and it still gets the job done. That being said, the writer is not feature complete: it only correctly handles a subset of the AST elements.

It would not be hard for me to help with how the AST elements should be output, but I ran into some issues understanding the Haskell code for getting there.

If you're up for it I suggest you start a branch based on one of my recent ones and maybe diff it with the LaTeX or another writer and start figuring out things I've done wrong or left incomplete. I would commit early and commit often and I can help rebase it all later for a clean mergeable history. I don't know whether I'd make math support a target for the first release or not; maybe we can ask the powers that be here if that would be a requirement. Given the existing math format support it shouldn't be too hard to adapt a similar one, but I don't expect it to be trivial either. If we can get just the standard AST elements worked out that would be amazing.
