Add markdown figure filters. #187

argent0 · 2021-07-29T22:03:38Z

This PR requires a Figure constructor in pandoc's AST.

The code for a pandoc fork that has such a constructor can be found here.

Details

This PR provides filters for two syntaxes that represent figures in markdown.

Explicit syntax

The explicit syntax uses a div with the "figure" class. The
caption is also a div, but with the "caption" class.

Here is an example.

::: { .figure }

content.

:::: {.caption }
caption
::::

:::

Inside the figure, any image that has no caption and sits alone in its own
paragraph becomes a bare HTML img tag.
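
The rule above can be sketched in plain Lua. This is only an illustration under stated assumptions: blocks are modelled as hypothetical plain tables of the form `{ t = ..., content = ... }`, not as pandoc's Lua objects, and the function names are made up for this example. It shows a lone, caption-less image paragraph being rewrapped as a `Plain` block, which matches the `Plain [Image ...]` entries in the native output.

```lua
-- Hypothetical plain-table model of blocks; this is NOT pandoc's Lua API.

-- True when the block is a paragraph holding exactly one image
-- whose caption (alt text) is empty.
local function is_lone_image(block)
  if block.t ~= "Para" or #block.content ~= 1 then
    return false
  end
  local inline = block.content[1]
  return inline.t == "Image" and #inline.caption == 0
end

-- Rewrap lone-image paragraphs as Plain blocks; leave other blocks untouched.
local function unwrap_lone_images(blocks)
  local out = {}
  for i, block in ipairs(blocks) do
    if is_lone_image(block) then
      out[i] = { t = "Plain", content = block.content }
    else
      out[i] = block
    end
  end
  return out
end

local blocks = {
  { t = "Para", content = { { t = "Image", caption = {}, src = "test/media/rId25.jpg" } } },
  { t = "Para", content = { { t = "Str", text = "content." } } },
}
local result = unwrap_lone_images(blocks)
print(result[1].t, result[2].t)
```

Only the first block changes type here; the text paragraph stays a `Para`.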

Here is an example of a figure containing two images and a caption.

::: { .figure }

![](test/media/rId25.jpg "")

![](test/media/rId25.jpg "")

:::: {.caption }
caption
::::

:::

This will result in a single figure containing multiple images.

$ pandoc -f markdown -t native --lua-filter=md-figure-explicit.lua fig-explicit.md

[Figure ("",[],[]) (Caption (Just []) [Para [Str "caption"]])
	[ Plain [Image ("",[],[]) [] ("test/media/rId25.jpg","")]
	, Plain [Image ("",[],[]) [] ("test/media/rId25.jpg","")]]]

<figure>
<img src="test/media/rId25.jpg" />
<img src="test/media/rId25.jpg" />
<figcaption><p>caption</p></figcaption>
</figure>
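
As a rough sketch of the explicit syntax, the following plain Lua separates the nested "caption" div from the rest of a "figure" div. It assumes a hypothetical table model (`t`, `classes`, `content`) rather than pandoc's Lua API, and `split_explicit_figure` is an illustrative name, not a function from the filter in this PR.

```lua
-- Hypothetical plain-table model of blocks; this is NOT pandoc's Lua API.

-- True when the block carries the given class name.
local function has_class(block, name)
  for _, class in ipairs(block.classes or {}) do
    if class == name then
      return true
    end
  end
  return false
end

-- Split a "figure" div into body blocks and caption blocks.
-- Returns nil when the block is not a figure div.
local function split_explicit_figure(div)
  if div.t ~= "Div" or not has_class(div, "figure") then
    return nil
  end
  local body, caption = {}, {}
  for _, block in ipairs(div.content) do
    if block.t == "Div" and has_class(block, "caption") then
      -- The contents of the caption div become the figure caption.
      for _, cap_block in ipairs(block.content) do
        caption[#caption + 1] = cap_block
      end
    else
      body[#body + 1] = block
    end
  end
  return { t = "Figure", caption = caption, content = body }
end

local image = { t = "Plain", content = { { t = "Image", src = "test/media/rId25.jpg" } } }
local div = {
  t = "Div", classes = { "figure" },
  content = {
    image,
    image,
    { t = "Div", classes = { "caption" },
      content = { { t = "Para", content = { { t = "Str", text = "caption" } } } } },
  },
}
local figure = split_explicit_figure(div)
print(#figure.content, #figure.caption)
```

With the example input, the result holds two body blocks and one caption block, mirroring the structure of the native output above.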


Implicit syntax

The second syntax uses the last paragraph inside the figure as the caption.

::: { .figure }


![](test/media/rId25.jpg "")

![](test/media/rId25.jpg "")

This is a caption with
multiple lines

:::

This results in the following output:

$ pandoc -f markdown -t native --lua-filter=md-figure-implicit.lua fig-implict.md
[Figure ("",[],[])
	(Caption
		(Just [])
		[ Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "caption",Space,Str "with",SoftBreak,Str "multiple",Space,Str "lines"]]) 
	[Plain [Image ("",[],[]) [] ("test/media/rId25.jpg","")],Plain [Image ("",[],[]) [] ("test/media/rId25.jpg","")]]]

<figure>
<img src="test/media/rId25.jpg" />
<img src="test/media/rId25.jpg" />
<figcaption><p>This is a caption with multiple lines</p></figcaption>
</figure>
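
The implicit rule can be sketched the same way. The code below is a minimal illustration that assumes a hypothetical plain-table block model, not pandoc's Lua API, and `split_implicit_figure` is a made-up name. It treats the last block as the caption only when it is a paragraph; how the actual filter handles a figure whose last paragraph is itself an image is not stated here, so the sketch does nothing special for that case.

```lua
-- Hypothetical plain-table model of blocks; this is NOT pandoc's Lua API.

-- Use the last paragraph of a figure div as its caption.
local function split_implicit_figure(div)
  -- Copy the block list so the input div is left unmodified.
  local body = {}
  for i, block in ipairs(div.content) do
    body[i] = block
  end
  local caption = {}
  local last = body[#body]
  if last and last.t == "Para" then
    caption[1] = table.remove(body)
  end
  return { t = "Figure", caption = caption, content = body }
end

local image = { t = "Plain", content = { { t = "Image", src = "test/media/rId25.jpg" } } }
local div = {
  t = "Div", classes = { "figure" },
  content = {
    image,
    image,
    { t = "Para", content = { { t = "Str", text = "This is a caption with multiple lines" } } },
  },
}
local figure = split_implicit_figure(div)
print(#figure.content, #figure.caption)
```

A figure div with no trailing paragraph simply yields an empty caption list in this sketch.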

Sample HTML rendering in Firefox

For the implicit syntax example, this is Firefox's rendering.

alerque · 2021-07-31T08:47:04Z

Very interesting indeed. Ironically this (at least the implicit syntax) is almost exactly the same input markdown I conjured up for a book project just 3 days ago. I didn't prefix the class name with a dot, but that's the only difference in input. Since the details of my production workflow are completely different and I don't want to distract too much from this issue, I'll hide them away here; the curious can click for details.

Sample implementation typesetting figures from Markdown in SILE

For my use case I didn't use a filter at all, but I am using the SILE writer in my Pandoc fork. This writer just takes the div syntax and outputs block wrappers based on the classes, so basically what I get in SILE is a Div content block with a class attribute of figure. Then from the Lua side I can easily handle the nested image and the remaining content as the caption. For example:

Markdown input:

::: figure
![Grossmünster Katedrali](resimler/grossmunster.jpg)

*Huldrych Zwingli’nin 1519–1531 yılları arasında vaaz verdiği İsviçre Zürih’teki Grossmünster Katedrali.*
:::

Gets converted to SIL format thus:

\begin[classes="figure"]{Div}
\img[src=resimler/grossmunster.jpg,title=fig:]{Grossmünster Katedrali}

\Emph{Huldrych Zwingli’nin 1519–1531 yılları arasında vaaz verdiği İsviçre Zürih’teki Grossmünster Katedrali.}
\end{Div}

To typeset this I have a Lua function in the project for the figure class that makes assumptions about how to lay out the figures for that book. It overrides the image function to make the images the full frame width and centers everything on the page:

SILE.registerCommand("class:figure", function (options, content)
  local old_img = SILE.Commands["img"]
  SILE.registerCommand("img", function (options, content)
    options.width = "100%fw"
    old_img(options, content)
    SILE.call("skip", { height = "1en" })
  end)
  SILE.call("open-double-page", { double = true, odd = false })
  SILE.call("topfill")
  SILE.call("vfill")
  SILE.call("center", {}, function ()
    SILE.process(content)
  end)
  SILE.Commands["img"] = old_img
end)
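
The save, override, restore pattern used above can be isolated in plain Lua. The sketch below is only an illustration: `Commands` is a hypothetical stand-in for a registry such as `SILE.Commands`, and `with_override` is a made-up helper, not part of SILE. Unlike the function above, it also restores the original command when the body raises an error, which is one possible design choice rather than a requirement.

```lua
-- Hypothetical stand-in for a command registry such as SILE.Commands.
local Commands = {}
Commands.img = function (options)
  return "img width=" .. tostring(options.width)
end

-- Run `body` with Commands[name] temporarily replaced. `make_replacement`
-- receives the saved original so the override can delegate to it.
-- The original is restored even if `body` raises an error.
local function with_override(name, make_replacement, body)
  local saved = Commands[name]
  Commands[name] = make_replacement(saved)
  local ok, err = pcall(body)
  Commands[name] = saved
  if not ok then
    error(err, 0) -- re-raise after restoring the original command
  end
end

local seen
with_override("img", function (old_img)
  return function (options)
    options.width = "100%fw" -- force full frame width, as in the figure class
    return old_img(options)
  end
end, function ()
  seen = Commands.img({})
end)

print(seen)             -- img width=100%fw
print(Commands.img({})) -- img width=nil
```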

The finished result is this page:

Obviously my implementation is just overloading the block syntax without introducing a new AST element. This works because of the flexibility I have on the typesetter side but may or may not work well for all output formats. There are pros and cons to overloading an existing object and giving it "magic" smarts vs. having a dedicated content type.

Back to your filter (and your Pandoc fork). I'm not sure we want to merge anything here that doesn't work out of the box with Pandoc, but I'm very interested in seeing something worked out that makes this easier on everybody. I'd be happy to play along with other implementations in the name of keeping things standardized and hence inter-operable if possible.

What are your thoughts on the Pandoc fork? Has the idea of a new AST object for this been brought up yet?

argent0 · 2021-08-02T12:56:59Z

Yes, the idea of a new AST object has been under discussion since 2016. There were concrete proposals before our work, which I've merged into my fork.

We are working on improving pandoc's figure support and keep our discussions public and online. We mainly focus on HTML, LaTeX and Markdown.

So my fork includes the Figure AST element, and most output formats can handle it. On the input side we've decided to go with this method for markdown (which doesn't have support for floats yet), and I'm working on the HTML input (the <figure> tag, which is currently handled in a limited fashion).

<figure class="important">
<ul> <li> Delete me? </li> </ul>
<figcaption> CAP2 </figcaption>
</figure>

Pandoc 2.14

$ pandoc -f html -t native test/figures/figure-simple.html
[BulletList
 [[Plain [Str "Delete",Space,Str "me?"]]]
,Para [Str "CAP2"]]

$ pandoc -f html -t html test/figures/figure-simple.html
<ul>
<li>Delete me?</li>
</ul>
<p>CAP2</p>

My Fork

$ pandoc-fork -f html+native_figures -t native test/figures/figure-simple.html
[Figure ("",["important"],[]) (Caption Nothing [Plain [Str "CAP2"]]) [BulletList [[Plain [Str "Delete",Space,Str "me?"]]]]]

$ pandoc-fork -f html+native_figures -t html test/figures/figure-simple.html
<figure class="important">
<ul>
<li>Delete me?</li>
</ul>
<figcaption>CAP2</figcaption>
</figure>

Thoughts

A dedicated Figure AST element (which should be understood as a float) would improve consistency and flexibility across many output formats.

argent0 force-pushed the markdown-figures branch from b0a6556 to 50cbeba July 30, 2021 13:45

Add markdown figure filters.

50cbeba

argent0 marked this pull request as draft August 2, 2021 12:27

argent0 mentioned this pull request Aug 19, 2021

Use the new 'simpleFigure' builder function in the readers. jgm/pandoc#7364

Closed

33 tasks

argent0 marked this pull request as ready for review August 19, 2021 20:34
