Add support for alt text as short title in latex #2447

Open: waeltken wants to merge 3 commits into main from short_caption_latex_figures

waeltken

@waeltken waeltken commented Oct 12, 2015

Use the image's alt text title as a short caption for the list of figures in
LaTeX documents.

![la lune](lalune.jpg "Voyage to the moon")

Should now result in:

\caption[Voyage to the moon]{la lune}

in latex figures.

@hadim

hadim commented Oct 12, 2015

Nice catch @waeltken !

@waeltken
Author

I think we need to update the documentation of the Images -> implicit_figures section in the readme.

@hadim

hadim commented Oct 12, 2015

Let's first see what @jgm thinks about this feature.

@hadim

hadim commented Oct 12, 2015

And maybe a test would be nice too.

@waeltken
Author

Yep, I just saw that the test for the basic writer fails, so that's not good.

@hadim

hadim commented Oct 12, 2015

diff --git a/tests/writer.latex b/tests/writer.latex
index 506c21d..0f29353 100644
--- a/tests/writer.latex
+++ b/tests/writer.latex
@@ -938,7 +938,7 @@ From ``Voyage dans la Lune'' by Georges Melies (1902):
 \begin{figure}[htbp]
 \centering
 \includegraphics{lalune.jpg}
-\caption{lalune}
+\caption[Voyage dans la Lune]{lalune}
 \end{figure}

 Here is a movie \includegraphics{movie.jpg} icon.

That should do the job for the test.

@waeltken
Author

Seems reasonable! Thanks! 😉

@waeltken
Author

Well, this actually works really nicely for my thesis! So even if this gets rejected, I am still very happy. 😄

@hadim

hadim commented Oct 12, 2015

+1

@hadim

hadim commented Oct 12, 2015

I have an issue with pandoc-crossref, which I use for my thesis: this patch does not work with that plugin. See:

![My long caption](figure.png "Short caption"){#fig:figure-label}

produces

\begin{figure}[htbp]
\centering
\includegraphics{figure.png}
\caption[]{\label{fig:figure-label}My long caption}
\end{figure}

Instead of :

\begin{figure}[htbp]
\centering
\includegraphics{figure.png}
\caption[Short caption]{\label{fig:figure-label}My long caption}
\end{figure}

It looks like the short text is empty after the pandoc-crossref plugin has run.

Any idea ?

@hadim

hadim commented Oct 12, 2015

Seems to be a pandoc-crossref related issue : lierdakil/pandoc-crossref#38

@hadim

hadim commented Oct 12, 2015

@waeltken Could you please apply this patch instead :

diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs
index 0caa807..cbd6bd8 100644
--- a/src/Text/Pandoc/Writers/LaTeX.hs
+++ b/src/Text/Pandoc/Writers/LaTeX.hs
@@ -353,11 +353,14 @@ blockToLaTeX (Para [Image txt (src,'f':'i':'g':':':tit)]) = do
   inNote <- gets stInNote
   capt <- inlineListToLaTeX txt
   img <- inlineToLaTeX (Image txt (src,tit))
+  short <- stringToLaTeX TextString tit
   return $ if inNote
               -- can't have figures in notes
               then "\\begin{center}" $$ img $+$ capt $$ "\\end{center}"
               else "\\begin{figure}[htbp]" $$ "\\centering" $$ img $$
-                      ("\\caption" <> braces capt) $$ "\\end{figure}"
+                      if null short
+                        then ("\\caption" <> braces capt) $$ "\\end{figure}"
+                        else ("\\caption" <> brackets (text short) <>  braces capt) $$ "\\end{figure}"
 -- . . . indicates pause in beamer slides
 blockToLaTeX (Para [Str ".",Space,Str ".",Space,Str "."]) = do
   beamer <- writerBeamer `fmap` gets stOptions

It removes the [] if there is no alternative text in the image. Seems cleaner to me.
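
The conditional in the patch can be sketched outside Haskell as follows. This is only an illustration of the intended behavior, not pandoc code; the function name `render_caption` is hypothetical, and both arguments are assumed to be already-escaped LaTeX strings.

```python
def render_caption(long_caption: str, short_caption: str = "") -> str:
    """Build a LaTeX \\caption command.

    The optional [short] argument is emitted only when a short caption
    is present, so an image without a title yields no empty brackets.
    """
    if short_caption:
        return "\\caption[" + short_caption + "]{" + long_caption + "}"
    return "\\caption{" + long_caption + "}"
```

With the example from the first comment, `render_caption("la lune", "Voyage to the moon")` gives `\caption[Voyage to the moon]{la lune}`, while `render_caption("la lune")` gives plain `\caption{la lune}`.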

@waeltken waeltken force-pushed the short_caption_latex_figures branch from 3fd9140 to 67f7c41 Compare October 12, 2015 12:36
@hadim

hadim commented Oct 12, 2015

Thank you !

@waeltken
Author

I think it’s okay now, right?

On 12 Oct 2015 at 14:36, Hadrien Mary notifications@github.com wrote:

Thank you, but I have updated the code since I made a mistake with $$ "\end{figure}"

@hadim

hadim commented Oct 12, 2015

Yup

@hadim

hadim commented Oct 12, 2015

I have a small issue: markup inside the short caption is not rendered. For example, ![This is a long caption.](http://fakeimg.pl/439x320/282828/ "This a _short caption_ (alt text)") does not render the short caption in italics.

@waeltken
Author

You are right, that should definitely work. What's your resulting output?

I assume that this line here:

short <- stringToLaTeX TextString tit

does not produce the required markup. But I don't know how to parse that part of the title correctly at the moment. I guess our maintainer John might know. 😉 I'll look into it later this week.

You could try:

short <- stringToLaTeX CodeString tit

But right now I simply don't have the time or the need for italics etc. in the short caption.

@waeltken
Author

I think the Travis test fails because the patch was applied not to the master branch but to the latest release version. I've rebased onto master, but I don't know how GitHub handles it if I force-push to that branch now. Will it keep the references to the related commits when the commit hashes change?

@hadim

hadim commented Oct 13, 2015

I don't know why travis is failing. Rebase is often needed before merging to master. And yes the commit history will be kept.
Let's wait for @jgm to comment on the PR and handle travis stuff !

Use the images alt text as a short caption for the list of figures in
latex documents.

    ![la lune](lalune.jpg "Voyage to the moon")

Should now result in:

    \caption[Voyage to the moon]{la lune}

in latex figures.
@waeltken waeltken force-pushed the short_caption_latex_figures branch from 67f7c41 to f05879d Compare October 13, 2015 12:34
@jgm
Owner

jgm commented Oct 13, 2015

+++ waeltken [Oct 13 15 05:16 ]:

I assume that this line here:
short <- stringToLaTeX TextString tit

does not produce the required markup. But I don't know how to parse
that part of the title correctly at the moment. I guess our maintainer
John might know. 😉 I'll look in to it later today or this week.

That just does escaping that's needed for plain string
content in LaTeX (e.g. \% for %). It doesn't parse
as Markdown.

Note that in Markdown images

![alt](url "title")

the "alt" part is the alt text and the "title" part is the
title. The "alt" part does get parsed as Markdown, but the
"title" part is just a plain string.
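
To make the difference concrete, here is a minimal Python sketch of what "just escaping" means. It is not pandoc's implementation: `escape_latex` and the small `LATEX_SPECIALS` table are hypothetical and cover only a handful of characters.

```python
# Hypothetical, deliberately incomplete table of LaTeX special characters.
LATEX_SPECIALS = {
    "%": r"\%",
    "&": r"\&",
    "#": r"\#",
    "_": r"\_",
    "$": r"\$",
}


def escape_latex(text: str) -> str:
    """Escape special characters one by one; no Markdown parsing happens."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)
```

Under this sketch, a title such as `This a _short caption_` comes out with literal, escaped underscores rather than as emphasized text, which is why italics in the title do not survive.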

@waeltken
Author

Note that in Markdown images

![alt](url "title")

the "alt" part is the alt text and the "title" part is the
title. The "alt" part does get parsed as Markdown, but the
"title" part is just a plain string.

@hadim: So this would mean that we cannot have markup, e.g. italics, in the title here.

@jgm: So do you see this as a useful addition? Using the Markdown title as the title in the LaTeX list of figures does not seem semantically wrong to me. Of course, the current version of pandoc does not produce a list of figures, but people using a specialized template might.

@waeltken waeltken force-pushed the short_caption_latex_figures branch from f05879d to 724f041 Compare October 14, 2015 12:49
@waeltken
Author

@hadim wrote:

I don't know why travis is failing.

There was an internal server error on the AppVeyor CI server, so I pushed the branch again to re-trigger CI.

It should pass now.

@jgm
Owner

jgm commented Dec 3, 2015

I'm not sold on this. Why is it needed? Can you give a realistic example?

@el-ee

el-ee commented Dec 3, 2015

Well, since I get these emails, I'll chime in, I guess. I used pandoc
Markdown for my PhD thesis and would have appreciated this feature. As it
was, I had to go in by hand in the final version, after converting it to
LaTeX, and add alt captions to all the images so that the list of figures
up front with the TOC had reasonable one-line names in it, while figures in
the body of the document could be captioned with a more descriptive two or
three sentences.


@hadim

hadim commented Dec 3, 2015

Well, I use it to have a different caption below the figures and in the \listoffigures section. For example, for my PhD thesis I use a lot of figures with very long captions, and I don't want to display the full captions in \listoffigures. This is why I use the short-title mechanism: to display only a short sentence summarizing the figure.

If you're against this feature, just tell me and I will maintain a patch for each new pandoc version.

Thank you for your time anyway :-)

@jgm
Owner

jgm commented Dec 3, 2015

OK, I see the point.

My only reservation is that it might seem surprising to
overload the title in this way. Lots of people may use
a blank title, or they may want the title to have something
else, and it may be unexpected/break existing workflows
to have it become the short title.

What about making the first sentence of the caption into the
short title? Would that work or would it still be too long?

Another possibility would be to look for a span with class
"short-caption" inside the caption; if found, it could be
used as the short caption.

Example:

![<span class="short-caption">This is my figure.</span>
It has a much longer caption, but only that first
bit goes into the short caption.](url)

Thoughts?


@hadim

hadim commented Dec 3, 2015

Well, I don't have any specific preference about this. I am OK as long as I can specify a short caption :-)

Maybe we should ask @el-ee and @waeltken.

@el-ee

el-ee commented Dec 3, 2015

Pretty agnostic on specific syntax.

Automatically using first sentence would be better than nothing, but a way
of specifying lof titles -- span or alt text or otherwise -- would be
preferable to me.

Thanks for considering it further!


@waeltken
Author

waeltken commented Dec 4, 2015

Hey,

so my feeling about this is that introducing the <span> tag would break existing text for anyone crazy enough to already use such a tag in an image title. I hope that would rarely be the case, though.

Although I liked the current version using the alt text the most, I've changed my workflow to use pandoc to create a .tex file and then compile it with xelatex, using full-blown LaTeX figures. Nothing else would do for my thesis because I needed subfigures etc.

So I think the final decision should be made by the maintainer, since he has to live with it. 😉

@hadim

hadim commented Feb 21, 2016

Hi !

Hope you will be able to merge this soon :-)

@kleinschmidt

Ditto. This would be really handy.

@crmackay

Can this be accomplished a bit more explicitly by using a "short-caption" or "short" attribute, which could be specially handled by the LaTeX writer, much like "width" and "height" are specially handled attributes?

example:

![This is a long caption](path/to/image.jpg){#myID short-caption="This is a short caption"}

would get you

\begin{figure}[htbp]
\centering
\includegraphics{path/to/image.jpg}
\caption[This is a short caption]{This is a long caption}\label{myID}
\end{figure}

@blake-riley

Hi, just wanted to voice my support and say that this feature would be pretty handy.
I'm a fan of @crmackay 's suggested attribute syntax, but any way of being able to specify short titles for lof or lot would be great.
Thanks!

@blake-riley

I needed this, so I hacked together a patch to the current HEAD commit a088d67.

This patch enables the attribute-style syntax discussed above. i.e.:

![This a really really long caption](path/to/image.jpg){#fig:ID lof-caption="Short capt for _LoF_"}

Note that the lof-caption value will be interpreted from Markdown (using a fairly restrictive interpretation):
(Ext_raw_tex, Ext_tex_math_dollars, Ext_all_symbols_escapable, Ext_strikeout, Ext_superscript, Ext_subscript, Ext_smart).

I have had a think about how to apply this to Tables, but it's definitely beyond my Haskell capabilities for the moment.

Notes:

  • Everything is in the LaTeX writer.
    • I wanted to still parse some Markdown, though, so I import the Markdown reader into the LaTeX writer.
      This is super hacky, and I apologise to jgm for so blatantly ignoring the architecture of Pandoc.
  • When it is implemented for Tables, the lof-caption attribute name should probably be revised for uniformity.

@gfelbing

gfelbing commented Oct 2, 2017

Any news on this?
I have the same issue in the thesis I am currently writing.
My workaround is to use raw latex figures, but with this I cannot use pandoc-citeproc nor example lists anymore, as they don't work within raw latex and both are needed in the figure captions.
Same thing holds for tables.

Without Markdown tables, figures, example lists and citations, what are IMO the most useful features of pandoc are unavailable to me, which is a real pity.

@mb21
Collaborator

mb21 commented Oct 2, 2017

as a workaround, you can write a filter, see #3682

@jgm
Owner

jgm commented Oct 3, 2017

I definitely see the need for this, and many have chimed in that it would be useful.

Putting the short caption in an attribute has the same drawback as using the title for this. Both are just plain strings, not parsed as formatted Markdown. This is too limiting, I think: you might, for example, want to have some math in a short caption.

One possible solution is to create the short caption by concatenating all the text in a span with class "short-caption". So, for example:

![[This is my figure.]{.short-caption}
It has a much longer caption, but only that first
bit goes into the short caption.](url "title")

Maybe there are other possibilities along similar lines. If so, they should be suggested here. Or, if this is satisfactory, we could implement it fairly easily.
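
As a rough illustration of the span idea, the sketch below collects the contents of every span with class "short-caption" from a caption given as a list of inline nodes. It assumes the caption is in the shape of pandoc's JSON serialization, where a span looks like `{"t": "Span", "c": [[id, classes, key-values], inlines]}`; the function name `short_caption_inlines` is hypothetical, and this is not how pandoc itself would implement it.

```python
def short_caption_inlines(caption):
    """Concatenate the contents of all top-level 'short-caption' spans.

    The inline nodes are returned unchanged, so formatting such as
    emphasis or math inside the span would be preserved.
    """
    short = []
    for node in caption:
        if node.get("t") == "Span":
            (_ident, classes, _keyvals), content = node["c"]
            if "short-caption" in classes:
                short.extend(content)
    return short


# Hand-written example caption: "[My figure.]{.short-caption} More text"
example_caption = [
    {"t": "Span", "c": [["", ["short-caption"], []],
                        [{"t": "Str", "c": "My"},
                         {"t": "Space"},
                         {"t": "Str", "c": "figure."}]]},
    {"t": "Space"},
    {"t": "Str", "c": "More"},
    {"t": "Space"},
    {"t": "Str", "c": "text"},
]
```

An empty result would mean that no short caption was marked, in which case a writer could fall back to a plain `\caption{...}`. Only top-level spans are inspected here; nested spans would need a recursive walk.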

@gfelbing

The suggested solution would work for me.

@mb21
Collaborator

mb21 commented Oct 17, 2017

From an AST point of view, wrapping the image in a span is not too bad... from a Markdown syntax point of view I find it quite ugly and error-prone.

Are short-titles actually needed for all images, or only for figures? If the latter, maybe this should be fixed as part of the proposed native figure element.

@leewalsh

Consider this my vote for this feature in some form. My favorite options are:

  1. Figure title text as short caption. I would rather give up markdown processing in the short caption than only ever be allowed to use a single period in the short caption.

  2. First Sentence as Short Caption. For my figures, I tend to write captions with the first "sentence" being a title, so I like that idea, but it is restrictive for those who don't use that style.

  3. Span

    One possible solution is to create the short caption by concatenating all the text in a span with class "short-caption".

    I would tolerate this, but would probably rather just use a filter. Better than nothing though!

Aside: If anybody has made or found a good filter for this purpose, please share! My thesis is due this month 😱 and my list of figures is 7 pages long.

@svenevs
Contributor

svenevs commented Nov 21, 2017

See the filter in the comments below for a filter-based approach instead!

Aside: If anybody has made or found a good filter for this purpose, please share!

I cheated the system. All of my captions look something like this:

![SHORTMARK:This is my short caption. This is a really long caption that is almost certainly going to `cause` me problems I guess because $x^2 \leq x^3$?](path/to/figure)

My build system involves a generate.py which is responsible for calling pandoc with the input documents. I do it this way because I'm very comfortable with string processing in python, but just pick your poison. For example, if you currently have a shell script but want to use the code below, just pipe the shell script into a python script (or use sed or any number of other tools). Now that I've captured all of the output from pandoc I have it available as a string. For the figures, I did this

import re

# Create short captions from the first sentence. Every figure caption
# starts with a single sentence that ends with a *period*.
def short_caption(match):
    return "caption[{0}]{{{0}.".format(match.group(1))

# `output` is the string that `pandoc` wrote to `stdout`
output = re.sub(r"caption\{SHORTMARK:([^.]+)[.]", short_caption, output)

It's taking something of the form \caption{SHORTMARK:My short sentence. My long sentence} and turning it into \caption[My short sentence]{My short sentence. My long sentence}.

This was based on the first-sentence idea somewhere above. I forget where I saw it, but somewhere else somebody suggested writing a filter. I originally claimed that no filter can handle this kind of substitution, but that was wrong (it seems I still have a lot to learn about filters; see the example below). This kind of trick can be quite useful for wielding pandoc, but it is also liable to break with future updates. You've been warned xD

So basically, if you really need it just create a unique marking scheme that you can perform a text replacement on later. Hope that is more helpful than confusing!
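
For anyone who wants to try the marker trick in isolation, here is a self-contained version of the substitution applied to a sample string instead of real pandoc output. The wrapper `add_short_captions` and the sample caption are made up for illustration; `SHORTMARK:` is simply the marker used above.

```python
import re


def short_caption(match):
    # Group 1 is the first sentence without its trailing period.
    first_sentence = match.group(1)
    return "caption[{0}]{{{0}.".format(first_sentence)


def add_short_captions(latex: str) -> str:
    """Turn \\caption{SHORTMARK:First. Rest} into \\caption[First]{First. Rest}."""
    return re.sub(r"caption\{SHORTMARK:([^.]+)[.]", short_caption, latex)


sample = r"\caption{SHORTMARK:My short sentence. My long sentence}"
print(add_short_captions(sample))
# prints: \caption[My short sentence]{My short sentence. My long sentence}
```

Because the pattern stops at the first period, a short caption that itself contains a period (an abbreviation or a decimal number, say) would be cut off early.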

@leewalsh

Thanks, @svenevs I'll look into that!

I forget where I saw it, but somewhere else somebody said to make a filter. I just want to point out that no filter can handle this kind of substitution, don't waste your time (I did, and it was pointless).

That was likely this comment at #3682. Maybe a filter is more feasible for headers or other simpler objects than for images/figures?

@tarleb
Collaborator

tarleb commented Nov 21, 2017

For the time being, here is a lua filter to do it. It also parses the title as markdown.

-- don't do anything unless we target latex
if FORMAT ~= "latex" then
  return {}
end

local latex_figure_start = [[
\begin{figure}
\centering
\includegraphics{%s}
]]
local latex_figure_end = '\\end{figure}'

function latex(str)
  return pandoc.RawInline('latex', str)
end

function make_caption(long_caption, short_caption)
  local caption = short_caption
  table.insert(caption, 1, latex('\\caption['))
  table.insert(caption, latex(']{'))
  for i = 1, #long_caption do
    caption[#caption + 1] = long_caption[i]
  end
  table.insert(caption, latex('}\n'))
  return caption
end

function is_image_with_title(img)
  return img.t == "Image" and img.title
end

function Para(para)
  local img = para.content[1]
  if not (#para.content == 1 and is_image_with_title(img)) then
    return nil
  end
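  -- implicit figures carry a 'fig:' prefix in the title; strip it
  local title = img.title:gsub('^fig:', '')
  -- without a title there is no short caption; leave the paragraph unchanged
  if title == '' then
    return nil
  end
  local title_inlines = pandoc.read(title).blocks[1].content
  local figure = make_caption(img.caption, title_inlines)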
  local fig_start = latex(latex_figure_start:format(img.src))
  local fig_end = latex(latex_figure_end)
  table.insert(figure, 1, fig_start)
  table.insert(figure, fig_end)
  return pandoc.Plain(figure)
end

Not the most beautiful, but should do what is asked for here.

@svenevs
Contributor

svenevs commented Nov 22, 2017

@tarleb very cool! Would you be willing to include a brief example of what you use in the Markdown code? I would like to understand how this filter works; it seems I should really learn Lua (how are you using Lua as a filter?). I don't understand where table.insert comes from, but it doesn't seem to be available on the Python side.

@tarleb
Collaborator

tarleb commented Nov 22, 2017

Lua filters were added to pandoc with release 2.0. Pandoc already had a Lua interpreter baked in (for custom writers), and we have now enabled users to utilize it for filters as well. That's why it doesn't show up on the "wrappers and interfaces" page.

The code above could be saved to file short-captions.lua and be invoked by calling pandoc with the option --lua-filter=short-captions.lua. See the lua filters docs for details.

An example would be the one given in the first comment here:

![la lune](lalune.jpg "Voyage to the moon")

or, with some math in the short caption:

![Parabola](parabola.svg "Graph for $x^2$")

@tarleb
Collaborator

tarleb commented Nov 22, 2017

We should probably include this in @ickc's collection of lua filters. Here is a gist containing a slightly cleaner version.

@robinrosenstock

@tarleb your filter does not work with pandoc-crossref, or does it?

@robinrosenstock

Sorry, I overlooked your cleaner version from #2447 (comment), which seems to work with pandoc-crossref.

@tarleb
Collaborator

tarleb commented Jan 30, 2018

There's now a better and maintained version by @gtuckerkellogg in the pandoc/lua-filters repo.
