Not reading image links in org-mode #5454

Open
martinfowler opened this issue Apr 17, 2019 · 11 comments

@martinfowler

Pandoc doesn't seem to be recognizing image links in org documents.

Taking this org input

* Test Images

** PDF no label

[[file:test.pdf]]

** PNG no label

[[file:test.png]]

** PDF with label

[[file:test.pdf][label]]

** PNG with label

[[file:test.png][label]]

I get the following output

foo.pdf

Only the png file is properly rendered, and then only when it has no label.

I get the same output when I produce a docx file, and something similar if I generate Markdown:

Test Images
===========

PDF no label
------------

[file:test.pdf](test.pdf)

PNG no label
------------

![](test.png)

PDF with label
--------------

[label](test.pdf)

PNG with label
--------------

[label](test.png)

I'm running pandoc 2.7.2 on macOS Mojave 10.14.4. I installed pandoc via Homebrew.

@jgm
Owner

jgm commented Apr 17, 2019

Looks like isImageFilename in Text.Pandoc.Readers.Org.Shared doesn't recognize the pdf extension as an image filename. I'm not sure about the label issue. @tarleb can you have a look?

@martinfowler
Author

I haven't done a full test of image file types, but eps has the same problem as pdf

@jgm
Owner

jgm commented Apr 18, 2019

(defun org-display-inline-images (&optional include-linked refresh beg end)
  "Display inline images.

An inline image is a link which follows either of these
conventions:

  1. Its path is a file with an extension matching return value
     from `image-file-name-regexp' and it has no contents.

  2. Its description consists in a single link of the previous
     type.

When optional argument INCLUDE-LINKED is non-nil, also links with
a text description part will be inlined.  This can be nice for
a quick look at those images, but it does not reflect what
exported files will look like.

When optional argument REFRESH is non-nil, refresh existing
images between BEG and END.  This will create new image displays
only if necessary.  BEG and END default to the buffer
boundaries."

image-file-name-regexp is an elisp function that returns:

"\\.\\(GIF\\|JP\\(?:E?G\\)\\|P\\(?:BM\\|GM\\|N[GM]\\|PM\\)\\|SVG\\|TIFF?\\|X\\(?:[BP]M\\)\\|gif\\|jp\\(?:e?g\\)\\|p\\(?:bm\\|gm\\|n[gm]\\|pm\\)\\|svg\\|tiff?\\|x\\(?:[bp]m\\)\\)\\'"
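To check quickly which filenames that pattern accepts, here is a rough Python transcription of the regexp above. The translation is my own assumption (Emacs `\\'` rendered as Python's `\Z`, escaped Emacs groups rendered as plain parentheses), and `is_emacs_image_file` is a hypothetical helper name, not something provided by Emacs or pandoc.

```python
import re

# Hand-translated from the Emacs regexp quoted above.
# Assumption: Emacs' end-of-string anchor \' corresponds to Python's \Z.
EMACS_IMAGE_RE = re.compile(
    r"\.(GIF|JP(?:E?G)|P(?:BM|GM|N[GM]|PM)|SVG|TIFF?|X(?:[BP]M)"
    r"|gif|jp(?:e?g)|p(?:bm|gm|n[gm]|pm)|svg|tiff?|x(?:[bp]m))\Z"
)


def is_emacs_image_file(path: str) -> bool:
    """Return True if the path ends in an extension the pattern accepts."""
    return EMACS_IMAGE_RE.search(path) is not None


# Print the verdict for the file types mentioned in this thread.
for name in ["test.png", "test.PNG", "test.pdf", "test.eps"]:
    print(name, is_emacs_image_file(name))
```

Under this transcription neither `.pdf` nor `.eps` matches, while both `png` and `PNG` do, since the pattern lists the all-lowercase and all-uppercase spellings explicitly.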

@tarleb
Collaborator

tarleb commented Apr 18, 2019

An issue I faced while implementing this is that Emacs Org mode handles this case differently, depending on the chosen exporting format. Pandoc's org reader had to be target-format agnostic, and we chose HTML exporting as reference behavior. Exporting to HTML from Emacs results in output similar to the above, while exporting to PDF will result in an image instead of a link under "PDF no label".

Labeled image links will be output as links in all target formats.

This situation is not ideal, but I'm unsure how to improve this in a consistent way. I'm very open to suggestions.

@martinfowler
Author

martinfowler commented Apr 18, 2019 via email

@tarleb
Collaborator

tarleb commented Apr 18, 2019

I see what you mean. Org-mode syntactically conflates links and images in interesting (and to me, sometimes unintuitive) ways. I believe there is no Org mode equivalent to Markdown's ![label](path/to/image). The Org mode manual states:

An image is a link to an image file that does not have a description part.

The closest we can get is by using figures:

#+CAPTION: label
[[/path/to/image]]

Demo:

$ printf '#+CAPTION: label\n[[file:test.png]]\n' | pandoc -f org -t markdown
![label](test.png)

So I'd reckon that a good conversion first has to determine whether the target is an image, then do whatever you do for markdown.

Pandoc's readers generally have no insight into the target format and output options, but the reader (i.e., parser) already has to decide which links are actually images. With Emacs Org-mode, PDF and EPS files are images when outputting PDF via LaTeX, but cannot be displayed as images when exporting to HTML. The reader has to choose between image and link without any information on the target format's capabilities; for this we follow Emacs' logic for non-PDF output.
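As a rough illustration of that format-agnostic choice, the rule can be sketched in a few lines of Python. This is not pandoc's implementation: `classify_org_link` and the extension set are hypothetical, chosen only to reproduce the four cases from the original report.

```python
from pathlib import PurePosixPath

# Illustrative extension set only (an assumption); pandoc's real list lives
# in its Org reader and may differ.
INLINE_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def classify_org_link(target: str, description: str = "") -> str:
    """Classify an Org link as 'image' or 'link' without knowing the output format."""
    if description:
        # A link with a description part is never treated as an inline image.
        return "link"
    if PurePosixPath(target).suffix in INLINE_IMAGE_EXTENSIONS:
        return "image"
    return "link"


# The four cases from the report at the top of this issue.
for target, label in [("test.pdf", ""), ("test.png", ""),
                      ("test.pdf", "label"), ("test.png", "label")]:
    print(target, repr(label), classify_org_link(target, label))
```

Because the function never sees the output format, `test.pdf` can only be classified one way, which is the limitation described above.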

A better way would be to preserve this information (link or image) in the intermediate document AST. Unfortunately, we currently have no way of expressing "image when used with LaTeX, link otherwise" in the AST. @jgm proposed a fix for this in #547. I'm afraid there is not much we can do about this discrepancy between pandoc and Emacs unless we perform these (extensive) changes.

@martinfowler
Author

martinfowler commented Apr 18, 2019 via email

@jgm
Owner

jgm commented Apr 19, 2019

Pandoc will parse

[[file:test.pdf]]

from org as

Link ("",[],[]) [Str "file:test.pdf"] ("test.pdf","")

So one approach would be to use a lua filter to transform the AST before rendering to LaTeX/PDF.

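-- fixpdfs.lua
-- Turn unlabeled links to PDF files into images before rendering to LaTeX/PDF.
function Link(el)
  -- Org links without a description are parsed with "file:<target>" as their text.
  local unlabeled = pandoc.utils.stringify(el.content) == "file:" .. el.target
  if el.title == '' and el.target:match('%.pdf$') and unlabeled then
    return pandoc.Image(el.content, el.target)
  end
end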

Invoke using

pandoc my.org -o my.pdf --lua-filter fixpdfs.lua
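For anyone who prefers working on pandoc's JSON output instead of a Lua filter, the same check can be sketched in Python. This is only a sketch under assumptions: it relies on the JSON shape `{"t": "Link", "c": [attr, inlines, [target, title]]}`, handles a single node instead of walking a whole document, and the names `promote_link`, `stringify` and `PROMOTE_EXTENSIONS` are hypothetical. Unlike the Lua filter above, it also covers `.eps` (reported earlier in this thread to behave the same way) and compares extensions case-insensitively.

```python
import json

# Extensions to promote to images (assumption: both PDF and EPS are wanted).
PROMOTE_EXTENSIONS = (".pdf", ".eps")


def stringify(inlines):
    """Small stand-in for pandoc.utils.stringify: handles Str and Space only."""
    parts = []
    for inline in inlines:
        if inline["t"] == "Str":
            parts.append(inline["c"])
        elif inline["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def promote_link(node):
    """Turn an unlabeled Link to a PDF/EPS file into an Image; leave other nodes alone."""
    if node.get("t") != "Link":
        return node
    attr, content, (target, title) = node["c"]
    unlabeled = stringify(content) == "file:" + target
    if title == "" and unlabeled and target.lower().endswith(PROMOTE_EXTENSIONS):
        return {"t": "Image", "c": [attr, content, [target, title]]}
    return node


# The Link that pandoc produces for [[file:test.pdf]], written as JSON data.
link = {"t": "Link",
        "c": [["", [], []], [{"t": "Str", "c": "file:test.pdf"}], ["test.pdf", ""]]}
print(json.dumps(promote_link(link)))
```

To use this on a real document you would still need to traverse the full JSON tree produced by `pandoc -t json` and apply `promote_link` to every inline node; that traversal is left out here.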

@martinfowler
Author

martinfowler commented Apr 19, 2019 via email

@tbruckmaier

tbruckmaier commented May 13, 2020

This also seems to be the case with uppercase image extensions:

$ echo "[[file:test.png]]" | pandoc -f org -t html
<p><img src="test.png" /></p>

$ echo "[[file:test.PNG]]" | pandoc -f org -t html
<p><a href="test.PNG">file:test.PNG</a></p>

$ pandoc --version
pandoc 2.9.2
Compiled with pandoc-types 1.20, texmath 0.12.0.1, skylighting 0.8.3.2

Is the list of valid image extensions configurable in any way?

@tarleb
Collaborator

tarleb commented May 14, 2020

@tbruckmaier that's different to what's discussed here, please open a new issue.
