Hudson Bailey edited this page Apr 10, 2022 · 141 revisions

Pandoc provides an interface for users to write programs (known as filters) which act on the intermediate AST. For more information, see the filter tutorial and the Lua filter tutorial.

This page collects third-party filters which can be used to add functionality to pandoc.

Writing Filters

Filters can be written in any programming language. Pandoc wrappers and interfaces are available in the following programming languages to facilitate modification of the AST:

| Language | Link | Description |
|---|---|---|
| Python | pandocfilters | A library for writing pandoc filters in Python. |
| Python | panflute | A Pythonic alternative to pandocfilters, with batteries included. It reconstructs the pandoc AST as an internal panflute AST, which makes interacting with the AST more seamless. (@jgm recommended this in pandoc-discuss.) |
| Python | pantable | Specialized in writing filters for tables. Based on panflute, it provides a lossless conversion between an internal structure and the panflute AST. |
| PHP | pandocfilters-php | A port of the Python pandocfilters module to PHP, to make writing filters in PHP easier. |
| Node.js | pandoc-filter-node | A Node.js module for writing pandoc filters in JavaScript. |
| Perl | Pandoc::Elements | A CPAN module for writing pandoc filters in Perl. |
| Groovy | groovy-pandoc | A library for writing pandoc filters in Groovy. |
| Ruby | paru | A Ruby gem for writing pandoc filters in Ruby. |
| Lua | pandoc's official documentation | Pandoc includes a Lua interpreter by default, so Lua filters are quite lightweight. |
| Elixir | Panpipe | A library for writing pandoc filters in Elixir. |
| .NET | PandocFilters | A NuGet package for writing pandoc filters in .NET languages. |
| OCaml | ocaml-pandoc | An OCaml library for writing pandoc filters. |
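
The libraries above all wrap the same idea: a filter receives the document AST, changes it, and hands it back. As a rough illustration that needs no library at all, the sketch below walks a JSON-encoded AST with the Python standard library and upper-cases every `Str` element. The helper names (`walk`, `upper_str`, `apply_filter`) and the sample document, including its API version number, are made up for this example.

```python
import json


def walk(node, action):
    """Recursively apply `action` to every AST element (a dict with a "t" key)."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node)
            if replacement is not None:
                # The action replaced this element; use the new one as-is.
                return replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def upper_str(elem):
    """Upper-case plain text; return None to leave other elements alone."""
    if elem["t"] == "Str":
        return {"t": "Str", "c": elem["c"].upper()}
    return None


def apply_filter(doc):
    return walk(doc, upper_str)


# Hypothetical in-memory document standing in for pandoc's JSON output.
doc = {
    "pandoc-api-version": [1, 22],
    "meta": {},
    "blocks": [
        {
            "t": "Para",
            "c": [
                {"t": "Str", "c": "hello"},
                {"t": "Space"},
                {"t": "Str", "c": "world"},
            ],
        }
    ],
}
print(json.dumps(apply_filter(doc)["blocks"]))
```

To try this as an actual filter, one option is to replace the demo with `json.dump(apply_filter(json.load(sys.stdin)), sys.stdout)` and pass the script to pandoc with `--filter`.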

Other tools:

  • vimhl, a Vim plugin that makes Vim's syntax highlighting engine available in pandoc.
  • pandoc-jats, a Lua custom writer for Pandoc generating JATS XML.
  • 2bbcode, a Lua custom writer for BBCode.
  • pandocmeta.lua, a simple Lua package that converts Pandoc metadata types into a, possibly multi-dimensional, table.

Written Filters

See github.com/pandoc/lua-filters for a selection of filters written in Lua. Some other known third-party filters:

Document (DOCX/ODT) related

  • Because DOCX and ODT files cannot use templates, the options for turning metadata into document content are limited. Several paru filters help with this, given a metadata format involving authors with affiliation/correspondence fields and institute information (see the README). Individual filters: simplifyMetadata, prependInstitute, prependKeywords, prependAbstract, prependComments. Combined filter: prependAll.
  • pandoc-odt-filters: filters that improve ODT output. They create sequences in image and table captions (for automatic list-of-figures and list-of-tables), correct links to images and tables, correct the bibliography style, and add custom styles for headers and spans, better list styles and real small caps. Some of the filters are configurable.
  • commentary: a Pandoc filter and command line tool that preserves native-style comments + metadata between Markdown/docx conversions.

Images related

  • pandoc-svg, a pandoc filter to convert svg files to pdf by Jerome Robert.
  • diagrams-pandoc for inserting images expressed in the Haskell diagrams DSL.
  • mermaid-pandoc for inserting images expressed in mermaid syntax
  • r-pandoc for inserting plots expressed in the R language
  • paru-screenshot.rb for automatically taking a screen shot of a web page and including that shot as an image in a markdown file.
  • pandoc-plot to generate and embed figures based on code blocks in documents, using a variety of toolkits (e.g. Matplotlib, MATLAB, gnuplot, ggplot2, etc.). Easy integration with Haskell libraries (e.g. Hakyll)

Numbering related

  • Numerical reference to sections, using a specified sign (by default #) in internal links. Metadata can configure special sign and whether links should be preserved or converted to plain text.
  • pandoc-fignos, for numbering figures and figure references.
  • pandoc-eqnos, for numbering equations and equation references.
  • pandoc-tablenos, for numbering tables and table references.
  • pandoc-crossref, for numbering and cross-referencing figures, equations and tables
  • pandoc-numbering, for numbering and cross-referencing any kinds of things such as examples, theorems, exercises and so on
  • pandoc-ling, for formatting, numbering and cross-referencing linguistic examples
  • pandoc-listof, for creating lists of any kinds (deprecated)
  • pandoc-amsthm: a pandoc package that defines the use of amsthm through YAML front matter, targeting HTML and LaTeX output. For HTML, CSS counters are used and defined in a template (by the YAML variables). For LaTeX, the amsthm package is used and defined in a template (by the YAML variables).
  • definitionlist-filter.lua, for converting some definition lists to theorem-like (amsthm) environments and some references to cref tags in LaTeX

Math related

  • mathjax-pandoc-filter rendering math to SVG using mathjax-node
  • asciimathml-pandocfilter: to add read support for AsciiMathML syntax through conversion into LaTeX
  • pandoc-unicode-math replaces Unicode math symbols and Greek letters like ∀, ∈, →, λ, or Ω in math environments with equivalent LaTeX commands like \forall, \in, \rightarrow, \lambda, or \Omega.
  • SugarTeX is a more readable LaTeX language extension and transcompiler to LaTeX. Fast Unicode autocomplete in Atom editor via SugarTeX Completions for Atom.
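
The kind of substitution described for pandoc-unicode-math can be sketched roughly as follows. This is not that filter's code, only an illustration over pandoc's JSON AST using the five symbols mentioned above. The function names are hypothetical, and the trailing space after each command is an assumption made here so that a following letter does not fuse with the command name.

```python
# Symbol table taken from the examples in the list above.
UNICODE_TO_LATEX = {
    "∀": "\\forall ",
    "∈": "\\in ",
    "→": "\\rightarrow ",
    "λ": "\\lambda ",
    "Ω": "\\Omega ",
}


def replace_unicode_math(elem):
    """Rewrite the source of a Math element; leave everything else untouched."""
    if elem.get("t") == "Math":
        math_type, source = elem["c"]
        for symbol, command in UNICODE_TO_LATEX.items():
            source = source.replace(symbol, command)
        return {"t": "Math", "c": [math_type, source]}
    return elem


def walk(node):
    """Visit every dict and list in the AST, applying the replacement."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        node = replace_unicode_math(node)
        return {key: walk(value) for key, value in node.items()}
    return node


para = {"t": "Para", "c": [{"t": "Math", "c": [{"t": "InlineMath"}, "∀x∈S"]}]}
print(walk(para)["c"][0]["c"][1])  # prints: \forall x\in S
```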

LaTeX related

RAW related

  • Include Files: finds all the inline code blocks with attribute include, and replaces their contents with the contents of the file given
  • code-includes.lua Include code from source files. Keep your examples and documentation compiled and in-sync. Similar to the above except you don't have to install Haskell and you can select by line number.
  • pandoc-dot2tex-filter, a filter that converts dot notation to PGF/TikZ graphics for LaTeX/PDF rendering.
  • HTML comment to LaTeX comment: a filter that converts HTML comments to LaTeX comments
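
The include-style filters in this list share one mechanism: find a code block carrying an `include` attribute and swap its contents for the contents of the named file. A minimal sketch of that step over pandoc's JSON AST is shown below. It is not the code of any filter listed here, it only looks at the blocks it is handed, and the function name and temporary file are invented for the demonstration.

```python
import os
import tempfile


def include_file(elem):
    """Replace a CodeBlock's text with the file named by its `include` attribute."""
    if elem.get("t") != "CodeBlock":
        return elem
    # A CodeBlock carries [[identifier, classes, key-value pairs], text].
    (ident, classes, keyvals), _old_text = elem["c"]
    path = dict(keyvals).get("include")
    if path is None:
        return elem
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    return {"t": "CodeBlock", "c": [[ident, classes, keyvals], text]}


# Demonstration with a throwaway file, so the example runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "snippet.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("print('hi')\n")
    block = {"t": "CodeBlock", "c": [["", [], [["include", path]]], ""]}
    print(include_file(block)["c"][1])
```

A real filter would also need to decide how to handle missing files and relative paths; the sketch simply lets the `open` call raise.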

Tables related

Text related

  • pandoc-abbreviations allows the use of arbitrary abbreviations, defined in an abbreviations file or in the source document's YAML header, which are replaced on processing. Useful for maintaining consistency of terminology etc.
  • pandoc-acronyms is a filter for managing acronyms. It replaces acronyms like "FAQ" at first use with the full text "frequently asked questions (FAQ)". It is installed using pip.
  • count-para.lua adds numbering to paragraphs to allow for detailed citation (in a scientific context). Proposed as a replacement for page-number referencing, which does not work with adaptive design.
  • pandoc-lang automatically detects the (natural) language of text, as well as the programming language of code blocks
  • pandoc-mustache replaces variables like {{varname}} in a pandoc document with their values, which are stored in a separate YAML file.
  • pandoc-quotes.lua and the older pandoc-quotes replace non-typographic quotation marks with typographic ones for languages other than US English.
  • transclude.lua Include content from another file just like in AsciiDoc and ReST.
  • columns Multiple columns support.
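
Variable substitution of the `{{varname}}` kind mentioned above can be sketched as follows. This is an illustration and not pandoc-mustache's implementation. The values come from a plain dictionary here rather than a YAML file, to stay within the standard library, and leaving unknown placeholders untouched is a choice made for this example. All names are hypothetical.

```python
import re

# Matches {{name}} with no inner spaces, so a placeholder fits in one Str token.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def substitute(text, variables):
    """Replace known placeholders in `text`; unknown ones are left as written."""
    def lookup(match):
        return str(variables.get(match.group(1), match.group(0)))
    return PLACEHOLDER.sub(lookup, text)


def fill_placeholders(node, variables):
    """Walk the JSON AST and substitute inside every Str element."""
    if isinstance(node, list):
        return [fill_placeholders(item, variables) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return {"t": "Str", "c": substitute(node["c"], variables)}
        return {key: fill_placeholders(value, variables) for key, value in node.items()}
    return node


inlines = [{"t": "Str", "c": "Dear"}, {"t": "Space"}, {"t": "Str", "c": "{{name}},"}]
print(fill_placeholders(inlines, {"name": "Ada"})[2]["c"])  # prints: Ada,
```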

Running Code related

  • R-pandoc for generating R plots
  • filter_pandoc_run_py for executing Python code written in code blocks and embedding the print output and pyplot figures
  • pandoc-plot to generate and embed figures based on code blocks in documents, using a variety of toolkits (e.g. Matplotlib, MATLAB, gnuplot, ggplot2, etc.). Easy integration with Haskell libraries (e.g. Hakyll)
  • Knitty is a Pandoc filter for reproducible reports via Jupyter and Pandoc (a fork of Stitch, which is a Knitr/RMarkdown-like library). Insert Python code (or other Jupyter kernel code) into the Markdown document, or write in plain Python/Julia/R/any-kernel-lang with block-commented Markdown, and have the code's results appear in the Pandoc output document.
  • pandocsql which uses an in-memory SQLite database. It creates tables from tables in the document and executes queries in code blocks, showing the results as tables.
  • pannb, a pandoc filter to control the output from ipynb input. It covers three things: the metadata block, filtering out Python code, and converting all raw blocks to native pandoc AST. The three can be mixed and matched.

Citation related

  • pandoc-manubot-cite allows citing persistent identifiers directly like @doi:10/c7np or @pubmed:29618526. Removes the need for a reference manager by supporting DOIs, PubMed IDs, URLs, ISBNs, Wikidata IDs, and the hundreds of other ID types registered with https://identifiers.org. Written in Python. Available on PyPI.
  • pandoc-url2cite allows citing certain persistent identifiers directly (URLs, ISBNs, and DOIs). Basically a less opinionated and simpler version of pandoc-manubot-cite. Written in TypeScript. Available on npm.
  • pandoc-zotxt.lua looks up sources for citations in Zotero.

Others

  • Adding support for indexing with the syntax (# term, subterm) in HTML and LaTeX
  • Adding non-breaking spaces inside a URL to preserve formatting
  • toc-css Lua filter changing the appearance of the Pandoc basic HTML table of contents by some CSS and vanilla Javascript.
  • lablinkfix updates links to the Swedish Labour Movement Archives and Library catalogues.
  • second-date changes date metadata to a different strftime format using Python's dateutil.
  • pandoc_abnt allows specifying the source of images and tables, and automatically corrects alíneas according to the Brazilian standard for academic writing (ABNT NBR 14724:2011).
  • nheengatu provides several resources for publishing multimedia content through formats such as LaTeX, HTML and EPUB.
  • code-includes.lua Include code from source files. Keep your examples and documentation compiled and in-sync.
  • pandoc-logic-proof provides a way to write logic proofs in pandoc markdown and produce attractive output.