Pandoc/Quarto filter for smart first-line indents in HTML and LaTeX/PDF.
Quarto/Pandoc's support for first-line indents is limited: it is not available in HTML output and is delegated to LaTeX in PDF output. This filter provides a first-line indentation style with smart defaults, full customization, and manual control for fine-grained adjustments.
Paragraphs are typically separated in either of two ways: by vertical whitespace (common on the web) or by indenting their first line (common in books). There is some variation in the first-line indent style itself: some apply it to every paragraph, others don't apply it to paragraphs below a section heading, blockquote or the like. Indents also vary in size, most commonly from half an em (roughly the width of the letter 'm') for narrow text to 3 ems for wide text. 1 to 1.5em are probably the most standard values.
Quarto and Pandoc use vertical whitespace by default. In HTML output that cannot be changed. In LaTeX/PDF output one can switch to first-line indent by setting the metadata variable `indent` to `true`. There are some limitations, however:
- Standard English style is applied: no first-line indents after headings.
- But first-line indents are applied below titles. This LaTeX default isn't good typography: the first paragraph doesn't need a separation.
- Every line following a blockquote, list, code block or other block element is treated as a new paragraph, hence indented. This is most often (but not always) unwanted, as the text following a blockquote or list is usually a continuation of the same paragraph.
- The size of the first-line indent is determined by the underlying LaTeX document "class" used. The standard classes (`article`, `book`) and Memoir (`memoir`) use 1.5 em, the KOMA classes (`scrartcl`, `scrbook`) 1 em. Pandoc uses the standard article class by default, Quarto its KOMA equivalent, so you get 1.5 em with the first and 1 em with the second. You need to insert LaTeX code in your document to change this.
This filter provides first-line indentation in HTML output and improves its handling in both PDF and HTML outputs.
- First-line indentation is used to separate paragraphs, unless `indent` is set to `false`.
- It generates HTML outputs with first-line indent style. That is done by appending CSS code to the document's `header-includes` metadata field. This can be disabled if you want to provide your own CSS.
- You can keep or remove the indent of specific paragraphs manually, by adding `\indent` or `\noindent` at the beginning of the paragraph in the markdown source. These are LaTeX commands but will work with HTML output too.
- First-line indentation is not applied after certain block elements: by default, lists, block quotes, code blocks and horizontal rules. You can specify which through the filter's options. This can be overridden on a per-paragraph basis by inserting `\indent` at the beginning of the paragraph.
- The width of first-line indentations can be customized.
Install this filter in a document's folder by running:
quarto install extension dialoa/first-line-indent
on the command line (terminal in RStudio).
Use it by adding `first-line-indent` to the `filters` entry of your YAML header.
---
filters:
- first-line-indent
---
With plain Pandoc, copy the file `first-line-indent.lua` into your document folder. Pass the filter to Pandoc via the `--lua-filter` (or `-L`) command line option.
pandoc --lua-filter first-line-indent.lua ...
Or specify it in a defaults file (see Pandoc's manual: defaults). You can place the filter file in Pandoc's user data dir, or in an arbitrary folder (`-L path/to/first-line-indent.lua`). See Pandoc's manual: Lua filters.
With R Markdown, copy the file `first-line-indent.lua` into your document folder. Use `pandoc_args` to invoke the filter. See the R Markdown Cookbook for details.
---
output:
  word_document:
    pandoc_args: ['--lua-filter=first-line-indent.lua']
---
You can place the filter in another folder, provided you specify its path:
---
output:
  word_document:
    pandoc_args: ['--lua-filter=../path/to/first-line-indent.lua']
---
See also the sample input file and the resulting HTML output.
To apply first-line indentation to your entire document, set `indent` to `true` in the YAML header:
---
indent: true
---
In Quarto, `indent` may also be set per format:
---
format:
  html:
    indent: false
  pdf:
    indent: true
---
The filter applies some typesetting adjustments, e.g. no first-line indentation after lists. See [typesetting-background] below for details. If you're not happy with the adjustments, you can control them via options and manually apply or remove indents from some paragraphs.
Whether or not first-line indentation is activated for the whole document, you can manually add or remove it from a particular paragraph by inserting `\indent` or `\noindent` at the beginning of the paragraph:
> This is a blockquote

\indent This paragraph will have an indent even though it follows a
blockquote.
Even though `\indent` and `\noindent` are LaTeX commands, the filter handles them in HTML output too.
Warning: citations after `\indent`. If the paragraph starts with a square-bracketed citation, `\indent` or `\noindent` must be marked as a "Raw Inline", as follows:
`\indent`{=tex} [@Smith2008] says....
That is because Pandoc/Quarto interprets `\indent [@cite]` as a LaTeX command with a bracketed option rather than a LaTeX command followed by a citation.
Filter options are specified in the document's YAML header:
indent: true
first-line-indent:
  size: 2em
  auto-remove: true
  set-metadata-variable: true
  set-header-includes: true
  remove-after:
  - BlockQuote
  - BulletList
  - CodeBlock
  - DefinitionList
  - HorizontalRule
  - OrderedList
  dont-remove-after: Table
  remove-after-class:
  - statement
  dont-remove-after-class:
Different options can be provided for different output formats. This is standard with Quarto, but the filter also reads these with Pandoc:
format:
  html:
    indent: true
    first-line-indent:
      size: 2em
  pdf:
    indent: true
    first-line-indent:
      size: 1.5em
Format-specific options override global ones. For instance, to disable first line indentation in HTML output only:
# Format-specific options
format:
  html:
    indent: false
    first-line-indent:
      set-header-includes: false
# Global options
indent: true
first-line-indent:
  size: 2em
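The override behaviour can be pictured with a small Python sketch. This is an illustration only, not the filter's Lua code: the function name `merge_options` is hypothetical, and merging nested maps key by key is an assumption (the text above only states that format-specific options override global ones).

```python
from copy import deepcopy

def merge_options(global_opts, format_opts):
    """Return global_opts overridden by format_opts.

    Assumption: nested maps (such as first-line-indent) are merged key by
    key, so a format-specific block only replaces the keys it mentions.
    """
    merged = deepcopy(global_opts)
    for key, value in format_opts.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_options(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged

# The YAML example above, written as Python dictionaries.
global_opts = {"indent": True, "first-line-indent": {"size": "2em"}}
html_opts = {"indent": False, "first-line-indent": {"set-header-includes": False}}

effective_html = merge_options(global_opts, html_opts)
print(effective_html)
```

Under that assumption, HTML output ends up with `indent: false` while still seeing the global `size`, and other formats keep the global settings unchanged.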
Options can be passed in a separate metadata file (Quarto, Pandoc) or a defaults file (Pandoc only).
`indent` (default `true`)
: If set to `false`, paragraphs are separated with vertical whitespace rather than first-line indentation. This essentially deactivates the filter, though `\indent` can still be used to add an indent to individual paragraphs for HTML output as well as PDF.
`size` (default `nil`)
: String specifying the size of the first-line indent. Must be in a format suitable for all desired outputs. `1.5em`, `2ex`, `.5pc`, `10pt`, `25mm`, `2.5cm` and `0.3in` all work in LaTeX and HTML. `25px` only works in HTML. LaTeX commands (`\textheight`) are not supported.
`auto-remove` (default `true`)
: Whether the filter automatically removes first-line indentation from paragraphs that follow blocks of given types, unless they start with `\indent`. Set to `false` to disable. Use the `remove-after...` and `dont-remove-after...` options below to control which block types and Div classes are handled that way. By default first-line indentation is removed after BlockQuote, CodeBlock, lists (DefinitionList, BulletList, OrderedList, which include numbered example lists) and HorizontalRule blocks.
`set-metadata-variable` (default `true`)
: Whether the filter adds the metadata variable `indent` with the value `true` when it is missing. Without this, Pandoc's LaTeX template does not use first-line indentation in PDF output.
`set-header-includes` (default `true`)
: Whether the filter should add formatting code to the document's `header-includes` metadata field. Set it to `false` if you use a custom template instead.
`remove-after`, `dont-remove-after`
: Whether to remove first-line indentations automatically after blocks of a certain type. These options can be a single string or a list of strings. The strings are case-sensitive and should correspond to block types in Lua filters: BlockQuote, BulletList, CodeBlock, DefinitionList, Div, Header, HorizontalRule, LineBlock, Null, OrderedList, Para, Plain, RawBlock, Table. Inactive if `auto-remove` is `false`.
`remove-after-class`, `dont-remove-after-class`
: Decide whether to remove first-line indentation automatically after elements of certain classes. For instance, you may decide that when a block with class "continuing" is followed by a paragraph, the latter should not be first-line indented. Useful for Div elements, if you use Divs of certain classes to wrap and typeset material that doesn't end a paragraph. Inactive if `auto-remove` is `false`.
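Because `size` must suit every output format it is used with, it can help to check a value before building. The following Python sketch classifies a size string using only the units listed under `size` above. The function name `formats_supporting` and the regular expression are hypothetical, and units not mentioned in this README are simply treated as unknown rather than as invalid.

```python
import re

# Units this README lists as working in both LaTeX and HTML.
SHARED_UNITS = {"em", "ex", "pc", "pt", "mm", "cm", "in"}
# Units this README lists as working in HTML only.
HTML_ONLY_UNITS = {"px"}

# A number (with optional decimal part) followed directly by a unit name.
SIZE_PATTERN = re.compile(r"^(\d+\.?\d*|\.\d+)([a-z]+)$")

def formats_supporting(size):
    """Return the output formats in which `size` is expected to work."""
    match = SIZE_PATTERN.match(size.strip())
    if match is None:
        # Not a plain number-plus-unit, e.g. a LaTeX command like \textheight.
        return set()
    unit = match.group(2)
    if unit in SHARED_UNITS:
        return {"html", "latex"}
    if unit in HTML_ONLY_UNITS:
        return {"html"}
    return set()  # unit not covered by this README

for candidate in ["1.5em", ".5pc", "25px", r"\textheight"]:
    print(candidate, sorted(formats_supporting(candidate)))
```

A value that returns only `html` is best placed under the HTML format's own `first-line-indent` options rather than in the global ones.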
To illustrate, suppose you don't want the filter to remove first-line indent after definition lists. You can add the following lines in the document's metadata block (if the source is markdown):
first-line-indent:
  dont-remove-after: DefinitionList
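How these options might combine can be sketched in Python. This is a rough model, not the filter's actual Lua logic. The names `removes_indent_after` and `as_set` are hypothetical, and two points are assumptions: that `remove-after` adds to the default block types rather than replacing them, and that a `dont-remove-after...` entry wins over a matching `remove-after...` entry.

```python
# Default block types, as described in this README.
DEFAULT_REMOVE_AFTER = {
    "BlockQuote", "BulletList", "CodeBlock",
    "DefinitionList", "HorizontalRule", "OrderedList",
}

def as_set(value):
    """Accept a missing value, a single string, or a list of strings."""
    if value is None:
        return set()
    if isinstance(value, str):
        return {value}
    return set(value)

def removes_indent_after(block_type, block_classes, options,
                         starts_with_indent=False):
    """Decide whether a paragraph following the given block loses its indent."""
    if starts_with_indent:
        return False  # an explicit \indent always keeps the indent
    if not options.get("auto-remove", True):
        return False  # all remove-after options are inactive
    # Assumption: user types extend the defaults; dont-remove-after wins.
    types = DEFAULT_REMOVE_AFTER | as_set(options.get("remove-after"))
    types -= as_set(options.get("dont-remove-after"))
    classes = as_set(options.get("remove-after-class"))
    classes -= as_set(options.get("dont-remove-after-class"))
    return block_type in types or bool(classes & set(block_classes))

# The example above: keep the indent after definition lists.
options = {"dont-remove-after": "DefinitionList"}
print(removes_indent_after("DefinitionList", [], options))
print(removes_indent_after("BulletList", [], options))
```

With the example options, a paragraph after a definition list keeps its indent while a paragraph after a bullet list still loses it.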
In LaTeX output the filter adds `\noindent` commands at the beginning of paragraphs that shouldn't be indented. These can be controlled in LaTeX as usual.
In HTML output, paragraphs that are explicitly marked to have no first-line indent are preceded by an empty `div` with class `no-first-line-indent-after`, and those that are explicitly marked (with `\indent` in the markdown source) to have a first-line indent are preceded by an empty `div` with class `first-line-indent-after`, as follows:
<ul>
<li>A bullet</li>
<li>list</li>
</ul>
<div class="no-first-line-indent-after"></div>
<p>This paragraph should not have first-line indent.</p>
...
<div class="first-line-indent-after"></div>
<p>This paragraph should have first-line indent.</p>
These can be styled in CSS as follows:
p {
  text-indent: 1.5em;
  margin: 0;
}
header p {
  text-indent: 0;
  margin: 1em 0;
}
:is(h1, h2, h3, h4, h5, h6) + p {
  text-indent: 0;
}
li > p, li > div > p, li > div > div > p {
  text-indent: 0;
  margin-bottom: 1rem;
}
div.no-first-line-indent-after + p {
  text-indent: 0;
}
div.first-line-indent-after + p {
  text-indent: SIZE;
}
The first four rules provide global first line indentation.
- The `p` rule adds first-line indentation to every paragraph and removes the default vertical space between paragraphs.
- The `header p` rule restores the default whitespace separation for paragraphs in the `<header>` element.
- The `:is(h1, h2, h3, h4, h5, h6) + p` rule removes first-line indentation from every paragraph that follows a heading.
- The `li > p` rule restores the vertical whitespace separation style within lists. It only targets paragraphs that are direct children of a list item (`li > p`) rather than all paragraphs within a list (`li p`), in case a list item contains e.g. a block quote that requires first-line indentation. However, in case a list item's paragraphs are contained within some Div, it also targets paragraphs that are children of a Div, or sub-Div, of a list item (`li > div > p` and `li > div > div > p`).
The last two rules provide explicit local indentation. The `div.no-first-line-indent-after + p` rule removes the indent from paragraphs placed just after a Div with the `no-first-line-indent-after` class, and the `div.first-line-indent-after + p` rule keeps it in paragraphs that follow a `first-line-indent-after` Div.
To avoid interference with Dialoa's labelled-lists filter, the indentation filter also adds the following rule:
div.labelled-lists-list > p {
  text-indent: 0;
}
The filter applies first line indent style within block quotes, with no indent on the first line.
To achieve this in PDF output, the LaTeX `quote` environment (used by Quarto/Pandoc for block quotes) is redefined as follows in `header-includes`:
\renewenvironment{quote}
{\list{}{\listparindent 1.5em%
\itemindent \listparindent
\rightmargin \leftmargin
\parsep \z@ \@plus \p@}%
\item\relax}
{\endlist}
This is the definition of LaTeX's `quotation` environment; see the standard classes source.
If you redefine the `quote` environment, you should use this code as a basis.
The filter adds its commands at the beginning of the `header-includes` field. You can thus use `header-includes` to override the filter's commands.
Issues and PRs welcome.
Copyright 2021-2023 Julien Dutant. License MIT - see license file for details.