Support for table column spans, table attributes in AST #1024

dashed · 2013-10-17T00:35:54Z

I tried looking for this in Pandoc's docs. None of the Markdown table flavours supports column spanning. I don't think any known Markdown flavour supports column spanning, except for MultiMarkdown.

Are there plans to support this?

jgm · 2013-10-17T01:57:31Z

+++ Alberto Leal [Oct 16 13 17:35 ]:

> I tried looking for this in Pandoc's docs. None of the Markdown table flavours supports column spanning. I don't think any known Markdown flavour supports column spanning, except for MultiMarkdown.
>
> Are there plans to support this?

Long-term, yes, I'd like to.

jokogr · 2015-10-22T15:00:00Z

Are there any plans for this? I'm also interested in this.

jgm · 2015-10-22T16:57:23Z

+++ jokogr [Oct 22 15 08:00 ]:

> Are there any plans for this? I'm also interested in this.

Yes, it would be good to do, but it's a big change as it
requires changes in the underlying document model.

jokogr · 2015-10-23T16:27:40Z

Is there anything I could do to speed this up?

adius · 2015-11-12T15:13:21Z

+1

colourwonder · 2015-12-23T00:12:08Z

+1

brianfeister · 2016-03-10T05:57:20Z

+1

ousia · 2016-03-10T06:03:44Z

> Are there any plans for this? I'm also interested in this.

@jokogr, sorry for my obvious reply: providing a patch may help.

jgm · 2016-03-10T06:54:25Z

+++ Pablo Rodríguez [Mar 09 16 22:03 ]:

> > Are there any plans for this? I'm also interested in this.
>
> @jokogr, sorry for my obvious reply: providing a patch may help.

This would require some major architecture change,
including changes in pandoc-types, all readers and writers.

brianfeister · 2016-03-10T16:50:57Z

@jgm is there any way to achieve this with a two-step process? Compile multimarkdown to an intermediate state and then process that result with pandoc?

jgm · 2016-03-10T16:53:56Z

+++ Brian Feister [Mar 10 16 08:50 ]:

> @jgm is there any way to achieve this with a two-step process?
> Compile multimarkdown to an intermediate state and then process that result
> with pandoc?

No, the problem is very simple. Pandoc's internal document
model doesn't allow colspans or rowspans. It's on the list
of things to improve.

ickc · 2016-10-08T20:46:11Z

Side note: this issue should probably be given the AST change label.

In an attempt to make a suggestion here, I stumbled on issue #3154: pandoc "almost" has a fifth table extension, namely using native HTML as a table. If that were true, then once the AST is changed to allow colspan and rowspan, we could start using them immediately, before settling on a syntax (or syntaxes) for them. For example, in a Markdown-to-LaTeX conversion, pandoc could consume the colspan and rowspan and emit multicolumn and multirow.

The reason for this suggestion is that settling on a syntax is often tricky and requires a lot of discussion (except possibly for "mmd_colspan", since it already exists). If spans in HTML tables required only the AST change (which is a prerequisite for any new syntax anyway), that would make things easier.
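The LaTeX side of this idea can be sketched in a few lines of Python. This is only an illustration of the mapping being suggested, not pandoc's writer code: the helper name `latex_cell` is hypothetical, the column alignment is fixed to `c` for brevity, and `\multirow` assumes the LaTeX `multirow` package is loaded.

```python
def latex_cell(text: str, rowspan: int = 1, colspan: int = 1) -> str:
    """Wrap cell text in \\multirow / \\multicolumn when it spans cells.

    Hypothetical helper for illustration; it does not escape LaTeX
    special characters in the text.
    """
    if rowspan > 1:
        # \multirow{<rows>}{<width>}{<text>}; "*" means natural width.
        text = f"\\multirow{{{rowspan}}}{{*}}{{{text}}}"
    if colspan > 1:
        # \multicolumn{<cols>}{<alignment>}{<text>}
        text = f"\\multicolumn{{{colspan}}}{{c}}{{{text}}}"
    return text


plain = latex_cell("Area")
wide = latex_cell("Subjects", colspan=3)
tall = latex_cell(".99", rowspan=2)
```

A cell that spans in both directions ends up with the `\multirow` nested inside the `\multicolumn`, which is the usual way of combining the two.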

jgm · 2016-10-09T20:16:44Z

An AST change to tables would require changes in both
readers and writers. You're right that we would not
necessarily need to support a native Markdown table
syntax with rowspans and colspans right away. We
could just implement HTML tables. But we'd still need
to change ALL the writers to handle rowspans and colspans,
since these would be in the basic table model. That's
already somewhat daunting (I suppose Markdown could fall
back to HTML, but RST couldn't).

ickc · 2016-10-12T00:47:09Z

@jgm said,

> But we'd still need
> to change ALL the writers to handle rowspans and colspans,
> since these would be in the basic table model.

I don't know much about the design of the AST. Can a new AST including column & row span be a superset of the current AST?

If so, the transition could be made smoother, with a gradual rollout of the feature rather than changing the AST and all the readers and writers at the same time:

1. The AST can be changed first. Since it is a superset of the original, every reader and writer would still work.
2. The column/row span extension can be activated by a feature flag in each reader and writer, documented in a matrix like this one from OpenZFS.
3. In general, writers have higher priority than readers for implementing the feature flag (except for the Markdown reader), since every existing reader already generates a valid AST.
4. The extension is activated only when both the from-format reader and the to-format writer have the feature flag.
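The gating rule in the last step can be sketched as follows. This only illustrates the proposal: the `SPAN_SUPPORT` matrix, its entries, and the `spans_enabled` function are invented for the example and do not describe anything pandoc provides.

```python
# Hypothetical capability matrix: which readers and writers understand spans.
# The formats and values are made up for illustration.
SPAN_SUPPORT = {
    "reader": {"html": True, "markdown": False},
    "writer": {"html": True, "latex": True, "rst": False},
}


def spans_enabled(from_format: str, to_format: str) -> bool:
    """Return True only when both ends of the conversion support spans."""
    reader_ok = SPAN_SUPPORT["reader"].get(from_format, False)
    writer_ok = SPAN_SUPPORT["writer"].get(to_format, False)
    return reader_ok and writer_ok


html_to_latex = spans_enabled("html", "latex")
html_to_rst = spans_enabled("html", "rst")
```

Formats missing from the matrix are treated as not supporting spans, so an unknown reader or writer never switches the extension on.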

lf-araujo · 2016-10-12T05:34:55Z

One of the reasons this feature is important is that in scientific settings, whenever one needs to compare multiple groups, some kind of subheader is needed in the table. Having that would give pandoc a firm footing in scientific writing.

jgm · 2016-10-12T07:38:16Z

+++ lf_araujo [Oct 11 16 22:34 ]:

> One of the reasons this feature is important is that in scientific
> settings, whenever one needs to compare multiple groups, some kind of
> subheader is needed in the table. Having that would give pandoc a firm
> footing in scientific writing.

Actually there are two issues here, right?

- column spans
- the ability to have multiple rows in a header

jgm · 2016-10-12T07:48:56Z

No, changing the table AST would definitely require changes
to all writers and readers, immediately. The writers would
all have to know what to do when they encounter colspans.

ickc · 2016-10-12T08:15:39Z

I don't understand. Can we put a switch in the reader, so that when a pandoc command is run and the output writer is known not to understand colspans, the reader does not parse colspans? It seems some kind of switch is already in use, for example when an extension is turned off on the command line.

I can see that a problem might occur if the input format is native AST or JSON, where the reader cannot switch off the colspan/rowspan features (or at least not easily). But people using AST/JSON input should know what they are doing (and could be shown an error message).

lf-araujo · 2016-10-12T08:35:19Z

> Actually there are two issues here, right?

Yes. The ability to format subheaders would be also needed.

jgm · 2016-10-12T08:57:21Z

+++ ickc [Oct 12 16 01:15 ]:

> I don't understand. Can we put a switch in the reader, so that when a
> pandoc command is run and the output writer is known not to understand
> colspans, the reader does not parse colspans? It seems some kind of
> switch is already in use, for example when an extension is turned off
> on the command line.

No, readers are always independent of writers. But you're
not seeing the problem. Even if we had a switch that
depended on the output format, the readers would still need
to be rewritten because of changes in the Pandoc type
(specifically the Table constructor).
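The point about the Table constructor can be illustrated with a rough Python analogy. Pandoc itself is written in Haskell; `OldTable`, `NewTable`, `Cell`, and `upgrade` below are simplified, hypothetical stand-ins rather than pandoc's actual types. The sketch shows that every old table can be lifted into a span-aware shape, yet code that builds or consumes the old shape still has to change because the cell type itself is different.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OldTable:
    # Old shape: a cell is just its content, so spans cannot be expressed.
    header: List[str]
    rows: List[List[str]]


@dataclass
class Cell:
    # New shape: every cell carries its own span information.
    content: str
    rowspan: int = 1
    colspan: int = 1


@dataclass
class NewTable:
    header: List[Cell]
    rows: List[List[Cell]]


def upgrade(table: OldTable) -> NewTable:
    """Lift an old table into the new shape; every cell spans exactly 1x1."""
    return NewTable(
        header=[Cell(text) for text in table.header],
        rows=[[Cell(text) for text in row] for row in table.rows],
    )


old = OldTable(header=["Area", "p-value"], rows=[["Left insula", ".05"]])
new = upgrade(old)
```

A writer that joined the entries of a row as strings would no longer work on `new.rows`, since each entry is now a `Cell`. That is the kind of adjustment every reader and writer would need, regardless of any switch.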

ickc · 2016-10-12T09:18:54Z

I see. Then that is really a huge task. And I guess that since changing the AST causes compatibility issues, one doesn't want to change it often, which means the strategy is probably to make several important AST changes at once, which makes it an even bigger challenge.

And since an AST change will break backward compatibility, is it safe to say it will only come in pandoc v2.0? In that case, should a milestone be set up (even with no deadline), with those important AST changes added to it (among other things)?

lf-araujo · 2016-10-13T02:04:19Z

Thanks for the attention. I will leave two models of tables that are prevalent in papers. The first should be approachable in future iterations of pandoc; the second one, however, is a little trickier and may not be.

| Area                      |  Subjects       |      Controls   |
|---------------------------|-----------------|-----------------|
|                           |SD | se | p-value|SD | se | p-value|
|===========================|=================|=================|
| Standardised coefficients                                   |||
|===========================|=================|=================|
| Left fusiform area        | 1 | 2 | .05     | 3 | 4 |  .05    |
| Right insula              | 5 | 6 | .05     | 7 | 8 |  .05    |
| Left insula               | 5 | 6 | .05     | 7 | 8 |  .05    |
| Right fusiform area       | 1 | 2 | .05     | 3 | 4 |  .05    |
|===========================|=================|=================|
| Factor loadings                                             |||
|===========================|=================|=================|
| X                         | 1 | 2 | .05     | 3 | 4 |  .05    |
| Y                         | 5 | 6 | .05     | 7 | 8 |  .05    |
| Z                         | 5 | 6 | .05     | 7 | 8 |  .05    |

The equals signs represent the places where rules usually separate the subheaders.

The second, trickier table occurs when one wants to span two cells vertically. This is not essential; I am including it as an example of a common type of table (a description of the population, in this case).

| **Variables**                  |  **Healthy subjects (mean)** |   **Patients (mean)**   | **p-value**  |
|--------------------------------|------------------------------|-------------------------|--------------|
| **Age**                        |       11                     |        28.51            |    .01^(U)^  |
| **Gender**                                                                                          ||||                                                                     
| Male (%)                       |        12%                   |      99%                |              |
| Female (%)                     |        13%                   |         88%             |  .99^(a)^    |
| **Time from onset (days)**     |  NA                          |        111              |              |
| **Education (mean, in years)** |  10                          |     5                   |   .11 ^(U)^  |

What typically happens is that the cell containing the value .99 is merged with the cell above it, since that statistic concerns both Male and Female. I hope I am being clear.
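HTML already has spans, so the merge described above can be written down in that format. The following Python sketch renders the Gender rows of the second table with the p-value cell spanning two rows. The `render_row` helper and the `(text, rowspan, colspan)` tuples are hypothetical and not part of pandoc.

```python
from html import escape


def render_row(cells):
    """Render one <tr>; each cell is a (text, rowspan, colspan) tuple."""
    parts = []
    for text, rowspan, colspan in cells:
        attrs = ""
        if rowspan > 1:
            attrs += f' rowspan="{rowspan}"'
        if colspan > 1:
            attrs += f' colspan="{colspan}"'
        parts.append(f"<td{attrs}>{escape(text)}</td>")
    return "<tr>" + "".join(parts) + "</tr>"


rows = [
    # "Gender" spans all four columns.
    [("Gender", 1, 4)],
    # The p-value cell starts here and reaches down into the next row.
    [("Male (%)", 1, 1), ("12%", 1, 1), ("99%", 1, 1), (".99", 2, 1)],
    # One cell fewer: the fourth column is occupied by the span from above.
    [("Female (%)", 1, 1), ("13%", 1, 1), ("88%", 1, 1)],
]
html_rows = [render_row(row) for row in rows]
```

The row below a vertical span simply omits the covered cell, which is why a table model needs explicit span information to know how wide each row really is.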

jgm · 2016-11-07T08:47:34Z

Pandoc currently allows at most one header row, which must be at the top. A rule is inserted below it in default LaTeX output.

One could try to separate conceptually between being a header cell and having a rule under it, so that a cell could have one of these properties without the other. Perhaps the idea in @lf-araujo's example is that a rule of hyphens --- divides the header from the rest, while a rule of equals signs === indicates a rule? But do the hyphens also cause a rule to be rendered?

welly · 2019-09-25T08:51:17Z

I've not read this thread entirely, but I'm currently using pandoc (2.7.2) with wkhtmltopdf and am finding that the same issue occurs. Can anyone explain why this won't work when using wkhtmltopdf to perform the HTML-to-PDF conversion?

Thanks very much!

ickc · 2019-09-25T18:26:11Z

@welly, questions like this might fit pandoc-discuss better. The issue tracker is for discussing the feature request. Currently the pandoc AST doesn't have a model for this, so it isn't supported in pandoc yet.

rickywu · 2020-01-03T09:36:17Z

I'm really crying out for this feature.
Or is there any other tool that can convert pandoc's gfm output to this format:

|  | 2 | 3 |
| --- | --- | --- |
| a | @cols=2: |
| b |  | test<br>ricky |
| c | @rows=2: | <br>Yes |
| e | No |

petterreinholdtsen · 2020-02-11T10:01:06Z

I went looking for multi-span rows and columns in pandoc, and ended up here. My interest is not with the science community, but in the writing of government standards. We are currently experimenting with writing the standard texts in markdown and rst, and using pandoc to convert them to PDF (via docbook). But the lack of rowspan support in the tables makes it impossible to represent the layout of the original docx. It would thus be great if pandoc supported spans. Sorry for not having any code to contribute, but I thought it would be useful to know about yet another use case.

glenpike · 2020-02-13T11:15:32Z

I also found this after a search about formatting and laying out tables.
We're looking at CMS options, and being able to generate the following table (as an example) would be useful, as we require some of our tables to be pivoted 90 degrees for simple stuff:

<table>
  <caption>Dates and amounts</caption>
  <thead>
    <tr>
      <th scope="col">Date</th>
      <th scope="col">Amount</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">First 6 weeks</th>
      <td>£109.80 per week</td>
    </tr>
    <tr>
      <th scope="row">Next 33 weeks</th>
      <td>£109.80 per week</td>
    </tr>
    <tr>
      <th scope="row">Total estimated pay</th>
      <td>£4,282.20</td>
    </tr>
  </tbody>
</table>

The markup is based on the UK Gov's Design System: https://design-system.service.gov.uk/components/table/ - but we have a use case for these types of tables.

ivan4th · 2020-03-09T12:31:16Z

A strong use case for this is some 3GPP specs that contain bit fields, e.g. https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111
I would love to be able to view them as Markdown or Org in my Emacs, but all of the bit fields are off ;(

jgm · 2020-04-03T16:29:03Z

For those who have been following this issue, we have a PR for new table types here:
jgm/pandoc-types#66
I want to avoid excessive bikeshedding on this issue, but if you have final comments, now is the time. The type allows for column and row spans, short captions, attributes, multiple header rows, footers, intermediate headers, and overriding alignments at the cell level.
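To make the feature list concrete, here is a simplified Python sketch of a table model with those capabilities. It is only loosely modelled on the description above: the class and field names are illustrative assumptions, cell content is reduced to plain text, and the authoritative definition is the Haskell one in the linked pandoc-types PR. The sample data reuses the two header rows from the first table posted earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cell:
    text: str
    rowspan: int = 1
    colspan: int = 1
    align: Optional[str] = None  # None: inherit the column's alignment


Row = List[Cell]


@dataclass
class TableBody:
    intermediate_head: List[Row] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)


@dataclass
class Table:
    caption: str = ""
    short_caption: Optional[str] = None
    attributes: Dict[str, str] = field(default_factory=dict)
    head: List[Row] = field(default_factory=list)  # multiple header rows
    bodies: List[TableBody] = field(default_factory=list)
    foot: List[Row] = field(default_factory=list)


def row_width(row: Row) -> int:
    """Columns covered by a row's own cells (ignores rowspans from above)."""
    return sum(cell.colspan for cell in row)


table = Table(
    head=[
        [Cell("Area"), Cell("Subjects", colspan=3), Cell("Controls", colspan=3)],
        [Cell("")] + [Cell(name) for name in ("SD", "se", "p-value")] * 2,
    ],
    bodies=[
        TableBody(
            intermediate_head=[[Cell("Standardised coefficients", colspan=7)]],
            rows=[
                [Cell("Left fusiform area")]
                + [Cell(v) for v in ("1", "2", ".05", "3", "4", ".05")]
            ],
        )
    ],
)
widths = [row_width(row) for row in table.head]
```

Both header rows, the intermediate header, and the body row all cover seven columns, which is the consistency a writer would rely on when laying out spans.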

bpj · 2020-04-03T16:40:26Z

Is the Markdown syntax for these new features described in prose/with examples somewhere?

jgm · 2020-04-03T16:45:23Z

There's no markdown syntax. That's a separate issue, and it may be quite a while before that changes. You shouldn't even assume that we will provide a markdown syntax capable of representing all these distinctions. For now the focus is just on providing types capable of representing more complex tables, which can certainly be represented in other formats.

lrosenthol · 2020-04-30T12:53:21Z

@jgm is there a timeframe for a binary release with these great improvements? And is there a good way to contribute to input/output format handling?

jgm · 2020-04-30T16:17:11Z

This can be closed now: we have these features in the AST.
We don't yet have support for them in readers and writers, though.
I've opened some issues for adding these to some of the most commonly used formats (you'll find them among the most recent issues).

ickc · 2020-05-01T01:02:01Z

Will documentation on this AST be available? And are there any pointers for filter frameworks to pick this up (such as pandocfilters/panflute)?

Thanks @despresc for the effort of implementing it and @jgm, @mb21 for the code reviews. This is great progress!

jgm · 2020-05-01T02:21:39Z

There will be regular API docs once the new pandoc-types is released.

reagle · 2020-05-15T12:22:21Z

I've opened some issues for adding these to some of the most commonly used formats (you'll find them among the most recent issues).

mpickering added enhancement complexity:high status:more-discussion-needed labels Dec 8, 2014

tarleb added the AST change label Oct 9, 2016

jgm mentioned this issue Nov 7, 2016

Support for advanced table formatting #1930

Closed

This was referenced Dec 9, 2016

HTML Table colspan not support when converting to EPUB #1340

Closed

pandoc doesn't parse rst table correctly? #1159

Closed

This was referenced Jun 25, 2019

Clean up table headers and column merging slochower/smirnoff-host-guest-manuscript#56

Closed

merging adjacent table cells manubot/rootstock#240

Closed

jgm mentioned this issue Aug 21, 2019

Feature Request: Switch for placement of table captions before or after the table #2868

Closed

gnott mentioned this issue Sep 3, 2019

Test parsing table-wrap from .docx file elifesciences/decision-letter-parser#14

Closed

mb21 mentioned this issue Jan 3, 2020

From docx to gfm with merged cells table #6025

Closed

rickywu mentioned this issue Feb 22, 2020

Multiple lines table support nhn/tui.editor#305

Closed

despresc mentioned this issue Mar 21, 2020

Improving tables jgm/pandoc-types#65

Closed

jgm closed this as completed Apr 30, 2020

ickc mentioned this issue Jul 2, 2020

Supporting pandoc 2.11 sergiocorreia/panflute#142

Closed

m-rossi mentioned this issue Aug 10, 2020

Mutli-column tables broken in export m-rossi/jupyter-docx-bundler#24

Closed

danlobo02 mentioned this issue Aug 24, 2020

ICML Writer - Support new table features #6615

Open

fkohrt mentioned this issue Sep 8, 2020

Headerless tables not formatted correctly rstudio/rmarkdown#1893

Closed

sbarral mentioned this issue Sep 27, 2020

Update pandoc-ast to 1.21 oli-obk/pandoc-ast#6

Closed

ickc mentioned this issue Nov 10, 2020

Supporting pandoc 2.11 ickc/pantable#51

Closed

ickc mentioned this issue Nov 28, 2020

Supporting pandoc-types 1.22 elliottslaughter/rust-pandoc-types#2

Closed

xeruf mentioned this issue Feb 4, 2022

table.el support Jason-S-Ross/ox-context#36

Open
