Support for table column spans, table attributes in AST #1024

Closed
dashed opened this issue Oct 17, 2013 · 107 comments

@dashed

dashed commented Oct 17, 2013

I tried looking for this in Pandoc's docs. None of the Markdown table flavours it supports allow column spanning. As far as I know, MultiMarkdown is the only Markdown flavour that supports column spanning.

Are there plans to support this?

@jgm
Owner

jgm commented Oct 17, 2013

+++ Alberto Leal [Oct 16 13 17:35 ]:

I tried looking for this in Pandoc's docs. None of the Markdown table flavours it supports allow column spanning. As far as I know, MultiMarkdown is the only Markdown flavour that supports column spanning.

Are there plans to support this?

Long-term, yes, I'd like to.

@jokogr

jokogr commented Oct 22, 2015

Are there any plans for this? I'm also interested in this.

@jgm
Owner

jgm commented Oct 22, 2015

+++ jokogr [Oct 22 15 08:00 ]:

Are there any plans for this? I'm also interested in this.

Yes, it would be good to do, but it's a big change as it
requires changes in the underlying document model.

@jokogr

jokogr commented Oct 23, 2015

Is there anything I could do to speed this up?

@adius

adius commented Nov 12, 2015

+1

@colourwonder

+1

@brianfeister

+1

@ousia
Contributor

ousia commented Mar 10, 2016

Are there any plans for this? I'm also interested in this.

@jokogr, sorry for my obvious reply: providing a patch may help.

@jgm
Owner

jgm commented Mar 10, 2016

+++ Pablo Rodríguez [Mar 09 16 22:03 ]:

Are there any plans for this? I'm also interested in this.

@jokogr, sorry for my obvious reply: providing a patch may help.

This would require some major architecture change,
including changes in pandoc-types, all readers and writers.

@brianfeister

@jgm is there any way to achieve this with a two-step process? Compile multimarkdown to an intermediate state and then that result with pandoc?

@jgm
Owner

jgm commented Mar 10, 2016

+++ Brian Feister [Mar 10 16 08:50 ]:

@jgm is there any way to achieve this with a two-step process?
Compile multimarkdown to an intermediate state and then that result
with pandoc?

No, the problem is very simple. Pandoc's internal document
model doesn't allow colspans or rowspans. It's on the list
of things to improve.
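
To make the limitation concrete, here is a minimal Python sketch loosely modelled on the shape of the table type pandoc used at the time: a caption, per-column alignments and widths, a single header row, and body rows, where each cell is just a list of blocks. The class and field names (`LegacyTable`, `is_rectangular`) are illustrative assumptions, not pandoc's Haskell API. The point is only that a cell has no slot in which a span could be recorded.

```python
from dataclasses import dataclass
from typing import List

# In this sketch a cell is only its content (a list of block strings).
# There is nowhere to store a rowspan or colspan.
LegacyCell = List[str]


@dataclass
class LegacyTable:
    caption: str
    alignments: List[str]        # one entry per column
    widths: List[float]          # one entry per column
    header: List[LegacyCell]     # exactly one header row (or empty)
    rows: List[List[LegacyCell]]


def is_rectangular(table: LegacyTable) -> bool:
    """Every row must supply exactly one cell per column."""
    n = len(table.alignments)
    return len(table.header) in (0, n) and all(len(r) == n for r in table.rows)


table = LegacyTable(
    caption="Groups",
    alignments=["left", "center", "center"],
    widths=[0.0, 0.0, 0.0],
    header=[["Area"], ["Subjects"], ["Controls"]],
    rows=[[["Left insula"], ["5"], ["7"]]],
)
print(is_rectangular(table))
```

A MultiMarkdown table with a spanning cell has no faithful representation in such a structure, which is why an intermediate conversion step cannot help.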

@ickc
Contributor

ickc commented Oct 8, 2016

Sidenote: this issue should probably be given the label AST change.

While trying to make a suggestion here, I stumbled on issue #3154: pandoc "almost" has a 5th table extension, which is using native HTML as a table. If that were the case:

Once the AST is changed to allow colspan and rowspan, we could start using them immediately, before settling on a syntax (or syntaxes) for them. For example, in an md to LaTeX conversion, pandoc could consume the colspan and rowspan from the HTML table and emit multicolumn and multirow.

The reason for this suggestion is that settling on a syntax is often tricky and requires a lot of discussion (except possibly for "mmd_colspan", since it already exists). If this only requires the AST change (which, as explained, is a prerequisite for any new syntax anyway), it would be easier.
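
As a rough illustration of the md to LaTeX idea, the following Python sketch turns a row of cells carrying a column span into a LaTeX table row using `\multicolumn`. The `Cell` class and `latex_row` function are hypothetical names made up for this example; they are not part of pandoc. Row spans are left out because `\multirow` needs the extra `multirow` package and more bookkeeping.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    text: str
    colspan: int = 1


def latex_row(cells: List[Cell]) -> str:
    """Render one table row; spanning cells become \\multicolumn."""
    parts = []
    for cell in cells:
        if cell.colspan > 1:
            # Centered alignment is an arbitrary choice for this sketch.
            parts.append(r"\multicolumn{%d}{c}{%s}" % (cell.colspan, cell.text))
        else:
            parts.append(cell.text)
    # No LaTeX escaping of the cell text is attempted here.
    return " & ".join(parts) + r" \\"


print(latex_row([Cell("Area"), Cell("Subjects", 3), Cell("Controls", 3)]))
```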

@jgm
Owner

jgm commented Oct 9, 2016

An AST change to tables would require changes in both
readers and writers. You're right that we would not
necessarily need to support a native Markdown table
syntax with rowspans and colspans right away. We
could just implement HTML tables. But we'd still need
to change ALL the writers to handle rowspans and colspans,
since these would be in the basic table model. That's
already somewhat daunting (I suppose Markdown could fall
back to HTML, but RST couldn't).

@ickc
Contributor

ickc commented Oct 12, 2016

@jgm said,

But we'd still need
to change ALL the writers to handle rowspans and colspans,
since these would be in the basic table model.

I don't know much about the design of the AST. Can a new AST including column & row span be a superset of the current AST?

If so, the transition period can be made smoother, i.e. a gradual roll out of the feature rather than changing the AST and all the writers & readers at the same time:

  1. The AST can be changed first. Since it is a superset of the original, every reader/writer would still work.
  2. The column/row span extension can be activated by a feature flag on each writer and reader, documented in a matrix like this one from OpenZFS.
  3. In general, writers have higher priority than readers for implementing the feature flag (except for the markdown reader), since every existing reader already generates a valid AST.
  4. The extension is activated only when both the from-format reader and the to-format writer have the feature flag.
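
The gating rule in step 4 can be sketched as follows. This is only an illustration of the proposal: the capability table and the `spans_enabled` function are hypothetical, the entries are made up, and the replies further down explain why this does not fit pandoc's architecture (readers are independent of writers).

```python
# Hypothetical capability matrix; the True/False entries are invented
# for illustration and do not describe any pandoc release.
SPAN_SUPPORT = {
    "reader": {"html": True, "markdown": False},
    "writer": {"html": True, "latex": True, "rst": False},
}


def spans_enabled(from_format: str, to_format: str) -> bool:
    """The extension is active only if both ends declare support."""
    reader_ok = SPAN_SUPPORT["reader"].get(from_format, False)
    writer_ok = SPAN_SUPPORT["writer"].get(to_format, False)
    return reader_ok and writer_ok


print(spans_enabled("html", "latex"))
print(spans_enabled("html", "rst"))
```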

@lf-araujo

One of the reasons this feature is important is that in scientific settings, whenever one needs to compare multiple groups, some kind of subheader is needed in the table. Having that would give pandoc a firm foothold in the space of scientific writing.

@jgm
Owner

jgm commented Oct 12, 2016

+++ lf_araujo [Oct 11 16 22:34 ]:

One of the reasons this feature is important is that in scientific
settings, whenever one needs to compare multiple groups, some kind of
subheader is needed in the table. Having that would give pandoc a firm
foothold in the space of scientific writing.

Actually there are two issues here, right?

  1. column spans
  2. the ability to have multiple rows in a header

@jgm
Owner

jgm commented Oct 12, 2016

No, changing the table AST would definitely require changes
to all writers and readers, immediately. The writers would
all have to know what to do when they encounter colspans.
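
One example of the kind of decision a writer would face: a target format with no notion of spans still has to emit something. The Python sketch below shows one possible degradation strategy, expanding spans into a plain rectangular grid by padding with empty cells. It is an assumption made for illustration (the `expand_spans` name and the tuple layout are invented here), not a description of what pandoc's writers do.

```python
def expand_spans(rows):
    """Flatten spans into a rectangular grid of strings.

    rows: list of rows; each cell is a (text, rowspan, colspan) tuple.
    The text is kept in the top-left position of its span and the
    remaining covered positions are filled with empty strings.
    """
    occupied = set()  # (row, col) positions already covered by some cell
    grid = []
    for r, row in enumerate(rows):
        placed = {}
        col = 0
        for text, rowspan, colspan in row:
            # Skip positions taken by a rowspan coming from a row above.
            while (r, col) in occupied:
                col += 1
            placed[col] = text
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, col + dc))
            col += colspan
        width = max((c for (rr, c) in occupied if rr == r), default=-1) + 1
        grid.append([placed.get(c, "") for c in range(width)])
    return grid


rows = [
    [("Male (%)", 1, 1), (".99", 2, 1)],
    [("Female (%)", 1, 1)],
]
print(expand_spans(rows))
```

The flattened grid loses the information that the cell applied to both rows, which is exactly the kind of trade-off each writer would have to settle.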

@ickc
Contributor

ickc commented Oct 12, 2016

I don't understand. Can we put a switch in the reader so that, when a pandoc command is run and the output writer is known not to understand colspans, the reader does not parse colspans? It seems some kind of switch is already being used, say when an extension is turned off on the command line.

I can see that a problem might occur if the input format is AST or JSON, where the reader cannot (or at least can hardly) switch off the colspan/rowspan features. But people should know what they are doing if AST/JSON is used (and we can show them an error message).

@lf-araujo

Actually there are two issues here, right?

Yes. The ability to format subheaders would be also needed.

@jgm
Owner

jgm commented Oct 12, 2016

+++ ickc [Oct 12 16 01:15 ]:

I don't understand. Can we put a switch in the reader so that, when a
pandoc command is run and the output writer is known not to understand
colspans, the reader does not parse colspans? It seems some kind of
switch is already being used, say when an extension is turned off on
the command line.

No, readers are always independent of writers. But you're
not seeing the problem. Even if we had a switch that
depended on the output format, the readers would still need
to be rewritten because of changes in the Pandoc type
(specifically the Table constructor).

@ickc
Contributor

ickc commented Oct 12, 2016

I see. Then that is really a huge task. And I guess that since changing the AST causes compatibility issues, one doesn't want to change it often, which means the strategy is probably to make several important AST changes at once, which makes it an even bigger challenge.

And since an AST change will break backward compatibility, is it safe to say it will only be in pandoc v2.0? In that case, should a milestone be set up (even with no deadline), with those important AST changes added to it (among other things)?

@lf-araujo

lf-araujo commented Oct 13, 2016

Thanks for the attention. I will leave two models of tables that are prevalent in papers. The first should be approachable in future iterations of pandoc; the second, however, is a little trickier and may not be.

| Area                      |  Subjects       |      Controls   |
|---------------------------|-----------------|-----------------|
|                           |SD | se | p-value|SD | se | p-value|
|===========================|=================|=================|
| Standardised coefficients                                   |||
|===========================|=================|=================|
| Left fusiform area        | 1 | 2 | .05     | 3 | 4 |  .05    |
| Right insula              | 5 | 6 | .05     | 7 | 8 |  .05    |
| Left insula               | 5 | 6 | .05     | 7 | 8 |  .05    |
| Right fusiform area       | 1 | 2 | .05     | 3 | 4 |  .05    |
|===========================|=================|=================|
| Factor loadings                                             |||
|===========================|=================|=================|
| X                         | 1 | 2 | .05     | 3 | 4 |  .05    |
| Y                         | 5 | 6 | .05     | 7 | 8 |  .05    |
| Z                         | 5 | 6 | .05     | 7 | 8 |  .05    |

The equals signs were used to represent the places that usually have lines to separate the subheaders.
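
The trailing `|||` in the rows above follows the MultiMarkdown convention in which extra adjacent pipes after a cell extend that cell across more columns. As a minimal sketch of how a reader might interpret a single such row, assuming that convention and nothing else (separator rows, alignment markers, and escaped pipes are ignored, and `parse_pipe_row` is a name invented for this example):

```python
def parse_pipe_row(line):
    """Return a list of (text, colspan) pairs for one pipe-table row.

    An empty chunk between two adjacent pipes widens the previous cell
    by one column; a chunk containing only spaces is a real empty cell.
    """
    body = line.strip()
    if body.startswith("|"):
        body = body[1:]
    if body.endswith("|"):
        body = body[:-1]
    cells = []
    for chunk in body.split("|"):
        if chunk == "" and cells:
            text, span = cells[-1]
            cells[-1] = (text, span + 1)
        else:
            cells.append((chunk.strip(), 1))
    return cells


print(parse_pipe_row("| Factor loadings |||"))
print(parse_pipe_row("| X | 1 | 2 | .05 |"))
```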

The second, trickier table occurs when one wants to span two cells vertically. This is not essential; I am including it as an example of a common type of table (a table describing the population, in this case).

| **Variables**                  |  **Healthy subjects (mean)** |   **Patients (mean)**   | **p-value**  |
|--------------------------------|------------------------------|-------------------------|--------------|
| **Age**                        |       11                     |        28.51            |    .01^(U)^  |
| **Gender**                                                                                          ||||                                                                     
| Male (%)                       |        12%                   |      99%                |              |
| Female (%)                     |        13%                   |         88%             |  .99^(a)^    |
| **Time from onset (days)**     |  NA                          |        111              |              |
| **Education (mean, in years)** |  10                          |     5                   |   .11 ^(U)^  |                                                      

What typically happens is that the cell containing the value .99 is merged with the cell above it. That statistic concerns both Male and Female. I hope I am being clear.

@jgm
Owner

jgm commented Nov 7, 2016

Pandoc currently allows at most one header row, which must be at the top. A rule is inserted below it in default LaTeX output.

One could try to separate conceptually between being a header cell and having a rule under, so that a cell could have one of these properties without the other. Perhaps the idea in @lf-araujo's example is that a rule of hyphens --- divides the header from the rest, while a rule of equals === indicates a rule? But do the hyphens also cause a rule to be rendered?

@welly

welly commented Sep 25, 2019

I've not read this thread entirely but I'm currently using pandoc (2.7.2) with wkhtmltopdf and am finding that the same issue is occurring. I wondered if anyone can explain why this won't work when using wkhtmltopdf to generate the html -> pdf conversion?

Thanks very much!

@ickc
Contributor

ickc commented Sep 25, 2019

@welly, questions like this might fit pandoc-discuss better. The issue tracker is for discussing the feature request. Currently the pandoc AST doesn't have a model for this, so it isn't supported in pandoc yet.

@rickywu

rickywu commented Jan 3, 2020

Really crying out for this feature.
Or are there any other tools to convert pandoc's gfm output to this format:

|  | 2 | 3 |
| --- | --- | --- |
| a | @cols=2: |
| b |  | test<br>ricky |
| c | @rows=2: | <br>Yes |
| e | No |

@petterreinholdtsen

I went looking for multi-span rows and columns in pandoc, and ended up here. My interest is not with the science community, but in the writing of government standards. We are currently experimenting with writing the standard texts in markdown and rst, and using pandoc to convert them to PDF (via docbook). But the lack of rowspan support in the tables makes it impossible to represent the layout of the original docx. It would thus be great if pandoc supported spans. Sorry for not having any code to contribute, but I thought it would be useful to know about yet another use case.

@glenpike

I also found this after a search about formatting / laying out tables.
We're looking at CMS options, and being able to generate the following table (as an example) would be useful, as we require some of our tables to be pivoted 90 degrees for simple stuff:

<table>
  <caption>Dates and amounts</caption>
  <thead>
    <tr>
      <th scope="col">Date</th>
      <th scope="col">Amount</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">First 6 weeks</th>
      <td>£109.80 per week</td>
    </tr>
    <tr>
      <th scope="row">Next 33 weeks</th>
      <td>£109.80 per week</td>
    </tr>
    <tr>
      <th scope="row">Total estimated pay</th>
      <td>£4,282.20</td>
    </tr>
  </tbody>
</table>

The markup is based on the UK Gov's Design System: https://design-system.service.gov.uk/components/table/ - but we have a use case for these types of tables.

@ivan4th

ivan4th commented Mar 9, 2020

A strong use case for this is some 3GPP specs that contain bit fields, e.g. https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111
Would love to be able to view them as markdown or org in my Emacs, but all of the bitfields are off ;(

@jgm
Owner

jgm commented Apr 3, 2020

For those who have been following this issue, we have a PR for new table types here:
jgm/pandoc-types#66
I want to avoid excessive bikeshedding on this issue, but if you have final comments, now is the time. The type allows for column and row spans, short captions, attributes, multiple header rows, footers, intermediate headers, and overriding alignments at the cell level.
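
For readers who want a feel for what such a type has to carry, here is a small Python sketch covering the features listed above: spans and an alignment override on each cell, attributes, a short caption, several header rows, intermediate headers inside a body, and a footer. The class and field names are illustrative assumptions and not the pandoc-types API; the PR linked above is the authoritative definition. The sample data reuses the group-comparison layout posted earlier in this thread, with a made-up caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cell:
    text: str
    row_span: int = 1
    col_span: int = 1
    alignment: Optional[str] = None  # None: inherit the column's alignment
    attr: Dict[str, str] = field(default_factory=dict)


@dataclass
class Row:
    cells: List[Cell]


@dataclass
class TableBody:
    rows: List[Row]
    intermediate_head: List[Row] = field(default_factory=list)


@dataclass
class Table:
    caption: str
    col_alignments: List[str]
    head: List[Row]                  # may hold several header rows
    bodies: List[TableBody]
    foot: List[Row] = field(default_factory=list)
    short_caption: Optional[str] = None
    attr: Dict[str, str] = field(default_factory=dict)


def row_width(row: Row) -> int:
    """Number of columns a row covers, counting column spans."""
    return sum(cell.col_span for cell in row.cells)


head = [
    Row([Cell("Area"), Cell("Subjects", col_span=3), Cell("Controls", col_span=3)]),
    Row([Cell("")] + [Cell(t) for t in ["SD", "se", "p-value"] * 2]),
]
body = TableBody(
    intermediate_head=[Row([Cell("Standardised coefficients", col_span=7)])],
    rows=[Row([Cell(t) for t in ["Left fusiform area", "1", "2", ".05", "3", "4", ".05"]])],
)
table = Table(
    caption="Group comparison",
    col_alignments=["left"] + ["center"] * 6,
    head=head,
    bodies=[body],
    short_caption="Groups",
)
print([row_width(r) for r in table.head])
```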

@bpj

bpj commented Apr 3, 2020

Is the Markdown syntax for these new features described in prose/with examples somewhere?

@jgm
Owner

jgm commented Apr 3, 2020

There's no markdown syntax. That's a separate issue, and it may be quite a while before that changes. You shouldn't even assume that we will provide a markdown syntax capable of representing all these distinctions. For now the focus is just on providing types capable of representing more complex tables, which can certainly be represented in other formats.

@lrosenthol
Sponsor Contributor

@jgm is there a timeframe for a binary release with these great improvements?? and is there a good way to contribute for input/output format handling?

@jgm
Owner

jgm commented Apr 30, 2020

This can be closed now: we have these features in the AST.
We don't yet have support for them in readers and writers, though.
I've opened some issues for adding these to some of the most commonly used formats (you'll find them among the most recent issues).

@jgm jgm closed this as completed Apr 30, 2020
@ickc
Contributor

ickc commented May 1, 2020

Will documentation on this AST be available? And are there any pointers on filter frameworks picking this up (such as pandocfilters/panflute)?

Thanks @despresc for the effort of implementing it and @jgm, @mb21 for the code reviews. This is great progress!

@jgm
Owner

jgm commented May 1, 2020

There will be regular API docs once the new pandoc-types is released.

@reagle

reagle commented May 15, 2020

I've opened some issues for adding these to some of the most commonly used formats (you'll find them among the most recent issues).
