DataFrame could have a to_markdown method. #11052

Closed
petebachant opened this issue Sep 10, 2015 · 17 comments · Fixed by #30350
Labels
IO HTML read_html, to_html, Styler.apply, Styler.applymap Output-Formatting __repr__ of pandas objects, to_string
Milestone

Comments

@petebachant

Similar to to_latex and to_html.

@TomAugspurger
Contributor

Is there a widely agreed upon format for markdown tables? It's not in Gruber's original version, and IIRC CommonMark even punted on them.

@petebachant
Author

Good point. I'm not sure. Maybe the method could specify a flavor, e.g., GitHub or Pandoc?

@TomAugspurger
Contributor

I suspect that anything too complicated won't find much support here (maintaining stuff is no fun). Especially now that we have pipe. df.pipe(to_markdown) isn't much worse than df.to_markdown.

Is your use case here to present / read the markdown, or to convert it to something else? Since markdown accepts raw HTML, you may be able to get away with to_html before converting.
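
For illustration, a minimal sketch of what the df.pipe(to_markdown) idea above could look like. The names `rows_to_markdown` and `to_markdown` are hypothetical (neither is a pandas API), and the sketch assumes a GitHub-style pipe table with a single-level index and columns. The core formatter works on plain lists, so it can be tried without pandas installed.

```python
def rows_to_markdown(headers, rows):
    """Format headers and rows as a GitHub-style pipe table (hypothetical helper)."""
    def line(cells):
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    out = [line(headers), line(["---"] * len(headers))]
    out.extend(line(row) for row in rows)
    return "\n".join(out)


def to_markdown(df):
    """Adapter meant for df.pipe(to_markdown); assumes single-level index and columns."""
    headers = [""] + [str(c) for c in df.columns]
    # itertuples() yields the index value first, then the row values.
    return rows_to_markdown(headers, df.itertuples())


print(rows_to_markdown(["", "speed", "Re_tip"], [(0, "4.0e-01", "5.0e+04")]))
```

With a real DataFrame, the call would then be `df.pipe(to_markdown)`. MultiIndex headers and number formatting are deliberately left out of this sketch.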

@petebachant
Author

My specific use case would be to create a DataFrame and copy/paste it into a GitHub-flavored Markdown document (or GitHub issues, PRs, wikis, etc.), but still be able to read/edit it later without the HTML mess.

As an example, below is a table I copied from a Jupyter Qt console. First I printed the DataFrame to the terminal, copied/pasted here, then entered the dashes and pipes manually for the GH Markdown. Then I generated HTML with the to_html method, which doesn't render quite right here.

GitHub Markdown

Source

  | speed   | Re_tip  | Re_root | Re_ave  | Re_D
--|---------|---------|---------|---------|--------
0 | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05
1 | 6.0e-01 | 7.4e+04 | 1.2e+05 | 9.9e+04 | 6.4e+05
2 | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05
3 | 1.0e+00 | 1.2e+05 | 2.1e+05 | 1.7e+05 | 1.1e+06
4 | 1.2e+00 | 1.5e+05 | 2.5e+05 | 2.0e+05 | 1.3e+06

Results

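|    | speed   | Re_tip  | Re_root | Re_ave  | Re_D    |
|----|---------|---------|---------|---------|---------|
| 0  | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05 |
| 1  | 6.0e-01 | 7.4e+04 | 1.2e+05 | 9.9e+04 | 6.4e+05 |
| 2  | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05 |
| 3  | 1.0e+00 | 1.2e+05 | 2.1e+05 | 1.7e+05 | 1.1e+06 |
| 4  | 1.2e+00 | 1.5e+05 | 2.5e+05 | 2.0e+05 | 1.3e+06 |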

Pandas HTML

Source

<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>speed</th>\n      <th>Re_tip</th>\n      <th>Re_root</th>\n      <th>Re_ave</th>\n      <th>Re_D</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>4.0e-01</td>\n      <td>5.0e+04</td>\n      <td>8.3e+04</td>\n      <td>6.6e+04</td>\n      <td>4.3e+05</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>6.0e-01</td>\n      <td>7.4e+04</td>\n      <td>1.2e+05</td>\n      <td>9.9e+04</td>\n      <td>6.4e+05</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>8.0e-01</td>\n      <td>9.9e+04</td>\n      <td>1.7e+05</td>\n      <td>1.3e+05</td>\n      <td>8.6e+05</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1.0e+00</td>\n      <td>1.2e+05</td>\n      <td>2.1e+05</td>\n      <td>1.7e+05</td>\n      <td>1.1e+06</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1.2e+00</td>\n      <td>1.5e+05</td>\n      <td>2.5e+05</td>\n      <td>2.0e+05</td>\n      <td>1.3e+06</td>\n    </tr>\n  </tbody>\n</table>

Results

\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
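|    | speed   | Re_tip  | Re_root | Re_ave  | Re_D    |
|----|---------|---------|---------|---------|---------|
| 0  | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05 |
| 1  | 6.0e-01 | 7.4e+04 | 1.2e+05 | 9.9e+04 | 6.4e+05 |
| 2  | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05 |
| 3  | 1.0e+00 | 1.2e+05 | 2.1e+05 | 1.7e+05 | 1.1e+06 |
| 4  | 1.2e+00 | 1.5e+05 | 2.5e+05 | 2.0e+05 | 1.3e+06 |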

@hayd
Contributor

hayd commented Sep 11, 2015

One issue is what should the row headers be for MI columns/index. It seems that GH registers only the first (unless there is some syntactical trick).

For most flavors you can include html.

@jankatins
Contributor

As the new styler uses jinja (#10250), this shouldn't be too hard? I would also love this feature for knitpy, which is a markdown-based format that is converted into all kinds of formats (docx...pdf...html).

Up to now (and as workarounds...), I recommended tabulate to convert a DataFrame to markdown. There is also pandoc (e.g. via pypandoc), which can take the output of df.to_html() and convert that to markdown.

@jreback
Contributor

jreback commented Nov 11, 2015

@JanSchulz yes, I could see a .to_markdown() method in Styler (as a better API than putting it directly in DataFrame, though we could have that as well).

@jreback jreback added Output-Formatting __repr__ of pandas objects, to_string IO HTML read_html, to_html, Styler.apply, Styler.applymap labels Nov 11, 2015
@jreback jreback added this to the Someday milestone Nov 11, 2015
@jankatins
Contributor

IMO, on .styler it doesn't make sense; markdown unfortunately does not provide styles :-( I only mentioned .styler as it already (soft) requires jinja, and that should make it easy to build a markdown representation...

IMO it should be DataFrame.to_markdown() and DataFrame._repr_markdown_()
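
To illustrate the `_repr_markdown_` part of this suggestion: IPython/Jupyter rich display looks for a `_repr_markdown_` method on an object and renders the string it returns as Markdown. Below is a minimal sketch using a hypothetical `MarkdownTable` wrapper class (not part of pandas) that holds plain headers and rows rather than a real DataFrame.

```python
class MarkdownTable:
    """Hypothetical stand-in for a DataFrame; not part of pandas."""

    def __init__(self, headers, rows):
        self.headers = list(headers)
        self.rows = [list(row) for row in rows]

    def to_markdown(self):
        """Return the table as a GitHub-style pipe table."""
        def line(cells):
            return "| " + " | ".join(str(c) for c in cells) + " |"

        lines = [line(self.headers), line(["---"] * len(self.headers))]
        lines.extend(line(row) for row in self.rows)
        return "\n".join(lines)

    def _repr_markdown_(self):
        # Rich-display hook: front ends that support Markdown call this
        # method and render the returned string.
        return self.to_markdown()


table = MarkdownTable(["speed", "Re_tip"], [["4.0e-01", "5.0e+04"], ["6.0e-01", "7.4e+04"]])
print(table._repr_markdown_())
```

In a notebook, evaluating `table` as the last expression of a cell would render the table; in a plain terminal only the normal `repr` is used.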

@ghost

ghost commented Nov 27, 2015

I'd love DataFrame.to_markdown() too. I was surprised when it didn't work already.

jreback pushed a commit that referenced this issue Jan 14, 2016
This is a straightforward port of GH#10070 to 1d arrays.
@pstjohn

pstjohn commented Oct 11, 2017

Would pandas be open to adding a dependency? tabulate does exactly this and is pip installable.

from tabulate import tabulate
print(tabulate(df, headers='keys', tablefmt='pipe'))
|    |   test1 | test2   |   test3 |   test4 |    test5 |
|---:|--------:|:--------|--------:|--------:|---------:|
|  0 |     385 | apple   |     288 |     745 |  64.9352 |
|  1 |     627 | banana  |       3 |     792 | 226.955  |
|  2 |     486 | pear    |     446 |     503 | 110.454  |
|  3 |     368 | orange  |     887 |     808 | 297.62   |
|  4 |     550 | grape   |     235 |      96 | 240.324  |
|  5 |     749 | peach   |      22 |     598 | 240.642  |
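|    | test1 | test2  | test3 | test4 | test5   |
|----|-------|--------|-------|-------|---------|
| 0  | 385   | apple  | 288   | 745   | 64.9352 |
| 1  | 627   | banana | 3     | 792   | 226.955 |
| 2  | 486   | pear   | 446   | 503   | 110.454 |
| 3  | 368   | orange | 887   | 808   | 297.62  |
| 4  | 550   | grape  | 235   | 96    | 240.324 |
| 5  | 749   | peach  | 22    | 598   | 240.642 |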

@TomAugspurger
Contributor

TomAugspurger commented Oct 11, 2017 via email

@jreback
Contributor

jreback commented Oct 11, 2017

i think adding a to_markdown method which calls tabulate as a dependency would be ok

@aflaxman
Contributor

aflaxman commented Nov 7, 2017

print(tabulate(df, headers='keys', tablefmt='pipe')) is not as pretty as pandas for multi-index, would be cool if whatever pandas lands on is.

@kangwonlee

@MarcoGorelli
Member

MarcoGorelli commented Dec 12, 2019

I'm +0 to adding it as an optional dependency, and adding a to_markdown method. In the meantime, df.pipe(tabulate, headers='keys', tablefmt='pipe') is a workaround. I suppose it would be nice to avoid typing those keyword arguments every time :)

@TomAugspurger Is this all the method would have to do? If so, I'd be happy to work on a PR

@jreback
Contributor

jreback commented Dec 12, 2019

i think we would likely accept a PR for something like the above

@MarcoGorelli
Member

Sure, I've submitted a simple PR.

What should this method return if there's a wide DataFrame? (or is this an enhancement that would get taken care of at a later stage?)

@jreback jreback modified the milestones: Someday, 1.0 Dec 26, 2019
9 participants