Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vignette render with markdown rather than rmarkdown #5773

Merged
merged 2 commits into from
Dec 2, 2023
Merged

Conversation

jangorecki
Copy link
Member

@jangorecki jangorecki commented Nov 29, 2023

This PR is addressing #5745, the most beneficial part of it, replacing rmarkdown with markdown for generating vignettes.

Why it is important?

  • reduced installation time:
    installation time takes 35 seconds vs 12 minutes
  • reduced risk of dependency breaking and debugging of that:
    installing 7 packages vs 31 packages
  • reduced OS requirements to build data.table package
    no need C++ compiler anymore, build environment can be more lightweight

Note that we don't want to cache installed packages because we risk breaking changes can sneak unnoticed.
Reduced installation time could be mitigated by installing linux binaries, 1. if available for debian, and 2. if built in publicly auditable workflows. Considering 30s installation IMO it is not necessary.

Change affects how vignettes are being rendered. Personally I find markdown style nicer, as font is slightly bigger and the text is more wide filling up empty spaces on left and right side. What I also like is that plots are much more readable using markdown. rmarkdown on the other hand has frame around TOC. Maybe that could tuned if necessary? @yihui
more info on differences can be found here: https://yihui.org/en/2023/10/markdown-complete/

So saving 12 minutes on each test job means saving a lot of CI compute minutes. We have 8 test jobs currently, and likely we will have more. There is also build job which needs to install those. So savings of CI compute minutes are more than 100 min on a single pipeline. Aside from time we can also use lighter image for build (no need for C++ toolchain).

Below is the rendered vignettes with rmarkdown and markdown, so anyone can have a look. Especially author of each vignettes is invited to have a look at its own.

title rmarkdown markdown
benchmarking rmarkdown markdown
faq rmarkdown markdown
importing rmarkdown markdown
intro rmarkdown markdown
keys-fast-subset rmarkdown markdown
programming rmarkdown markdown
reference-semantics rmarkdown markdown
reshape rmarkdown markdown
sd-usage rmarkdown markdown
secondary-indices-and-auto-indexing rmarkdown markdown

Copy link

codecov bot commented Nov 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (ba2f26b) 97.46% compared to head (222e5d5) 97.46%.
Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5773   +/-   ##
=======================================
  Coverage   97.46%   97.46%           
=======================================
  Files          80       80           
  Lines       14814    14814           
=======================================
  Hits        14439    14439           
  Misses        375      375           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! This PR should close rstudio/markdown#108.

Yet another benefit is that the package size will be reduced:

markdown rmarkdown ratio
datatable-benchmarking.html 13328 24942 53%
datatable-faq.html 92003 131380 70%
datatable-importing.html 22427 34611 65%
datatable-intro.html 57219 104933 55%
datatable-keys-fast-subset.html 42073 77847 54%
datatable-programming.html 30604 70292 44%
datatable-reference-semantics.html 25113 45978 55%
datatable-reshape.html 25142 46981 54%
datatable-sd-usage.html 127466 112031 114%
datatable-secondary-indices-and-auto-indexing.html 23564 44123 53%
total 458939 693118 66%

Comment on lines +5 to +8
markdown::html_format:
options:
toc: true
number_sections: true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to style the TOC similarly to rmarkdown, you can provide the styles in an extra.css file, e.g.,

#TOC {
  border: 1px solid #ccc;
  border-radius: 5px;
  padding-left: 1em;
  background: #f6f6f6;
}

Then include this extra.css via:

Suggested change
markdown::html_format:
options:
toc: true
number_sections: true
markdown::html_format:
options:
toc: true
number_sections: true
meta:
css: [default, extra.css]

@tdhock
Copy link
Member

tdhock commented Nov 29, 2023

there seems to be an issue with the reshape vignette, but easy to fix (just remove those bs-callout etc formatting codes)
image

@MichaelChirico
Copy link
Member

MichaelChirico commented Nov 29, 2023

An impressively small diff for such an impact, nice! LGTM.

For sd-usage, the plots look much better with {markdown}. That probably explains why the vignette size increased -- probably we could fine-tune the plot to look good but be a bit smaller.

@jangorecki jangorecki added this to the 1.15.0 milestone Dec 2, 2023
@jangorecki
Copy link
Member Author

Updates to vignettes will follow up in a new PR to keep this change simple

@jangorecki jangorecki merged commit 860b22e into master Dec 2, 2023
4 checks passed
@jangorecki jangorecki deleted the markdown branch December 2, 2023 20:16
jangorecki added a commit that referenced this pull request Dec 2, 2023
jangorecki added a commit that referenced this pull request Dec 3, 2023
…5784)

* do not use options max.print in vignettes, closes #5783

* follow up of markdown vign enginge #5773

* amend feedback from Michael
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants