Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shell-escape support #708

Merged
merged 32 commits into from
Jun 3, 2021
Merged

Conversation

ralismark
Copy link
Contributor

This is a simple implementation of shell escape that runs the command in a shell. Resolves #38.

I haven't looked closely at how XeTeX handles shell escapes (e.g. how it handles quotes), so there may be differences with how some commands are interpreted. I've only tested this with a few basic commands and they seems to work.

@codecov
Copy link

codecov bot commented Jan 1, 2021

Codecov Report

Merging #708 (c3cc403) into master (1ce1e15) will increase coverage by 0.05%.
The diff coverage is 74.15%.

❗ Current head c3cc403 differs from pull request most recent head 7cf799f. Consider uploading reports for the commit 7cf799f to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #708      +/-   ##
==========================================
+ Coverage   46.79%   46.84%   +0.05%     
==========================================
  Files         142      141       -1     
  Lines       58874    58943      +69     
==========================================
+ Hits        27551    27614      +63     
- Misses      31323    31329       +6     
Impacted Files Coverage Δ
src/bin/tectonic/compile.rs 66.35% <0.00%> (-0.32%) ⬇️
src/engines/spx2html.rs 0.00% <0.00%> (ø)
src/io/mod.rs 48.57% <ø> (ø)
src/document.rs 69.06% <36.36%> (-0.83%) ⬇️
src/driver.rs 68.40% <75.54%> (+0.27%) ⬆️
crates/bridge_core/src/lib.rs 64.15% <78.57%> (+0.24%) ⬆️
crates/bridge_core/support/support.c 73.84% <100.00%> (+0.40%) ⬆️
crates/engine_xdvipdfmx/xdvipdfmx/dvipdfmx.c 57.60% <100.00%> (ø)
crates/engine_xetex/xetex/xetex-shipout.c 63.72% <100.00%> (+1.26%) ⬆️
crates/engine_xetex/xetex/xetex-xetex0.c 79.02% <100.00%> (+0.01%) ⬆️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ce1e15...7cf799f. Read the comment docs.

@pkgw
Copy link
Collaborator

pkgw commented Jan 2, 2021

Thank you! I think we can get going with this straightforward implementation.

Would you be able to try adding a few test cases to the test suite?

The other thing that I would like to do is to integrate this feature into the new Tectonic.toml framework that I am prototyping in the "V2" CLI. If you'd like to look into implementing that, I can give some pointers, but that can also be a separate PR.

@LiHRaM
Copy link

LiHRaM commented Jan 4, 2021

I just tried this with minted, using the minimal example from the docs:

Example
\documentclass{article}

\usepackage{minted}

\begin{document}
\begin{minted}{c}
int main() {
    printf("hello, world!");
    return 0;
}
\end{minted}
\end{document}

And the following command: tectonic -Z shell-escape doc.tex

And got the following output:

Output
note: this is a BETA release; ask questions and report bugs at https://tectonic.newton.cx/
Running TeX ...
warning: doc.tex:4: 
runsystem(mkdir -p _minted-doc)...
error: doc.tex:4: Package minted Error: You must invoke LaTeX with the -shell-escape flag.

See the minted package documentation for explanation.
Type  H <return>  for immediate help
error: something bad happened inside TeX; its output follows:

===============================================================================
(doc.tex
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-03-06> (article.cls
Document Class: article 2019/12/20 v1.4l Standard LaTeX document class
(size10.clo)) (minted.sty (keyval.sty) (kvoptions.sty (ltxcmds.sty)
(kvsetkeys.sty)) (fvextra.sty (ifthen.sty) (etoolbox.sty) (fancyvrb.sty)
(upquote.sty (textcomp.sty)) (lineno.sty)) (calc.sty) (shellesc.sty

Package shellesc Warning: Shell escape disabled on input line 73.

) (ifplatform.sty (pdftexcmds.sty (infwarerr.sty) (iftex.sty)) (catchfile.sty
(etexcmds.sty)) (ifluatex.sty)

Package ifplatform Warning: 
    shell escape is disabled, so I can only detect \ifwindows.

) (xstring.sty (xstring.tex)) (framed.sty) (float.sty))

! Package minted Error: You must invoke LaTeX with the -shell-escape flag.

See the minted package documentation for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.4 
    
No pages of output.
Transcript written on doc.log.
===============================================================================
error: the TeX engine had an unrecoverable error
caused by: halted on potentially-recoverable error as specified

No idea why, but figured you would like to know. Thanks for all the hard work!

@pkgw
Copy link
Collaborator

pkgw commented Jan 5, 2021

@LiHRaM Thanks. In shellesc.sty, it looks like \shellescape is not evaluating to the correct integer value — @ralismark, any thoughts?

As for ifplatform.sty, it seems that has an overly restrictive set of tests for checking whether shell-escape is allowed:

\ifnum\pdf@shellescape=1\relax
  \shellescapetrue
\else
  \ifluatex\else
    \PackageWarningNoLine{ifplatform}{^^J \space\space\space
      shell escape is disabled,
      so I can only detect \@backslashchar ifwindows%
    }
  \fi
\fi

By my read, this should fail on stock XeTeX as well. But the test in shellesc.sty is probably the more important one.

@ralismark
Copy link
Contributor Author

ralismark commented Jan 6, 2021

I've added some changes to make \shellescape and \iseof18 have the right values - both seem to be used in checking if shell escape is enabled. With these changes, commands are run for the minted example, but you encounter an error.

Output
note: this is a BETA release; ask questions and report bugs at https://tectonic.newton.cx/
Running TeX ...
warning: lineno.sty:296: Invalid UTF-8 byte or sequence at line 296 replaced by U+FFFD.
warning: ifplatform.sty:91:
runsystem(uname -s > "doc.w18")...
warning: ifplatform.sty:93:
runsystem(rm -- "doc.w18")...
warning: doc.tex:4:
runsystem(mkdir -p _minted-doc)...
/usr/bin/pygmentize
warning: doc.tex:5:
runsystem(which pygmentize && touch doc.aex)...
warning: doc.tex:5: runsystem(rm doc.aex)...
Error: cannot read infile: [Errno 2] No such file or directory: 'doc.pyg'
warning: command finished with exit code 1
warning: doc.tex:11:
runsystem(pygmentize -l c -f latex -P commandprefix=PYG -F tokenmerge -o _minte
d-doc/E1668299E92BCACB92F5B58DDFF5261109195D610AAA07416F31E185CD5C85A3.pygtex d
oc.pyg)...
error: doc.tex:11: Package minted Error: Missing Pygments output; \inputminted was
probably given a file that does not exist--otherwise, you may need
the outputdir package option, or may be using an incompatible build tool,
or may be using frozencache with a missing file.

See the minted package documentation for explanation.
Type  H <return>  for immediate help
error: something bad happened inside TeX; its output follows:

===============================================================================
(doc.tex
LaTeX2e <2018-04-01> patch level 4
Babel <3.20> and hyphenation patterns for 84 language(s) loaded.
(article.cls
Document Class: article 2014/09/29 v1.4h Standard LaTeX document class
(size10.clo)) (minted.sty (keyval.sty) (kvoptions.sty (ltxcmds.sty)
(kvsetkeys.sty (infwarerr.sty) (etexcmds.sty (ifluatex.sty)))) (fvextra.sty
(ifthen.sty) (etoolbox.sty) (fancyvrb.sty
Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix
<2008/02/07> (tvz)) (upquote.sty (textcomp.sty (ts1enc.def))) (lineno.sty))
(calc.sty) (shellesc.sty) (ifplatform.sty (pdftexcmds.sty (ifpdf.sty))
(catchfile.sty) (doc.w18)) (xstring.sty (xstring.tex)) (framed.sty) (float.sty)
) (xcolor.sty (color.cfg) (xetex.def))
No file doc.aux.
(ts1cmr.fd) (_minted-doc/default-pyg-prefix.pygstyle)
(_minted-doc/default.pygstyle)

! Package minted Error: Missing Pygments output; \inputminted was
probably given a file that does not exist--otherwise, you may need
the outputdir package option, or may be using an incompatible build tool,
or may be using frozencache with a missing file.

See the minted package documentation for explanation.
Type  H <return>  for immediate help.
 ...

l.11 \end{minted}

No pages of output.
Transcript written on doc.log.
===============================================================================
error: the TeX engine had an unrecoverable error
caused by: halted on potentially-recoverable error as specified

It seems like doc.pyg is not being written to disk. I'm suspecting this is because tectonic doesn't actually write tempfiles to disk - you can test this with this:

tempfile.tex
\documentclass{article}

\newwrite\tempfile

\begin{document}

\immediate\openout\tempfile=lists.tex
\immediate\write\tempfile{this is interesting}
\immediate\write\tempfile{}
\immediate\write\tempfile{this too}
\immediate\closeout\tempfile

\input{lists}

\end{document}

XeTeX creates lists.tex while tectonic doesn't, even though both produce the same output pdf.

@LiHRaM
Copy link

LiHRaM commented Jan 7, 2021

That's a good observation. I just tested with the --keep-intermediates flag, and I can get the tempfiles example to work, but minted still fails.

@pkgw
Copy link
Collaborator

pkgw commented Jan 7, 2021

Hmmm, yeah. I think we'll have to add special hooks to write tempfiles to disk before any shell-escape commands run, because the intermediates are only written at the end of a compilation run, not in the middle when shell-escape tools need them.

@pkgw
Copy link
Collaborator

pkgw commented Jan 9, 2021

@ralismark FYI I'm about to merge #716 and follow up with more work to rearrange the Tectonic codebase into a more modular structure. As it stands I don't think it will break your PR, but there may be some changes that will require you to rebase your work. Apologies for the inconvenience, but I think I have a window to make a lot of progress on this front and I want to push ahead as rapidly as I can while that's the case.

@BelKed
Copy link

BelKed commented Feb 21, 2021

Is there any chance to be merged? I really need shell-escape support for minted package...

@breuerfelix
Copy link

breuerfelix commented Feb 22, 2021

Already posted in the Issue but for everybody with some deadlines, here you go:

Since i cannot wait until the PR is merged (got some deadlines over here) and tectonic is by far the smoothest LaTeX compiler, i created a small helper script to use minted with tectonic with a tutorial until it is built in.

https://breuer.dev/tutorial/tectonic-minted

Just comment on the post if something is not working properly and i will try to fix it :)

@ralismark
Copy link
Contributor Author

ralismark commented Apr 11, 2021

Sorry about the massive time gap, I've rebased this branch to master now. I've added a write_to_disk function to IoProvider to flush tempfiles. I'm not doing any special checks in MemoryIo's implementation, so all tempfiles are getting written.

I've tested this with the minted example above and it looks like it works.

Edit: It also seems like I can't see the codecov details? https://app.codecov.io/gh/tectonic-typesetting/tectonic/compare/708 is giving me

Unauthorized
Github API: Forbidden Check on Codecov's status or see our docs for common support.
Error 403

@pkgw
Copy link
Collaborator

pkgw commented Apr 13, 2021

Thank you @ralismark for your patience and putting up with the hassles of trying to hit a moving target with all of my API changes! I really appreciate it, and I'm sure that the number of people that have been asking for this feature for a long time do as well.

The current build failure is a small Clippy complaint that's easy to fix; for reference, you can run the Clippy check with:

cargo clippy --all --all-targets --all-features -- --deny warnings

I took a look at adding support for this feature into the "V2" CLI and discovered an issue that I think we'll need to address. Right now, the write_to_disk function flushes out the temporary files in the current directory, and launches its programs in the current directory as well. This is "unhygienic" on the filesystem, and while I think that's a significant issue the really big problem is that it leads to build errors if the filesystem I/O root is not the current directory and the TeX code wants to then read a file that was created by a \write18 command. In a basic one-shot compilation everything will be OK, but for the V2 interface or cases where the input document is not in the current directory, this issue will arise.

On the implementation side, I'm a little uncomfortable with the write_to_disk implementation having most of its "meat" in the MemoryIo layer. That impl shouldn't know anything about filesystems or paths. I feel like that kind of code belongs in the ProcessingSession type, where the write_files method already is. Of course the implementation challenge is that the core I/O bridge interfaces don't "know about" that type.

Unfortunately I don't see a super-easy solution. In the V2 model we shouldn't be modifying in the src/ directory (really, we should be able to run a build even if that directory is read-only), so the intermediate files will have to be written into the output directory, or potentially some kind of temporary directory. Which means that some kind of infrastructure will have to be added so that the types exposed to the core bridge layer "know" what that directory is. The least-bad option that I'm thinking of right now is to start storing the output_path information to the IoEvents type, and move the write_to_disk functionality there rather than attaching it to IoProvider. And there would need to be some API for the runsystem code to know which directory the system code should be run from, as well. Finally, the V2 mode would need to add a filesystem I/O layer to read files from the final output directory as well as the source directory. That all feels super dirty, but I think it would at least tackle filesystem hygiene and enable the TeX engine to read the files created by any \write18 commands.

I apologize if this is a bit unclear — it's late. Maybe I'll get some inspiration in the morning.

@ralismark
Copy link
Contributor Author

Yeah passing the necessary things to the core bridge seems like a big hassle. I agree with having \write18 and temporary files be put in the output directory, which means that we'd need to somehow pass output_dir to CoreBridgeState. What about adding a struct to CoreBridgeState/CoreBridgeLauncher that contains variables like this? That would also make it easier to pass other information to the bridge in the future. Another way I thought of would be to pass the output dir to runsystem from inside the tex engine when you call it, but I'm not sure if it's even available from the C side.

For write_to_disk, putting it in IoEvents seems off to me, since it looks like that trait only listens to events, not actually fulfilling the IO requests. Additionally, that trait doesn't have access to file contents either -- it's just the IoProvider that does. I guess a third option would be to add a function to OutputHandle (well, requiring an additional function on inner) to write the file to a given path, but this isn't that nice either.

Also, could there be problems with commands that normally try to read from files in the same directory as the input file?

As an additional note, regular XeTeX doesn't work for either of the examples in the above comments if you set an output directory -- it complains about trying to read from missing files -- but that should resolved by adding an extra filesystem layer.

@ralismark
Copy link
Contributor Author

ralismark commented Apr 23, 2021

I thought about this further, and requiring the working dir to be passed to runsystem might be better. Even though xetex currently doesn't know about the output dir, we can pass it in via tt_xetex_set_string_variable, and change TexEngine::shell_escape to also take the working directory. This allows us to only change code relevant to shell_escape, and not CoreBridgeState/CoreBridgeLauncher or other engines. Another slight benefit of this is that we can have the output dir and the shell-escape working dir be separate, if we decide to support that later on.

@pkgw
Copy link
Collaborator

pkgw commented May 23, 2021

I just wanted to check in and say that I haven't forgotten about this — I had a major work deadline that took up all of my time until last week. Part of that effort happened to involve using Tectonic in ways very relevant to this PR so it's definitely been on my mind! There are still some aspects of the design here that I want to think about carefully, since these details are going to matter a lot for the overall coherence of the program design.

pkgw added 16 commits May 25, 2021 18:37
Instead, use the new IoSetup implementation of IoProvider. This helps us
avoid some additional borrows that limit our flexibility. We do this
with some aggressive use of local macros to de-boilerplate the
implementation.
BREAKING: replace the IoEventBackend trait with a DriverHooks trait that
is similar but has some important differences. In particular,
DriverHooks is a single reference that provides combined access to the
functionality that was previously split between the IoProvider and the
IoEventBackend. This unification allows operations that previously
wouldn't have been possible because they'd have required simultaneous
access to both of these.

Philosophically, the DriverHooks trait better captures its purpose:
gathering all of the callbacks needed to help the C/C++ engines
interface with the outside world. IoProvider functionality is *most*,
but not all of this.

Furthermore, in the DriverHooks paradigm, a lot of the events in
IoEventBackend become superfluous. The only ones that can't be captured
with a custom IoProvider implementation are the notifications for when
input or output handles are closed.
Port the main crate to use the new DriverHooks interface. The main
difference here is that IoSetup and IoEvents are merged into a single
BridgeState struct, which becomes a "wholly owned subsidiary" of the
ProcessingSession type.

The core insight here is that what we need to do is collect all of the
session functionality that has to be accessed by the C/C++ engines --
the DriverHooks implementation, essentially -- in one struct, which is a
subset of the overall ProcessingSession state. When a C/C++ engine is
running, it maintains a mutable reference to that *subset* of state, but
it makes our life a lot easier if we can isolate that bit of the state
from the overall ProcessingSession struct. Now, this subset must include
the ProcessingSession's whole I/O stack, so it's a non-trivial subset,
but overall the boundaries of the DriverHooks trait help us isolate what
has to go in that subset and what can stay separate.
BREAKING: Here we change shell-escape to be implemented by invoking a
new driver hook. This addition on its own need not be a breaking change,
and indeed the DriverHooks trait is designed to allow this to be the
case. However, since we've already accumulated some breaks, we go ahead
and breaks the related C/C++ API and also the unrelated API of the
file-closed event APIs, which really ought to take the status backend as
a parameter in the same way that virtually all of the other I/O APIs do.
This removes the new API because it will no longer be needed in the
updated implementation of shell-escape. This would be a breaking change,
but the new API has never been released.
This goes back to the older boolean shell-escape API, which is all we
need with the iterated architecture. We also track the C/C++ API change
that was just made in bridge_core. Also we turn off diagnostic printouts
when shell-escape is enabled, since the Tectonic driver can give much
nicer diagnostics about what's happening there.

This would be a breaking change, but the previous API was never
released.
Gather up the previous API revisions and use them to provide a new
shell-escape implementation. This one happens inside the driver's
BridgeState struct, which has access to the detailed driver I/O and so
doesn't need to muck around with the internals of the MemoryIo manually.
@pkgw
Copy link
Collaborator

pkgw commented Jun 3, 2021

It is done.

... except that I need to fix the arm-musl build: tectonic-typesetting/tectonic-ci-support#18

@pkgw pkgw closed this Jun 3, 2021
@pkgw pkgw reopened this Jun 3, 2021
@pkgw pkgw merged commit f632eac into tectonic-typesetting:master Jun 3, 2021
@ralismark ralismark deleted the shell-escape branch August 20, 2021 01:12
@Neved4 Neved4 mentioned this pull request Oct 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shell-escape
5 participants