Included non-LaTeX code shows IJ errors // Language Injection #101

Closed
5 of 6 tasks
PHPirates opened this issue Aug 12, 2017 · 4 comments · Fixed by #1382
Labels
enhancement New feature or (non bug related) change to the program. Epic Collection of issues. parser Issues for which significant changes in the parser are needed

@PHPirates
Collaborator

PHPirates commented Aug 12, 2017

I use the listings package, but the included non-LaTeX code shows IntelliJ errors. For example,

\documentclass{article}
\usepackage{listings}
\begin{document}
    \begin{lstlisting}[language=Mathematica]
        (* MMa comment... *)
    \end{lstlisting}
\end{document}

produces the error
<content> or LatexTokenType.\end expected, got '*'
when hovering over the (*.

I don't have a suggestion for how to detect foreign code, but I guess listings, minted and the verbatim environment (including the inline version) cover most use cases.

List of examples to be covered by language injection

  • Parse inline verbatim content as raw text
  • Parse verbatim environment content as raw text
  • Syntax highlighting of \end in verbatim environments is wrong (what happens when LaTeX is injected in LaTeX?)
  • Language injection

- [ ] %(...)s string interpolation. Use something like """\\begin{{equation}}{0}""".format(2) instead.

#620: Python string interpolation with %

Here %s and %(...)s are not LaTeX comments but Python string interpolation placeholders.

display = r"""
\begin{equation}
\text{Using product rule of }%(e1)s\text{ and }%(e2)s \\
\text{Answer: } \frac{d}{dx}%(e3)s=%(e4)s
\end{equation}
"""

display_args = {
    "e1": (x ** 3).latex(),
    "e2": (E ** x).latex(),
    "e3": f_x.latex(),
    "e4": solution.latex()
}
display_latex(display % display_args)
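
The str.format workaround mentioned in the task above could look roughly like the sketch below. It is only an illustration: the argument values are hypothetical placeholder strings standing in for the .latex() results in the report, and the display_latex call is left out. Literal LaTeX braces are doubled so that str.format leaves them alone, and no % characters remain for the LaTeX lexer to read as comments.

```python
# Same template as above, rewritten for str.format instead of % interpolation.
# Literal LaTeX braces are doubled ({{ and }}); single braces mark fields.
template = r"""
\begin{{equation}}
\text{{Using product rule of }}{e1}\text{{ and }}{e2} \\
\text{{Answer: }} \frac{{d}}{{dx}}{e3}={e4}
\end{{equation}}
"""

# Hypothetical placeholder values; the original report computes these
# with .latex() calls that are not reproduced here.
args = {
    "e1": "x^{3}",
    "e2": "e^{x}",
    "e3": "x^{3} e^{x}",
    "e4": "3 x^{2} e^{x} + x^{3} e^{x}",
}

rendered = template.format(**args)
print(rendered)
```

The trade-off is readability: every literal brace in the LaTeX source has to be doubled, which is why this is a workaround rather than a fix.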

  • \newmintinline (similar problem as with doing \newenvironment{myenv}{\verb|}{|})
#807: Minted package with \newmintinline

\documentclass[11pt]{article}

\usepackage{minted}
\newmintinline[myphpinline]{php}{escapeinside=||}

\begin{document}

\section{One}

A statement like \myphpinline{$foo = |\alpha|} should not show a "no math content" error, as escapeinside is defined not to be \myphpinline{"$"}.

\section{Two}

Second section.

\end{document}
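
One reason the \newmintinline case is harder than the built-in environments is that the verbatim command name is only known after reading the preamble. The sketch below illustrates that first step in Python; it is not the plugin's implementation, and the function name collect_mintinline_commands is made up for this example. It relies on the minted convention that an omitted macro name defaults to the language name followed by "inline".

```python
import re

# Matches \newmintinline[optional macro name]{language}{options}.
NEWMINTINLINE = re.compile(
    r"\\newmintinline(?:\[(?P<name>[^\]]*)\])?"
    r"\{(?P<lang>[^}]*)\}\{(?P<opts>[^}]*)\}"
)


def collect_mintinline_commands(source):
    """Map each user-defined inline verbatim command name to its language."""
    commands = {}
    for match in NEWMINTINLINE.finditer(source):
        lang = match.group("lang")
        # Without an explicit name, minted creates \<language>inline.
        name = match.group("name") or lang + "inline"
        commands[name] = lang
    return commands


preamble = r"""
\usepackage{minted}
\newmintinline[myphpinline]{php}{escapeinside=||}
\newmintinline{python}{}
"""

commands = collect_mintinline_commands(preamble)
print(commands)
```

Only once such a table exists can the argument of \myphpinline be treated as PHP rather than LaTeX, so the decision cannot be made by looking at the command in isolation.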
@stenwessel
Member

This seems like an important issue to me. We can use the language injection mechanism to solve it, with the extra benefit that the Mathematica code inside the listings environment gets highlighted as well (when you have the MMa support plugin installed).

@stenwessel stenwessel added the enhancement New feature or (non bug related) change to the program. label Aug 12, 2017
@stenwessel stenwessel self-assigned this Aug 12, 2017
@HannahSchellekens HannahSchellekens added this to the b0.5 milestone Aug 13, 2017
@PHPirates
Collaborator Author

Currently this will also generate a (slightly confusing) inspection warning: "Document does not contain a document environment".

@HannahSchellekens
Member

> Currently this will also generate a (slightly confusing) inspection warning: "Document does not contain a document environment".

Because the file doesn't parse correctly, the inspection doesn't detect a document environment, which causes the warning.

@stenwessel
Member

stenwessel commented Dec 13, 2019

Thoughts about the progress so far:

This really turns out to be a non-trivial task for a lexer/parser system like the one currently present in the plugin.
Ideally, we want to 'turn off' the parser at specific regions in the source. However, it is cumbersome and practically impossible to do this completely at the lexer level.
This is because we are not able to easily look ahead.
Moreover, we need to hard-code beforehand the tokens at which the parser switches off and on again. Since these indicators may be quite complex at the lexer level, this really complicates the lexer specification.
Also, changing these tokens and generalizing them to commands and environments is not easy, and would require (elaborate) changes for every type of language injection that we might want to support in the future.
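
To make the hard-coding problem concrete, here is a toy Python tokenizer that switches to a 'raw' mode between fixed begin and end markers. It is a minimal sketch under stated assumptions, not the plugin's lexer: the environment list and the tokenize function are invented for illustration.

```python
import re

# Hard-coded list of environments whose content is treated as raw text.
VERBATIM_ENVIRONMENTS = ("verbatim", "lstlisting", "minted")
BEGIN = re.compile(r"\\begin\{(%s)\}" % "|".join(VERBATIM_ENVIRONMENTS))


def tokenize(source):
    """Split source into ('latex', text) and ('raw', text) chunks."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = BEGIN.search(source, pos)
        if match is None:
            tokens.append(("latex", source[pos:]))
            break
        # The switch-back token is fixed by the environment name.
        end_marker = "\\end{%s}" % match.group(1)
        end = source.find(end_marker, match.end())
        if end == -1:
            end = len(source)  # unterminated environment: raw until EOF
        tokens.append(("latex", source[pos:match.end()]))
        tokens.append(("raw", source[match.end():end]))
        pos = end
    return tokens


example = r"""\begin{document}
\begin{lstlisting}
(* MMa comment... *)
\end{lstlisting}
\end{document}
"""

tokens = tokenize(example)
for kind, text in tokens:
    print(kind, repr(text))
```

This handles the three listed environments, but an environment defined with \newenvironment or a command created by \newmintinline is invisible to it, and every new construct means another edit to the hard-coded markers.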

Detecting where to switch off the parser would be much more achievable at a higher abstraction level, specifically at the parser or AST level.
However, this has to happen on a partially constructed AST, and the information would then need to be passed back to the lexer. In the current model, this is not possible.

Looking forward, one way of achieving this is to change from a lexer/parser structure to generalized parsing.
This would require migrating from Grammar-Kit to a different parsing system, where we would probably need to bind manually to the PSI AST.
Specifically, it can be useful to look into 'lake/island parsing' (see for instance the paper Generating Robust Parsers using Island Grammars by Leon Moonen (2001)).

@PHPirates PHPirates added the parser Issues for which significant changes in the parser are needed label Jan 15, 2020
@PHPirates PHPirates self-assigned this Apr 12, 2020
@PHPirates PHPirates added the Epic Collection of issues. label Apr 15, 2020