Do not check content enclosed by dollar signs in markdown #210

Closed
we-taper opened this issue Jan 3, 2021 · 11 comments
Labels
1-feature-request ✨ Issue type: Request for a desirable, nice-to-have feature 3-fixed Issue resolution: Issue has been fixed on the develop branch

@we-taper

we-taper commented Jan 3, 2021

Is your feature request related to a problem? Please describe.

Currently, in many flavors of Markdown (e.g. pandoc, Jupyter, MathOverflow), content surrounded by single or double dollar signs is treated as LaTeX math. However, ltex treats it as normal text and checks it for grammar errors (including marking $$ as a grammar error).

Example:

# Content

$$
\langle 1 \rangle
$$

Here, $$, rangle, and langle are marked as possible spelling mistakes, when in fact they are LaTeX commands.

Describe the solution you'd like

Either skip grammar checking for content matching:

  • single dollar sign: regex \$([^\$]+)\$
  • double dollar sign: regex \$\$([^\$]+)\$\$

Or treat them as tex math environments.

Describe alternatives you've considered

At the moment, adding the following lines to ltex.hiddenFalsePositives.en-US.txt reduces the number of spelling mistakes reported by ltex:

{"rule":"MORFOLOGIK_RULE_EN_US","sentence":"\\$\\$([^\\$]+)\\$\\$"}
{"rule":"MORFOLOGIK_RULE_EN_US","sentence":"\\$([^\\$]+)\\$"}
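The doubled backslashes in the two lines above come from JSON string escaping: each entry is a JSON object whose "sentence" value holds the regex. A minimal Python sketch, assuming only the two-key format shown above (the helper name `hidden_false_positive_line` is made up for illustration), that produces such lines from the plain regexes:

```python
import json

# Regexes from the feature request, written as raw strings so that each
# backslash is a single character.
patterns = [r"\$\$([^\$]+)\$\$", r"\$([^\$]+)\$"]


def hidden_false_positive_line(rule: str, sentence_regex: str) -> str:
    """Serialize one entry as a compact, single-line JSON object.

    Hypothetical helper: it only mirrors the format of the lines above.
    """
    return json.dumps(
        {"rule": rule, "sentence": sentence_regex},
        separators=(",", ":"),  # no spaces, matching the lines above
    )


lines = [hidden_false_positive_line("MORFOLOGIK_RULE_EN_US", p) for p in patterns]
for line in lines:
    print(line)
```

Parsing a line back with `json.loads` returns the original single-backslash regex, which is a quick way to check the escaping.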

Additional context

None

@we-taper we-taper added the 1-feature-request ✨ Issue type: Request for a desirable, nice-to-have feature label Jan 3, 2021
@valentjn
Owner

valentjn commented Jan 3, 2021

Thanks for the feature request. I'm not sure the regexes you provide are suitable. What about "The pen costs $2, but the apple costs $1."? Plus, can I have a literal dollar sign (maybe escaped somehow) in a math context?

How do the parsers you mention actually handle this?
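To make the objection concrete, here is a small Python sketch that runs the proposed single-dollar regex on that sentence. It uses only the standard `re` module; the variable names are chosen for illustration.

```python
import re

# The single-dollar regex proposed in the feature request.
single_dollar = re.compile(r"\$([^\$]+)\$")

sentence = "The pen costs $2, but the apple costs $1."
match = single_dollar.search(sentence)

# The text between the two price signs is captured as if it were a formula.
captured = match.group(1) if match else None
print(repr(captured))
```

The regex captures `'2, but the apple costs '`, so ordinary prose between two prices would be skipped by the checker.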

@we-taper
Author

we-taper commented Jan 3, 2021

Thanks for the reply. That is a very interesting question that I haven't thought about. I did a bit of research and the situation is a bit messy.

Pandoc's documentation describes it as follows:

Anything between two $ characters will be treated as TeX math. The opening $ must have a non-space character immediately to its right, while the closing $ must have a non-space character immediately to its left, and must not be followed immediately by a digit. Thus, $20,000 and $30,000 won’t parse as math. If for some reason you need to enclose text in literal $ characters, backslash-escape them and they won’t be treated as math delimiters.

For display math, use $$ delimiters. (In this case, the delimiters may be separated from the formula by whitespace.)

Both Jupyter Notebook and MathOverflow incorrectly parse your example as a mathematical equation.
However, Jupyter Notebook supports exporting to formats other than HTML, and it uses pandoc for that conversion. So I guess following pandoc's rules would be most useful?
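The inline rule quoted above can be approximated with a regular expression. The following Python sketch is only an approximation under that reading of the quoted text, not pandoc's actual parser; the names `INLINE_MATH` and `find_inline_math` are made up for illustration.

```python
import re
from typing import List

INLINE_MATH = re.compile(
    r"(?<!\\)\$(?=\S)"   # unescaped opening $, followed by a non-space character
    r"(.+?)"             # formula body, as short as possible
    r"(?<=\S)(?<!\\)\$"  # unescaped closing $, preceded by a non-space character
    r"(?!\d)"            # closing $ must not be followed by a digit
)


def find_inline_math(text: str) -> List[str]:
    """Return the bodies of all inline formulas found under the rule above."""
    return [m.group(1) for m in INLINE_MATH.finditer(text)]


print(find_inline_math("The pen costs $2, but the apple costs $1."))
print(find_inline_math("Energy is $E = mc^2$ in an $n$-dimensional space."))
print(find_inline_math(r"Literal \$5 and \$6 stay text."))
```

With this rule the price sentence yields no formulas, because the second `$` is preceded by a space and followed by a digit, while `$E = mc^2$` and `$n$` are still found.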

valentjn added a commit to valentjn/ltex-ls that referenced this issue Jan 13, 2021
@valentjn
Owner

I discovered that LTEX's Markdown parser Flexmark supports a long list of extensions. One of them adds support for math. The caveat is that the syntax is that of GitLab Flavored Markdown, so $`E = mc^2`$ for inline math and

```math
a^2 + b^2 = c^2
```

for display math. Would that still be acceptable for you? I enabled the extension in LTEX, so this would already be supported in the next release, while writing a custom parser/extension would take more time.

@valentjn valentjn added the 2-needs-info Issue status: We need more information (usually) from the submitter before continuing label Jan 15, 2021
@we-taper
Author

Thanks for that update. Unfortunately, I have almost all equations in double dollar (non-inline) math environments and the fenced math code block isn't supported by my markdown previewer (Markdown All in One) and post-processor (pandoc).

But I agree that this depends on the upstream parser, and I might (if I have time) open an issue with Flexmark asking for an extension.
At the moment I can fiddle with the ltex.hiddenFalsePositives settings for a temporary solution.

@valentjn valentjn removed the 2-needs-info Issue status: We need more information (usually) from the submitter before continuing label Jan 18, 2021
@valentjn
Owner

FWIW, it's also possible to temporarily disable LTEX with magic comments. Of course, putting magic comments everywhere is not a satisfying solution.

I got annoyed by the lack of this feature myself today. I'll check how hard this is to implement. This depends on when we can parse stuff, before or after Flexmark splits the code into paragraphs/an AST. "After" would probably be worse, because the closing $$ can be in a different paragraph than the opening $$.
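A small Python sketch of why the "after" order is a problem. It does not use Flexmark; splitting on blank lines merely stands in for paragraph splitting, and the sample text and variable names are assumptions made for illustration.

```python
import re

# Display math whose body contains a blank line, so the opening and closing
# $$ end up in different paragraphs.
text = "Intro text.\n\n$$\na = b\n\nc = d\n$$\n\nClosing text."

display_math = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)

# "After": split into paragraphs first, then look for $$ pairs per paragraph.
paragraphs = text.split("\n\n")
per_paragraph = [display_math.findall(p) for p in paragraphs]

# "Before": look for $$ pairs in the whole text.
whole_text = display_math.findall(text)

print(per_paragraph)
print(whole_text)
```

Per paragraph, no complete `$$ ... $$` pair is found, whereas scanning the whole text finds the formula.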

@valentjn
Owner

Turns out Flexmark can do both by allowing extensions to be block parsers or inline parsers, amongst others.

A rudimentary version of this feature is now implemented. It's not nearly as powerful as LTEX's processing of LATEX files (regarding punctuation, dummy words with vowels, etc.), but it should be a good start.

@valentjn valentjn added the 3-fixed Issue resolution: Issue has been fixed on the develop branch label Jan 27, 2021
@valentjn
Owner

Feature released in 8.4.0.

@we-taper
Author

we-taper commented Feb 1, 2021

Thanks. I just tested it and it works as expected! 👍

@universemaster

Turns out Flexmark can do both by allowing extensions to be block parsers or inline parsers, amongst others.

A rudimentary version of this feature is now implemented. It's not nearly as powerful as LTEX's processing of LATEX files (regarding punctuation, dummy words with vowels, etc.), but it should be a good start.

Do you have a road map to add these additional features? I am currently getting a lot of

  • "Unpaired symbol: )" and
  • "If/So at the beginning of a sentence usually requires a 2nd clause."

type errors because I have dollars in between. Should I open a new issue?

@valentjn
Owner

@universemaster You shouldn't be getting these errors, if I understand it correctly. What I meant by “punctuation” is that currently, LTEX won't recognize if a displayed formula ends with a full stop (i.e., the period is inside the displayed formula) as it does in LATEX mode. What I meant by “dummy words with vowels” is that currently, LTEX won't recognize that a formula starts with a vowel when spoken (e.g., an $n$-dimensional space will give a false positive, because LTEX doesn't see the formula is spoken as “enn,” which starts with a vowel), which it does in LATEX mode. I wouldn't expect either of your error messages for these things. So please open a new issue if you think this is due to wrong parsing of dollar signs.
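The "dummy words with vowels" idea can be sketched in Python as follows. This is not LTEX's implementation: the placeholder stems "Ina" and "Dummy", the letter list, the simple inline regex, and the function name are all assumptions chosen for illustration.

```python
import re

# Letters whose spoken English names begin with a vowel sound ("eff", "enn", ...).
VOWEL_SOUND_LETTERS = set("aefhilmnorsx")

# Simplified inline-math pattern, used only for this sketch.
INLINE = re.compile(r"\$([^$\n]+)\$")


def replace_inline_math(text: str) -> str:
    """Replace each inline formula with a numbered placeholder word.

    A formula whose first letter is spoken with a leading vowel sound gets a
    vowel-initial placeholder, so that "an $n$-..." stays grammatical.
    """
    counter = 0

    def substitute(match: re.Match) -> str:
        nonlocal counter
        first_letter = match.group(1).strip()[:1].lower()
        stem = "Ina" if first_letter in VOWEL_SOUND_LETTERS else "Dummy"
        word = f"{stem}{counter}"
        counter += 1
        return word

    return INLINE.sub(substitute, text)


print(replace_inline_math("an $n$-dimensional space"))
print(replace_inline_math("a $k$-dimensional space"))
```

A grammar checker that sees "an Ina0-dimensional space" has no reason to flag the article, whereas "an Dummy0-dimensional space" would trigger the false positive described above.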

@mateosss

mateosss commented Jan 20, 2022

Math enclosed by $$ is indeed no longer being checked, but inline math enclosed by $ still reports spelling mistakes. Is this expected?

EDIT: It seems to get fixed when using <!-- LTeX: language=en-US --> but I'm writing a document with language es-AR. Any recommendations? The reported error is MORFOLOGIK_RULE_ES.

me-johnomar added a commit to me-johnomar/ltex-ls that referenced this issue Jan 31, 2024