Incompatibility between links wrapped in asterisks, pedantic, and GFM #1410

Closed
simov opened this issue Jan 14, 2019 · 14 comments
Labels
category: inline elements category: mixed content L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue

Comments

@simov

simov commented Jan 14, 2019

Describe the bug

I have a named link, that I want wrapped in asterisks, essentially making it italic.

To Reproduce

- italic
  - [*named link*][some-url]
  - *[named link][some-url]*
  - [_named link_][some-url]
  - _[named link][some-url]_
- bold
  - [**named link**][some-url]
  - **[named link][some-url]**
  - [__named link__][some-url]
  - __[named link][some-url]__
- bold italic
  - [***named link***][some-url]
  - ***[named link][some-url]***
  - [___named link___][some-url]
  - ___[named link][some-url]___
  - [*__named link__*][some-url]
  - [__*named link*__][some-url]
  - __*[named link][some-url]*__
- code
  - [`named link`][some-url]
- code italic
  - *[`named link`][some-url]*
  - [*`named link`*][some-url]
  - _[`named link`][some-url]_
  - [_`named link`_][some-url]
- code bold
  - **[`named link`][some-url]**
  - [**`named link`**][some-url]
  - __[`named link`][some-url]__
  - [__`named link`__][some-url]
- code bold italic
  - [***`named link`***][some-url]
  - ***[`named link`][some-url]***
  - [___`named link`___][some-url]
  - ___[`named link`][some-url]___
  - [*__`named link`__*][some-url]
  - [__*`named link`*__][some-url]
  - __*[`named link`][some-url]*__

The offending lines are:

- *[named link][some-url]*
- ***[named link][some-url]***
- __*[named link][some-url]*__
- *[`named link`][some-url]*
- ***[`named link`][some-url]***
- __*[`named link`][some-url]*__

Enabling the pedantic option fixes this issue, but at the same time it breaks lots of other stuff, like GFM tables and strikethrough. Also note that I only tested with named links, but it may be broken for other types of links as well.

Expected behavior

Well, I'm looking at the exact same example using Remark - another popular parser, that I think was forked out of Marked back then, and everything renders correctly.

Let me know if I need to provide more information.

@styfle
Member

styfle commented Jan 14, 2019

Thanks, this looks like a bug.

Compare the output from marked with the output from commonmark.

The workaround is to use <em> or <strong> instead of asterisks.

@styfle styfle added the L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue label Jan 14, 2019
@simov simov changed the title from "Incompatibility between links wrapped in asteriscs, pedantic, and GFM" to "Incompatibility between links wrapped in asterisks, pedantic, and GFM" Jan 14, 2019
@simov
Author

simov commented Jan 15, 2019

@styfle I think wrapping your links in asterisks is pretty common, because that's the only way editors can apply the correct style to the link.

I only use HTML tags in my markdown documents as a last resort.

@styfle
Member

styfle commented Jan 15, 2019

Agreed, this is a bug 👍

Would you like to submit a PR to fix it? 😃

@simov
Author

simov commented Jan 15, 2019

Well, you got me! I'm not familiar with this library, but I have a browser extension that can be used as a constant source of annoyances for you. 😄

@UziTech
Member

UziTech commented Mar 11, 2019

related to #1284

@x13machine
Contributor

There appears to be something wrong with this regex: https://github.com/markedjs/marked/blob/master/lib/marked.js#L545. I'll see if I can fix it today.

@UziTech
Member

UziTech commented Mar 13, 2019

@x13machine The problem is that an em can't end inside a link so it will be difficult to find an em with just a regular expression.

so *[link*](url)* should be

<p><em><a href="url">link*</a></em></p>

not

<p><em>[link</em>](url)*</p>

but *[link*](url* should be

<p><em>[link</em>](url*</p>

so after finding an ending * or _ you will have to figure out if it is in a valid link, and if it is find the next ending * or _
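
The rule described above can be sketched as a small scanner. This is only an illustration under simplifying assumptions, not marked's actual implementation: `linkEnd` and `findEmEnd` are hypothetical names, only inline links of the form `[text](url)` are recognized, and escapes, nested brackets, and reference links are ignored.

```javascript
// Hypothetical helper (not part of marked): if an inline link "[text](url)"
// starts at `start`, return the index just past its ")"; otherwise -1.
function linkEnd(src, start) {
  if (src[start] !== "[") return -1;
  const closeText = src.indexOf("]", start + 1);
  if (closeText === -1 || src[closeText + 1] !== "(") return -1;
  const closeUrl = src.indexOf(")", closeText + 2);
  return closeUrl === -1 ? -1 : closeUrl + 1;
}

// Hypothetical helper: src[0] must be "*" or "_". Returns the index of the
// matching closing delimiter, skipping any delimiter that sits inside a
// valid inline link, or -1 if the emphasis is never closed.
function findEmEnd(src) {
  const delim = src[0];
  if (delim !== "*" && delim !== "_") return -1;
  let i = 1;
  while (i < src.length) {
    const skip = linkEnd(src, i);
    if (skip !== -1) {
      i = skip; // jump over the whole link, including any delimiter inside it
      continue;
    }
    if (src[i] === delim) return i;
    i++;
  }
  return -1;
}

const valid = "*[link*](url)*";
const broken = "*[link*](url*";
console.log(valid.slice(1, findEmEnd(valid)));   // em content for the valid link
console.log(broken.slice(1, findEmEnd(broken))); // em content when the link is invalid
```

For the first input the emphasis content is the whole link, `[link*](url)`; for the second, where the link is never closed, the emphasis ends at the first `*` and its content is `[link`. Both match the expected HTML shown above.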

@x13machine
Contributor

So the regex should be replaced with a function?

@UziTech
Member

UziTech commented Mar 14, 2019

It should be something like the way we check for nested parentheses in a link #1414. The regex should grab more than it needs if the end is in square brackets or parentheses then the lexer will run it through a function and find the actual end.
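
A rough sketch of that two-step idea, applied to the nested-parentheses case mentioned above: a greedy regex deliberately over-matches, then a function walks the match to find the real end. The names `greedyLink`, `findClosingParen`, and `parseLink` are made up for this illustration and are not marked's API; the pattern is far simpler than the real link rule.

```javascript
// Step 1: a deliberately greedy pattern. ".*" runs to the LAST ")" on the
// line, so the match may extend past the real end of the link.
const greedyLink = /^\[([^\]]*)\]\((.*)\)/;

// Step 2 (hypothetical helper): given text that starts right after the
// opening "(", return the index of the ")" that balances it, or -1.
function findClosingParen(str) {
  let depth = 0;
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === "\\") {
      i++; // skip the escaped character
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      if (depth === 0) return i;
      depth--;
    }
  }
  return -1;
}

// Combine both steps: over-match with the regex, then trim to the true end.
function parseLink(src) {
  const m = greedyLink.exec(src);
  if (!m) return null;
  const rest = m[2] + ")"; // destination plus the ")" the regex consumed
  const end = findClosingParen(rest);
  if (end === -1) return null;
  const prefixLen = m[0].length - rest.length; // length of "[text]("
  return {
    raw: src.slice(0, prefixLen + end + 1),
    text: m[1],
    href: rest.slice(0, end),
  };
}

console.log(parseLink("[a](b(c)d) and (more)"));
```

Here the regex alone would swallow everything up to `(more)`, while the balancing pass cuts the link back to `[a](b(c)d)`. The same pattern, over-match then locate the true end in code, is what is being suggested for emphasis delimiters that fall inside brackets or parentheses.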

@x13machine
Contributor

x13machine commented Mar 14, 2019

Why does group 3 match _[link_](url)_ correctly, while group 5 does not match *[link*](url)* correctly? I'm going to brush up on regex.

@UziTech
Member

UziTech commented Mar 14, 2019

Yeah, the regex doesn't treat _ and * the same, but it should.

styfle added a commit that referenced this issue May 22, 2019
Fix `<em>` issue with mixed content #1410
@crystalfp

A very similar problem is still here ("marked": "^0.7.0"):


const markdown = `Some say these principles require a revolution to be realized. Others
say we need massive innovation to make positive education futures a
reality. We believe we need both, or as Ronald van den Hoff (2013) says:
“What we really need is an *innovution*\!” (p. 236). And, this is our
noble quest: To *innovute* with not only our ideas, but also the
purposive applications of what we have learned through our individual
efforts, and together, globally.

Education Futures. *<http://www.knowmadsociety.com>*
`;

const marked = require("marked");

const html = marked(markdown, {smartypants: true, headerIds: false});

console.log(html);

The * around the <...> is not translated. Strangely, in this example the two pairs of asterisks in the text are translated correctly, whereas in my application (with exactly the same text) the second and third asterisks are not translated.

Thanks for looking!

@crystalfp

crystalfp commented Aug 3, 2019

The second error can be reproduced in the editor

@calculuschild
Contributor

Closing this since all of the cases in the OP appear to be working correctly now as of , as well as the later issue by @crystalfp.

zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this issue Nov 8, 2021