Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid html generated when parsing link #130

Open
rchl opened this issue Jun 18, 2022 · 14 comments
Open

Invalid html generated when parsing link #130

rchl opened this issue Jun 18, 2022 · 14 comments
Labels
S: confirmed Confirmed bug report or approved feature request. T: bug Bug.

Comments

@rchl
Copy link
Contributor

rchl commented Jun 18, 2022

Python 3.3>>> mdpopups.show_popup(view, 'and [`i32`](`i32`) is')
Parse Error: ">i32</span>"><code class="highlight"><span style="color: #d8dee9;">i32</span></code></a> is</p></div> code: Unexpected character

The generated HTML is:

and <a href="<span style="color: #d8dee9;">i32</span>"><code class="highlight"><span style="color: #d8dee9;">i32</span></code></a> is

Which is invalid since the content of the href attribute is not escaped.

The escaping would make the code valid but I believe that what should happen is nothing in the link should be transformed into HTML and instead it should be taken literally, and then also escaped if needed.

This is how Github renders it:


and i32 is

and <a href="%60i32%60"><code class="notranslate">i32</code></a> is

@gir-bot gir-bot added the S: triage Issue needs triage. label Jun 18, 2022
@facelessuser facelessuser added T: bug Bug. S: confirmed Confirmed bug report or approved feature request. and removed S: triage Issue needs triage. labels Jun 18, 2022
@facelessuser
Copy link
Owner

Yeah, I'm actually aware of this, and I'll have a fix for the next version. There was some complaint at some point about the fact that when Sublime was processing callbacks in HTML links that escaped quotes weren't handled properly by Sublime. A change was made to not escape them, which is simply a bad idea that was rarely encountered. We just need to let them be escaped.

Anyways, this is an issue in mdpopups, not any of the underlying libraries.

@rchl
Copy link
Contributor Author

rchl commented Jun 18, 2022

And the fixed behavior will also not convert markdown to html in links?

@facelessuser
Copy link
Owner

And the fixed behavior will also not covert markdown to html in links?

I'm not sure I understand the question. We are talking about how proper escaping in HTML attributes right?

What do you mean by your question, can you clarify with an example?

@rchl
Copy link
Contributor Author

rchl commented Jun 18, 2022

The fixed logic could result in either:

  • <a href="%60i32%60">
    or
  • <a href="&lt;span style=&quot;color: #d8dee9;&quot;&gt;i32&lt;/span&gt;">

I believe that the former one is correct since markdown should not be processed in URIs.

@facelessuser
Copy link
Owner

Can you give me a reproducible Markdown source? I see the expected outputs, but I'm still not sure what you are providing as source.

@facelessuser
Copy link
Owner

Wait, are you trying to provide Markdown code syntax as a link? If so, you aren't getting that to convert properly. You need to provide a link or properly escaped content for reference.

@facelessuser
Copy link
Owner

You'll see all sorts of different outputs with different Markdown parsers. Some might handle it more sane than others, but you should not be doing this: https://johnmacfarlane.net/babelmark2/?text=%5B%60i32%60%5D(%60i32%60)

@facelessuser
Copy link
Owner

I honestly thought you were talking about something else.

@rchl
Copy link
Contributor Author

rchl commented Jun 18, 2022

It's the same example I gave in the initial comment:

'and [`i32`](`i32`) is'

It contains backticks in the link's URI and it comes from the LSP server.

I've talked about it in the initial comment and also shown how Github's parser handles it.

@rchl
Copy link
Contributor Author

rchl commented Jun 18, 2022

Just escaping the link will fix the main issue that causes parsing error so it might be enough to do that.

Since a link like the one provided here as an example will likely not do anything useful, it might as well contain escaped HTML but it feels like the ultimate fix would be to not process markdown in links, besides escaping.

@facelessuser
Copy link
Owner

It's the same example I gave in the initial comment:

Yes, I was on my phone, but I see that now. Basically, code is handled before URLs, which is why this happens. That is just the sequencing for how Python Markdown does things.

I've talked about it in the initial comment and also shown how Github's parser handles it.

Yep, how GitHub handles this is irrelevant. Currently, we are using the Python Markdown parser.

I've mentioned at one time that I'm potentially willing to add support for a CommonMark parser as soon as Package Control supports Py38 dependencies, though it's starting to feel like they may never happen...

Anyways, if you format the content in an escaped manner, it'll probably go through fine. I don't plan on patching the underlying library to workaround odd cases that it was not designed to handle. One could argue this is a bug, but it would be an upstream bug with Python Markdown, but as I'm familiar with its architecture, I'm doubtful a fix will be coming.

@facelessuser
Copy link
Owner

@rchl Hmm, there may be an issue with how we do things using the Sublime highlighter. Turning off the Sublime highlighter (which causes Pygments to be used instead) doesn't seem to exhibit this issue.

I wanted to do my due diligence before closing this issue, so there may actually be something for us to fix here. I'll try to dig deeper.

@facelessuser
Copy link
Owner

Hmm, it just appears that when Pygments handles a plain text code, it simply returns the text and styles it with CSS, but with the Sublime highlighter, we actually add inline styles, which is why we get HTML content. Really, Markdown should not be used as a URL.

I'm not sure if there is anything we can reasonably do, but I'll take a look if there is something simple we can do without trying to monkey patch Markdown itself.

@facelessuser
Copy link
Owner

facelessuser commented Jun 19, 2022

Yeah, we can break pygments with [`123`](`#!py3 123`) which will then perform highlighting which causes spans to be written:

<p><a href="<span class="mi">123</span>
"><code class="highlight">123
</code></a></p>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S: confirmed Confirmed bug report or approved feature request. T: bug Bug.
Projects
None yet
Development

No branches or pull requests

3 participants