Better compatibility with CommonMark #1125

Closed
slorber opened this issue Jun 30, 2020 · 21 comments
Labels
🙅 no/wontfix This is not (enough of) an issue for this project

Comments

@slorber
Contributor

slorber commented Jun 30, 2020

Subject of the feature

Sometimes the content that people write using markdown should be compatible with external markdown-related tools (github markdown, markdownlint...) that do not rely on MDX. Having better compatibility with CommonMark ecosystem would be nice.

Problem

https://daringfireball.net/projects/markdown/syntax#html

For any markup that is not covered by Markdown’s syntax, you simply use HTML itself. There’s no need to preface it or delimit it to indicate that you’re switching from Markdown to HTML; you just use the tags.

https://mdxjs.com/getting-started#markdown

MDX supports standard Markdown syntax

The MDX doc statement is not totally true:

  • <span style="color: red">Hello</span> => parse error in MDX
  • <span style={{ color: 'red' }}>Hello</span> => renders strangely on CommonMark systems

Not being compatible with CommonMark means that if you need a source file to display correctly in MDX env + CommonMark env, you don't always have a solution.

Expected behavior

It would be nice if MDX were more compatible with the existing markdown ecosystem, or highlighted more clearly how it currently diverges and what the end goals are regarding compatibility.

Alternatives

Wondering if MDX could try to convert html tags to jsx as a compilation step? (for example using https://transform.tools/html-to-jsx)

Wondering if MDX could offer some kind of "compatibility mode". I work on Docusaurus v2, and some of our users don't really care about using React components in markdown, they just want to use regular markdown. It would be very nice if we could support 2 modes (one enabling embedding React comps, the other being regular markdown), keeping MDX as the only markdown compiler, and without having to duplicate the work (we rely heavily on things like MDXProvider currently).

Wondering if it could be possible to escape regular html syntax with some custom tag or whatever, so that we could embed React comps in a MD doc, yet also be able to use html syntax at the same time.


Related discussion on our repo: facebook/docusaurus#3009

Our v1 -> v2 migration guide (from CommonMark to MDX) asks users to make some markdown changes for MDX compatibility. It would be nice if it didn't require any change. https://v2.docusaurus.io/docs/next/migrating-from-v1-to-v2#update-markdown-syntax-to-be-mdx-compatible

@slorber slorber added 🙉 open/needs-info This needs some more info 🦋 type/enhancement This is great to have labels Jun 30, 2020
@borekb

borekb commented Jun 30, 2020

MDX being "enriched Markdown" (but still Markdown!) is indeed a dream. In my experience, people are often surprised that their valid MD documents fail to parse with MDX, which is probably due to common marketing that "MDX is Markdown for the component era – it lets you write JSX embedded inside Markdown", which is not strictly speaking true.

"Unfortunately" (for me), there are pull requests like #1039 that seem to double-down on the divergence from Markdown. It probably has good technical reasons, and maybe even MDX being CommonMark + JSX isn't achievable on some general grounds, but from the user perspective it would be absolutely great!

@wooorm
Member

wooorm commented Jun 30, 2020

How would supporting both HTML and JSX work?

Most issues reported here are because MDX is, in fact, more like Markdown, where folks expect more JSX behavior. Hence 1039.

@wooorm
Member

wooorm commented Jun 30, 2020

Btw, if you want HTML instead of JSX to render to React components, you can do that with unified directly.

@borekb

borekb commented Jun 30, 2020

@wooorm I completely see your point and in fact, I'd describe MDX as "JSX written differently, with some places allowing Markdown". Maybe that's not entirely accurate either but in practical terms, it would have helped me to get a better idea in the past.

The "problem" is that since its beginnings, MDX was marketed as Markdown allowing React components, e.g.:

  • MDX docs: "It’s a superset of Markdown syntax".
  • Gatsby docs: "MDX is Markdown for the component era. It lets you write JSX embedded inside Markdown."
  • mdx-deck: "Write presentations in markdown"

I understand that it's an exciting way to present it but it also strongly suggests that MDX is Markdown which is not true, neither technically (valid CommonMark documents throw parse errors in MDX) nor philosophically (falling back to HTML is one of the core principles of Markdown).

I don't know how HTML and JSX would work together technically but with so many parsers & tools around, e.g., Prettier's parsers or html-to-jsx, I imagine it should be possible. I'll admit though that I don't have any technical knowledge about what they do, exactly; you're much more of an expert on this matter.

I'd just like to say that from the user perspective, MDX being a true superset of Markdown would be awesome. I cannot really use unified directly if higher-level tools like Docusaurus or mdx-deck decide to build on MDX, which they often do because "Markdown + React components" is indeed a great value proposition.

@wooorm
Member

wooorm commented Jun 30, 2020

My go to is MDX = Markdown - HTML + JSX. I see where you’re coming from, but a) it’s ambiguous whether HTML inside Markdown is Markdown, or an alternative syntax, and b) it’s pretty hard to explain things in one sentence. E.g., what’s a chair?

> I imagine it should be possible

I don’t see it as possible, except in a way that creates authoring inconsistencies and security issues (compiling as JSX, when failing, assuming HTML). The problem isn’t getting HTML to JSX. The problem is always mixing Markdown with either HTML or JSX, and you want both. 🤯

> I'd just like to say that from the user perspective

This depends on the user. There are many ways to turn Markdown + HTML into React. MDX is something else. MDX is JSX. That comes with a different user perspective too (e.g., developers with JSX familiarity)

> Docusaurus

Other projects, like Gatsby and Next, can support either Markdown or MDX. I understand that Docusaurus users may not want MDX, in which case, I think it makes sense to have Markdown there!

> mdx-deck

Well, it’s in the name. The goal here is to write JSX, not HTML.

@borekb

borekb commented Jun 30, 2020

You have valid points, and making HTML and JSX coexist would be difficult indeed. I'm not yet convinced that it's entirely impossible but it's probably close to that 😄.

@borekb

borekb commented Jul 1, 2020

@wooorm A few more questions if you don't mind:

Question 1: Would it be possible to treat blocks that start with <html-like-tag> as HTML and <ReactLikeComponent> as JSX? For example, could the behavior be theoretically like this?

Choice between HTML and JSX (both work fine):

- Hello <span style="color: red">world</span>
- Hello <MySpan style={{color: "red"}}>world</MySpan>

(Note: I understand that some people might actually prefer to write the JSX version of the lower-case span, i.e., <span style={{color: "red"}} />; I'm just trying to figure out if there could be syntax supporting both HTML and JSX.)

Nesting JSX inside HTML is fine:

<div style="color: red"><MyComponent>This is fine</MyComponent></div>

But once inside JSX, everything is JSX, there's no going back:

Hello <MyComponent><span style={{color: "red"}}>world</span></MyComponent>

More complex example:

Source
<details>
<summary>
  **MD does not work**
  <b>HTML works</b>
  <Component><span style={{color: "red"}}>JSX works</span></Component>
</summary>

- One
- Two

</details>

Could this work?
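
A minimal sketch of the switching rule in Question 1, assuming the only signal is the case of the first letter of the tag name. `classifyTag` and `TagKind` are hypothetical names for illustration, not part of MDX or of any CommonMark parser.

```typescript
// Hypothetical helper: decide whether an opening tag should be read as HTML or JSX.
type TagKind = "html" | "jsx";

function classifyTag(openingTag: string): TagKind | null {
  // Capture the tag name right after "<"; it must start with a letter.
  const match = /^<([A-Za-z][\w.-]*)/.exec(openingTag.trim());
  if (match === null) return null; // not a tag at all
  const firstLetter = match[1][0];
  // Assumed convention: capitalized names are components, lower-case names are HTML.
  return firstLetter === firstLetter.toUpperCase() ? "jsx" : "html";
}

console.log(classifyTag('<span style="color: red">'));       // "html"
console.log(classifyTag('<MySpan style={{color: "red"}}>')); // "jsx"
```

Under this rule a lower-case tag with JSX-style attributes, such as `<span style={{color: "red"}}>`, would still be classified as HTML, which is the case the note above sets aside.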


Question 2: If we ignore HTML and only write documents that don't contain any HTML tags, how close is MDX to CommonMark? My feeling is that most of the common syntax works but there are things like nesting that are different. Can the difference be quantified somehow?

Thanks!

@wooorm
Member

wooorm commented Jul 1, 2020

“HTML” in Markdown is not actually HTML. It looks like HTML.

E.g.,

<div

Or

Paragraph <i> done.

CommonMark requires that incorrect HTML be treated as HTML. There are also many things that are valid in HTML but aren’t valid in CommonMark.

Finally, many folks have raised issues stating that they feel the CommonMark way of putting HTML/JSX in Markdown does not make sense for MDX: #195 (comment). Hence 1039.

@borekb

borekb commented Jul 1, 2020

I understand that the JSX project might not want to support HTML (or that strange, semi-correct HTML that CommonMark supports), I was just wondering whether the proposal above has some clear flaws from the syntax perspective.

As for the second question, do you have an idea how much MDX diverges from CommonMark if it weren't for HTML?

@wooorm
Member

wooorm commented Jul 1, 2020

> I was just wondering whether the proposal above has some clear flaws from the syntax perspective.

The “flaw” is that your proposal requires understanding which tags are used to switch between JSX or HTML. This is information that CommonMark does not have.

> As for the second question, do you have an idea how much MDX diverges from CommonMark if it weren't for HTML?

That is currently only in HTML (and imports/exports). For MDX@2, I explained that in the PR body: #1039

@slorber
Contributor Author

slorber commented Jul 1, 2020

Thanks for the explanations.

At Docusaurus we like MDX, we are not willing to change that, as many React related docs websites like to embed components in their docs, and support for Vue/Svelte is likely to be appreciated too.

But we try to make users of CommonMark happy too. That can help migrations from other doc systems based on it, and it avoids "MDX lock-in" if you don't use any embedded components.

Wonder if it would be possible to declare for a whole doc whether we are using HTML or JSX.
Maybe this could be a global Docusaurus setting, or a frontmatter toggle, or the file extension?
(I'd compare this a bit to the .ts vs .tsx extension, which affects TypeScript parsing of <>)
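
A rough sketch of how such a per-document switch could be resolved, assuming a frontmatter key named `markup` and a fallback on the file extension. The key name, `detectMode`, and `MarkupMode` are all made up for illustration; neither MDX nor Docusaurus is known to offer this.

```typescript
// Hypothetical per-document mode: "html" for CommonMark-style docs, "jsx" for MDX-style docs.
type MarkupMode = "html" | "jsx";

function detectMode(fileName: string, source: string): MarkupMode {
  // 1. A frontmatter toggle wins if present (assumed key: `markup: html` or `markup: jsx`).
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  if (frontmatter !== null) {
    const toggle = /^markup:\s*(html|jsx)\s*$/m.exec(frontmatter[1]);
    if (toggle !== null) return toggle[1] as MarkupMode;
  }
  // 2. Otherwise fall back to the extension, like .ts vs .tsx.
  return fileName.toLowerCase().endsWith(".mdx") ? "jsx" : "html";
}

console.log(detectMode("intro.md", "# Hello"));                       // "html"
console.log(detectMode("intro.md", "---\nmarkup: jsx\n---\n# Hello")); // "jsx"
```

A global setting could be threaded in as a third fallback in the same way.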

In case the html mode is selected, MDX could use the dangerouslySetInnerHTML for the embedded html tags (not sure how md interleaving + MDXProvider could work, just an idea...).

I still want the React/JSX output because I definitely want to use the MDXProvider to customize the regular MD elements with React. If MDX were able to process CommonMark, we could use a unified interface for CommonMark docs + MDX docs.

Also, a weird idea would be to convert the style string to a style object in custom components. That probably wouldn't solve all the use cases but could solve at least some of them 🤪. It can probably be done in userland with MDXProvider; I could give this a try.
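
The conversion itself could look roughly like the sketch below. `styleStringToObject` is a hypothetical helper, not an MDX or React API, and it is deliberately naive: it does not handle semicolons inside `url(...)` or quoted values, nor the `-ms-` prefix that React spells with a lower-case `ms`.

```typescript
// Hypothetical helper: turn an HTML style attribute string into a React-style object.
function styleStringToObject(style: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const declaration of style.split(";")) {
    const colon = declaration.indexOf(":");
    if (colon === -1) continue; // skip empty or malformed pieces
    const property = declaration.slice(0, colon).trim();
    const value = declaration.slice(colon + 1).trim();
    if (property === "" || value === "") continue;
    // CSS custom properties (--foo) keep their name; everything else becomes camelCase.
    const key = property.startsWith("--")
      ? property
      : property.toLowerCase().replace(/-([a-z])/g, (_, c: string) => c.toUpperCase());
    result[key] = value;
  }
  return result;
}

console.log(styleStringToObject("color: red; background-color: blue"));
// { color: "red", backgroundColor: "blue" }
```

A custom `span` component could call this when its `style` prop arrives as a string. Whether a string `style` attribute reaches the component at all depends on how MDX parses the tag, which this sketch does not address.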

@borekb

borekb commented Jul 1, 2020

@slorber Yes, I don't quite understand why Docusaurus parses .md files with MDX when that extension means Markdown. I think it could quite easily run .md files through some standard CommonMark parser and then convert it to MDX internally if that's what it needs to do, but I don't quite see why users should be exposed to this implementation detail.

I was actually going to post back to facebook/docusaurus#3009 since it's pretty clear that this will need to change in Docusaurus (MDX keeps diverging from CommonMark – #1039).

@borekb

borekb commented Jul 1, 2020

@wooorm

> understanding which tags are used to switch between JSX or HTML. This is information that CommonMark does not have.

I know, I'm not saying that CommonMark should be doing that; instead, I'm imagining an MDX parser that works like this:

  1. Replace all JSX parts of the document with some safe placeholders
  2. Run any CommonMark-compliant parser on it, get HTML
  3. Replace placeholders back to the JSX parts and run a final transformation to JSX

In reality, step 2 will parse to an AST etc. but generally, such an approach feels pretty unambiguous to me. (Of course I'm aware of #1039 so this is just exploring whether in an alternative universe, MDX could possibly be a true superset of CommonMark. 😄)
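
The three steps above can be sketched as follows. Everything here is an assumption made for illustration: `compile`, `JSX_PATTERN`, and the placeholder format are invented, the JSX matcher is a naive regular expression that only recognizes capitalized elements without same-name nesting, and the CommonMark parser is passed in as a plain function rather than tied to a specific library. The final HTML-to-JSX transformation of step 3 is left out.

```typescript
// Any CommonMark-compliant Markdown-to-HTML function could be plugged in here.
type MarkdownToHtml = (markdown: string) => string;

// Naive matcher: a self-closing capitalized element, or a capitalized element
// up to the first closing tag with the same name.
const JSX_PATTERN = /<([A-Z]\w*)\b[^>]*\/>|<([A-Z]\w*)\b[^>]*>[\s\S]*?<\/\2>/g;

function compile(source: string, markdownToHtml: MarkdownToHtml): string {
  const saved: string[] = [];
  // Step 1: swap each JSX part for an alphanumeric placeholder Markdown leaves alone.
  const withPlaceholders = source.replace(JSX_PATTERN, (jsx) => {
    saved.push(jsx);
    return `JSXPLACEHOLDER${saved.length - 1}X`;
  });
  // Step 2: run the CommonMark parser on the JSX-free text.
  const html = markdownToHtml(withPlaceholders);
  // Step 3: put the original JSX parts back in place of the placeholders.
  return html.replace(/JSXPLACEHOLDER(\d+)X/g, (_, index: string) => saved[Number(index)]);
}

// Usage with a stand-in parser that only wraps its input in a paragraph.
const fakeParser: MarkdownToHtml = (md) => `<p>${md}</p>`;
console.log(compile('Hello <MySpan style={{color: "red"}}>world</MySpan>', fakeParser));
```

One consequence of this design is that a placeholder standing alone on a line is treated as a paragraph by the Markdown step, so block-level components would end up wrapped in `<p>` unless that case is handled separately.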

@slorber
Contributor Author

slorber commented Jul 1, 2020

@borekb we use a single parser for .md and .mdx because otherwise it would duplicate the integration work. Using a regular markdown parser, we wouldn't be able to let you style MD docs in the same way MDX allows us to "override" default HTML element implementations with custom React components.

So, as an end user, for MDX you could customize with React components + MDXProvider, and for MD, you'd have to use CSS + vanilla JS, for the same markdown elements. This also duplicates the work in userland, which is far from ideal for us. I'll explore using https://github.com/rexxars/react-markdown as it also has the ability to render markdown elements with custom renderers, so I could try to build a unified md interface, switching from MDX to CommonMark with a toggle. Will try to explore that (=> facebook/docusaurus#3018).

@wooorm
Member

wooorm commented Jul 1, 2020

> I'll explore using https://github.com/rexxars/react-markdown

Why not look at the stuff that react-markdown and mdx are using under the hood: unified? react-markdown hasn’t been updated in a while. Mapping HTML names to components is possible.

And I’d suggest using extensions: .md and .mdx.

@slorber
Contributor Author

slorber commented Jul 1, 2020

Thanks for your time, will look at this

@wooorm
Member

wooorm commented Nov 13, 2020

I’m going to close this because the main question: support both JSX and HTML in MDX, isn’t viable in my opinion, as discussed above.

For CM, see remark, which is used in MDX as well, and has recently gotten 100% CM compatibility.

For CM in MDX (minus the HTML), first we want MDX@2 out of the door, and then in a version later we’ll also land remark@13 here, which will improve CM compliance.

@wooorm wooorm closed this as completed Nov 13, 2020
@wooorm wooorm added 🙅 no/wontfix This is not (enough of) an issue for this project and removed 🙉 open/needs-info This needs some more info 🦋 type/enhancement This is great to have labels Nov 13, 2020
@slorber
Contributor Author

slorber commented Nov 13, 2020

thanks

yeah I'll figure things out on my side. What we want in the end is mostly uniform md/mdx rendering in terms of design, and I suppose current tools can allow me to do that.

@wooorm
Member

wooorm commented Nov 13, 2020

I think so yeah. And: I’d personally like to make the two more similar but that’ll take some iterations. Feel free to ask me more Qs tho!

@borekb

borekb commented Mar 27, 2021

I wasn't aware of mdsvex but that looks fantastic to me – the ability to freely mix Markdown with components, both block-level and inline, is something that I'd love to see in MDX.

https://mdsvex.pngwn.io/

@wooorm
Member

wooorm commented Mar 27, 2021

how does that relate to this issue?

3 participants