[Markdown] Decide how to represent live samples #3548

Closed
wbamberg opened this issue Mar 26, 2021 · 17 comments
Labels
MDN:Project Anything related to larger core projects on MDN

Comments

@wbamberg
Collaborator

MDN has a "live sample" system (not to be confused with interactive examples, or GitHub-hosted examples).

In this system we add the EmbedLiveSample macro to a page. This takes an id attribute value as an argument: this refers to the current page by default but may refer to a different page.

All the macro does is insert an <iframe> with particular attributes.

When Yari builds the page, when it finds these <iframe> elements, it looks in the current page (or in a different page, if that feature is used) for:

  • a <div> with the given ID, or
  • a heading element with the given ID.

If it finds either of these, it scoops up all code blocks in the identified <div>, or between the identified heading and the next heading, and uses them to build a page to populate the <iframe>.
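The heading half of this lookup can be sketched roughly as below. This is an illustration in Python, not Yari's code: the `Heading` and `Code` types, the flat `page` list, and the function name are all hypothetical, and the `<div>` case is left out. It follows the description literally, so the sample ends at the next heading of any level.

```python
from typing import NamedTuple

class Heading(NamedTuple):
    level: int
    id: str

class Code(NamedTuple):
    lang: str
    source: str

def collect_after_heading(nodes, target_id):
    """Return the code blocks between heading `target_id` and the next heading."""
    collected = []
    inside = False
    for node in nodes:
        if isinstance(node, Heading):
            if inside:
                break  # the next heading ends the live sample
            inside = node.id == target_id
        elif inside and isinstance(node, Code):
            collected.append(node)
    return collected

# Hypothetical page, flattened into the nodes that matter for the lookup.
page = [
    Heading(2, "examples"),
    Heading(3, "pink_paragraph"),
    Code("html", "<p>Here's a paragraph</p>"),
    Code("css", "p { background-color: pink; }"),
    Heading(3, "another_example"),
    Code("js", "console.log('not part of the first sample');"),
]

sample = collect_after_heading(page, "pink_paragraph")
```

Here `sample` holds the HTML and CSS blocks only; an ID that matches no heading yields an empty list.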

The problem

The problem is that Markdown doesn't support id attributes or <div> elements. So how can we identify the code blocks that we want to participate in a live sample?

Solutions

I can think of three options here. Maybe there are more!

Option 1: just use heading IDs

Although we can't represent IDs in Markdown, we will generate IDs for headings from their text content. We'll do this anyway, so they are linkable.

So if we drop support for identifying live sample code blocks using a <div>, we can just use these heading IDs.

In reference pages at least, I think it is OK - even desirable - to drop support for using <div> to collect live sample code blocks. We talked about this a lot in stumptown/content linting days (for example there's a whole bunch about it here: mdn/stumptown-content#350), and the feeling was that it's better for users to present examples consistently, so it's very obvious which code blocks contribute to an example output. When we linted the JS docs we applied this rule, so now every example lives in its own named H3 section under H2#Examples (e.g. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map#examples). Although these aren't live samples, the same arguments should apply.

I expect the situation is a bit different outside reference docs. It would be worth doing some analysis to see how often <div> is used for this purpose, and how much work it would be to fix.

One disadvantage of relying on heading IDs is: when the IDs change, the macro call breaks. This is especially a problem for localization: people translate the page, including headings, the IDs change, and the example breaks. Note that this is not a new problem, it's one that exists in the current mechanism too.

(Note also that this is a version of the problem in #3483 (comment) (although that one is worse than this, because in this version at least translators can update the macro argument). In general, any time when we want the machines to derive meaning from something that also has meaning for humans, we'll see a version of this problem.)
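To make the localization failure concrete, here is a small Python sketch. The slug rule in `heading_id` is an assumption for illustration only (the issue does not say how IDs will be generated), and the translated headings are made up.

```python
import re

def heading_id(text):
    # Hypothetical rule, NOT Yari's actual algorithm: lowercase the heading
    # text and collapse every run of non-word characters into "_".
    return re.sub(r"\W+", "_", text.strip().lower()).strip("_")

def macro_resolves(macro_arg, heading_texts):
    """True if some heading on the page generates the ID the macro asks for."""
    return macro_arg in {heading_id(text) for text in heading_texts}

english = ["Examples", "Making everything better"]
french = ["Exemples", "Tout améliorer"]  # translated headings, macro call unchanged

# The macro argument was written against the English heading.
macro_arg = heading_id("Making everything better")

resolves_in_english = macro_resolves(macro_arg, english)
resolves_in_french = macro_resolves(macro_arg, french)
```

The macro resolves on the English page but not on the translated one, unless the translator also updates the macro argument.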

Option 2: use a class identifier on the code blocks

In this option, code blocks get a word in their info string that identifies the example to which they belong. When Yari processes a live sample iframe, it collects all code blocks that contain the word:

```html live-sample-1
<p>Here's a paragraph</p>
```

```css live-sample-1
p {
    background-color: pink;
}
```

{{EmbedLiveSample("live-sample-1")}}

This is a neat solution, and gives authors a lot of flexibility.
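A rough sketch of how the collection step for option 2 could look, in Python for brevity. The regular expression, the function name, and the rule that the first info-string word is the language are assumptions, not an agreed design; a real implementation would more likely walk the Markdown parser's syntax tree than use a regex.

```python
import re

# Matches a backtick-fenced block: opening fence plus info string, body, closing fence.
FENCE = re.compile(
    r"^`{3}(?P<info>[^\n]*)\n(?P<body>.*?)^`{3}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def blocks_for_sample(markdown, sample_name):
    """Return (language, source) pairs for fences tagged with `sample_name`."""
    found = []
    for match in FENCE.finditer(markdown):
        words = match.group("info").split()
        # Assumption: first word is the language, the rest are sample names.
        if len(words) > 1 and sample_name in words[1:]:
            found.append((words[0], match.group("body")))
    return found

TICKS = "`" * 3  # built at runtime so this example contains no nested fences
doc = "\n".join([
    TICKS + "html live-sample-1",
    "<p>Here's a paragraph</p>",
    TICKS,
    "",
    TICKS + "css live-sample-1",
    "p {",
    "    background-color: pink;",
    "}",
    TICKS,
    "",
    TICKS + "js other-sample",
    'console.log("unrelated");',
    TICKS,
])

languages = [lang for lang, _ in blocks_for_sample(doc, "live-sample-1")]
```

Only the HTML and CSS blocks are picked up for `live-sample-1`; the block tagged `other-sample` is ignored.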

One problem with this option is that it tends to treat an example as consisting only of the code. But for structuring content we would like to consider an example as consisting not only of the code blocks, but also of the title and explanatory prose: the whole thing should be treated as a single unit. For example, if we ever wanted to maintain examples as separate reusable units, that could be incorporated by different pages (or even inside separate contexts entirely, like tools), we would like to include all those parts, not just the code. The structure we defined (and linted for) in the JS docs is meant to encourage that view of examples.

Maybe this objection is too abstract though :).

Option 3: use headings implicitly

(Even though this option doesn't really work, I'll include it anyway in case we can make something like this work.)

This is like option 1, except with no reliance on header IDs, and one extra constraint: the macro call has to live under the same heading as the code blocks. Then we just implicitly scoop up all the code blocks at the same level as the macro call. Unfortunately this doesn't really work, because we usually want a structure like:

## Examples

### Making everything better

In this example we'll make everything better.

#### HTML

// HTML code block

#### CSS

// CSS code block

#### Output

// macro call

...and of course in this structure there are no code blocks under the same heading as the macro call. (We could make something like this work if we were prepared to commit to this structure to the extent that Yari generated these H4 headings, and this would have benefits for consistency. But I'm not sure we want to go that far at this point.)

@wbamberg wbamberg added the needs triage Triage needed by staff and/or partners. Automatically applied when an issue is opened. label Mar 26, 2021
@hamishwillee
Collaborator

> Markdown doesn't support id attributes or <div> elements. So how can we identify the code blocks that we want to participate in a live sample?

Many markdown systems support some subset of HTML in markdown files (including github). So we can choose to support these "if we want".

@wbamberg
Collaborator Author

wbamberg commented Mar 29, 2021

> > Markdown doesn't support id attributes or <div> elements. So how can we identify the code blocks that we want to participate in a live sample?
>
> Many markdown systems support some subset of HTML in markdown files (including github). So we can choose to support these "if we want".

Yes, we can, if we want. The baseline for MDN is going to be GitHub Flavored Markdown. This doesn't support a number of things we do in MDN (except insofar as any Markdown system supports raw HTML). So I'm filing issues for all these things, in which we are essentially choosing one of four options:

  1. support the feature by using raw HTML (for example, supporting note callouts using <div class="note">)
  2. support the feature by extending the Markdown syntax (for example, supporting note callouts using blockquotes with magic words)
  3. stop having to support the feature by changing (generally simplifying) our docs
  4. generate the content using a macro: but this option is often not applicable and/or is a lot of work.

We want, I think, to avoid 1 and 2 as much as we can, because it makes the authoring experience more complicated, and the main reason for migrating to Markdown is to simplify things for authors. But I'm sure there will be cases where we really do need the feature, or even just where changing our docs is too expensive.

@hamishwillee
Collaborator

Ah, thanks for the clarification. I agree that generally you want to avoid HTML and reduce markdown extensions if possible. I also have no idea of the best way to solve this problem.

So perhaps just some observations on above comments.

  • Yes titles can change. There is a lot to be said for making at least some of them static, and also to be able to scatter the occasional anchor point away from a heading. There is no markdown compatible way to do that, but I've used this syntax on gitbook to set the heading anchor.
    ## Heading text {#forced-anchor}
    
  • It is also convenient to be able to set anchors in places outside of headings "on occasion". It is not too complicated for users to use <a id="my-anchor"></a>, because it should be rare-ish.

@escattone
Contributor

Just wanted to provide some brief comments after long conversations with @wbamberg and @Gregoor about this topic.

  • I think option 1 above should be ruled out because of the localization problem. It should be a general rule I think that localizers never touch macro calls.
  • Both options 2 and 3 are viable (from a development perspective). Option 2 is more challenging in the one-time conversion phase, but easier in the many-time build phase (MD-->HTML), while option 3 is the reverse of that, easier in the one-time conversion phase, and a bit more involved (but not difficult) in the many-time build phase.
  • My vote doesn't count as much as the content folks, but I like option 3 as a starting point. It encourages/depends-on structure and avoids the need for any identifier at all, in either the code blocks or the macro call. Just embed a {{EmbedLiveSample()}} call wherever you'd like the live-sample result to live, and we'll search upwards from there to collect the HTML, CSS, and JS code blocks that form the live sample.
  • We could also support option 2 later if desired/needed.

@wbamberg
Collaborator Author

wbamberg commented May 6, 2021

Thanks for the comment @escattone !

One thing to say here is that this is not much of a problem for the JS docs, which only have three live samples, and which are our immediate goal. But it is going to be a problem for our docs as a whole, and especially CSS, Learn, and Web APIs.

For reference pages, which constitute about 80-90% of the total pages in MDN, I think we could (and should) go with a quite restrictive rule here, like:

  • reference pages have an "Examples" H2
  • this is split into one or more H3 sections
  • each of those sections may contain at most one {{EmbedLiveSample}} call
  • this call will contain all code samples in that H3 section

This is pretty close to the rule @Elchi3 and @ddbeck and I came up with when we were linting pages.
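The restrictive rule could be checked mechanically along these lines. This is a sketch under assumptions, not an existing linter: the function name and messages are invented, headings are detected with a simple regex (so a `##` line inside a code fence would be misread as a heading), and repeated H3 titles are counted together.

```python
import re

HEADING = re.compile(r"^(#{2,6})\s+(.*\S)\s*$")
MACRO = "{{EmbedLiveSample"

def lint_examples(markdown):
    """Report a missing "Examples" H2 and H3 sections with more than one macro call."""
    calls = {}            # H3 title -> number of macro calls, in page order
    in_examples = False
    seen_examples = False
    current_h3 = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            if level == 2:
                in_examples = title == "Examples"
                seen_examples = seen_examples or in_examples
                current_h3 = None
            elif level == 3:
                current_h3 = title if in_examples else None
                if current_h3 is not None:
                    calls.setdefault(current_h3, 0)
            # H4-H6 headings stay inside the current H3 section.
        elif current_h3 is not None:
            calls[current_h3] += line.count(MACRO)
    problems = []
    if not seen_examples:
        problems.append("missing 'Examples' H2")
    for title, count in calls.items():
        if count > 1:
            problems.append(f"'{title}' has {count} EmbedLiveSample calls")
    return problems

page = "\n".join([
    "## Examples",
    "### Good example",
    "#### Result",
    "{{EmbedLiveSample}}",
    "### Crowded example",
    "{{EmbedLiveSample}}",
    "Some prose.",
    "{{EmbedLiveSample}}",
])

problems = lint_examples(page)
```

For this page, only the "Crowded example" section is reported.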

A looser rule that might be easier for you to implement could be:

  • consider that headings break up a document into sections, that can be nested inside each other
  • if you see a live sample call, get all the code blocks in the same "heading section" as the call itself
  • if there aren't any code blocks in the live sample call's own "heading section", go to the next level up and look for code samples
  • stop when you find code samples, or reach the top level of the document

So for example:

## Examples

### An example

// JS code block for EmbedLiveSample1

{{EmbedLiveSample1}}

### Another example

#### JavaScript

// JS code block for EmbedLiveSample2

#### CSS

// CSS code block for EmbedLiveSample2

#### Result

{{EmbedLiveSample2}}

In the first example, the code blocks are found in the same heading section as the macro call, and in the second, they are found in a common parent section.

Although I think this would work for reference pages, it will probably break down in other pages, where people do all kinds of things with live samples. I think we need to do some real digging there to work out what's best. I expect a lot of the time "what's best" will be to stop using the live sample system, but I wouldn't really want to prejudge it.

So what I think would be best would be to implement one of these rules for JS, whichever is easier really, and when we look at other areas of MDN in later phases, see if we need to adjust. Does that sound OK?

@ddbeck
Contributor

ddbeck commented May 11, 2021

> {{EmbedLiveSample2}}
>
> So what I think would be best would be to implement one of these rules for JS, whichever is easier really, and when we look at other areas of MDN in later phases, see if we need to adjust. Does that sound OK?

I'm a bit spooked by these together. Are you proposing a sort of dynamic set of macro calls, as a bridge between options 2 and 3? Because we might be better served by separating concerns a bit. Consider option 2+3, where we use the explicit code block identifiers to link macro calls to code and facilitate conversion, but separately, lint the content to enforce the page structure. This might even be friendlier to authors, as it would give us more handles for fixing malformed structures (e.g., autofixing improper heading depths) than relying on structure alone would.

@wbamberg
Collaborator Author

wbamberg commented May 11, 2021

> > {{EmbedLiveSample2}}
> >
> > So what I think would be best would be to implement one of these rules for JS, whichever is easier really, and when we look at other areas of MDN in later phases, see if we need to adjust. Does that sound OK?
>
> I'm a bit spooked by these together. Are you proposing a sort of dynamic set of macro calls, as a bridge between options 2 and 3?

Option 2 doesn't exist at the moment, so I'm not sure why we would build a bridge from it.

At the moment the ID passed into live samples is mapped to a heading ID or a div ID. That won't work in Markdown-land, but we need to keep supporting it while there is still content in HTML.

So we would support this new way and the old way in parallel, so for example:

  • if EmbedLiveSample is given an argument, then assume we are in the old world, and use it to find headings/div elements with that ID.
  • if EmbedLiveSample is not given an argument, assume we are in the new world, and use the "option 3" rule to find code blocks.

Then as we prepare content for conversion, make sure the live samples are compatible with option 3, and when we convert to Markdown, remove the ID arguments to EmbedLiveSample.
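The "argument means old world, no argument means new world" switch might look something like this. It is a sketch only: `lookup_mode` and its return values are hypothetical, and the regex is a simplification that ignores any arguments after the first and would misparse an argument containing `)`.

```python
import re

# Simplified pattern for an EmbedLiveSample call with optional parentheses.
MACRO_CALL = re.compile(
    r"\{\{\s*EmbedLiveSample\s*(?:\(\s*(?P<args>[^)]*?)\s*\))?\s*\}\}"
)

def lookup_mode(macro_call):
    """Decide how to find the code blocks for one macro call."""
    match = MACRO_CALL.fullmatch(macro_call.strip())
    if match is None:
        raise ValueError("not an EmbedLiveSample call: " + macro_call)
    args = match.group("args") or ""
    first = args.split(",")[0].strip().strip("'\"")
    if first:
        # Old world: the ID names a heading or <div> on the page.
        return ("by-id", first)
    # New world: no ID, so use the page structure (option 3).
    return ("by-structure", None)

old_world = lookup_mode('{{EmbedLiveSample("live-sample-1")}}')
new_world = lookup_mode("{{EmbedLiveSample}}")
```

With this split, removing the ID argument during conversion is what moves a call from the old lookup to the new one.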

I do think linting would help to identify whether live samples are compatible with option 3, which is potentially a big task (we have ~2300 EmbedLiveSample calls in our docs).

Does that make sense?

@ddbeck
Contributor

ddbeck commented May 11, 2021

@wbamberg In #3548 (comment) you've introduced numbered macro calls: {{EmbedLiveSample1}}, {{EmbedLiveSample2}}. I interpreted this as being somewhat like option 2 (macro calls that are specific to each example), but with some more magic (using numbers and the page outline to select relevant code blocks, instead of explicit names). I took this to be a new proposal (option 4, say). And by "bridge" I meant combination or something like that—I wasn't suggesting it already existed.

To try to explain my suggestion better: after seeing your example, I thought it would be nice to have the explicit connection between source and macro call at the point of the macro call, while also enforcing structure. But the more explicit macro calls and the page structure enforcement don't have to be the same thing, necessarily. That's why I thought 2 (macro) + 3 (structure enforcement) would be nice: you could convert macro calls without changing the page structure, which might involve some serious content changes.

OK, now something new:

> this new way and the old way in parallel

I understand we have to support two ways in parallel, but I'd really like it if we couldn't have both the old and new ways within the same document. I don't know if it's possible or plausible, but I'd love to see new world depend on the .md extension.

@Ryuno-Ki
Collaborator

> I expect the situation is a bit different outside reference docs. It would be worth doing some analysis to see how often <div> is used for this purpose, and how much work it would be to fix.

I can partly shed light on this. These are the numbers of EmbedLiveSample calls

  • tools: 3
  • glossary: 9
  • mdn: 20
  • mozilla: 1
  • web/guide: 9
  • web/events: 1
  • web/javascript: 3
  • web/accessibility: 11
  • web/api: 595
  • web/svg: 293
  • web/progressive_web_apps: 1
  • web/web_components: 1
  • web/html: 361
  • web/css: 859
  • learn: 183
  • games: 4

@wbamberg
Collaborator Author

> @wbamberg In #3548 (comment) you've introduced numbered macro calls: {{EmbedLiveSample1}}, {{EmbedLiveSample2}}. I interpreted this as being somewhat like option 2 (macro calls that are specific to each example), but with some more magic (using numbers and the page outline to select relevant code blocks, instead of explicit names). I took this to be a new proposal (option 4, say). And by "bridge" I meant combination or something like that—I wasn't suggesting it already existed.

That wasn't my intention, and I'm sorry to be so misleading! The pre block was supposed to be showing only option 3, so really they are just both {{EmbedLiveSample}} calls. It's supposed to be showing the two main ways that option 3 can work out: either the code block(s) are under the same immediate heading as the macro call, or they share a parent heading.

> To try to explain my suggestion better: after seeing your example, I thought it would be nice to have the explicit connection between source and macro call at the point of the macro call, while also enforcing structure. But the more explicit macro calls and the page structure enforcement don't have to be the same thing, necessarily. That's why I thought 2 (macro) + 3 (structure enforcement) would be nice: you could convert macro calls without changing the page structure, which might involve some serious content changes.

So the idea is:

  • in phase 1, just convert live samples using option 2, and not worry about structure
  • simultaneously, or after, or whatever, start linting pages for code sample structure, using something like option 3
  • fix up pages so the linter passes, independent of moving to Markdown
    ?

We could do this. I don't know. I think I like the idea of fixing up the code samples as we go, and it seems simpler (fewer projects/processes). At the end of the proposed process, once the structure is fixed, what is the actual spec for live samples? Do people still have to include the code block IDs, even though they're not really needed because the code structure is enough?

But this choice seems to depend partly on how much work it is to fix up the content structure. As I said upthread, we don't yet have a very clear idea of how much we need to change content for live samples, and we won't until we spend the time analysing that for areas outside JS. My priority right now (this quarter) is JS, so I would like us to make what seems like a reasonable choice now that will work for JS, and do a proper analysis of the other areas when we need to. If that means switching our approach, well, there are only 3 live samples in JS, so we haven't lost much.

> OK, now something new:
>
> > this new way and the old way in parallel
>
> I understand we have to support two ways in parallel, but I'd really like it if we couldn't have both the old and new ways within the same document. I don't know if it's possible or plausible, but I'd love to see new world depend on the .md extension.

I think by the time the Yari code that processes EmbedLiveSample gets to run, it's already looking at HTML in either case. So I'm not sure how practical that is.

@ddbeck
Contributor

ddbeck commented May 14, 2021

Thanks for the clarifications, @wbamberg!

Based on this and our separate discussion the other day, I agree that it makes sense to go with option 3, where the structure drives code block selection. As you put it, in practice, a live sample is more than just its constituent code blocks, so it makes sense that the definition of a live sample should also include the content associated with it. 👍

@wbamberg
Collaborator Author

wbamberg commented Jun 3, 2021

I want to write this up in the Markdown spec, so I've had a go at drafting it here. Does this make sense? Do we still like it?

(I've also been looking through our docs to see where we use div IDs, and it doesn't look too bad. I think places where we already use heading IDs will very likely already be compatible with this approach.)


In MDN, writers can create "live samples", in which code blocks (HTML, CSS, JavaScript) in the page are built into a document that's displayed in the page in an iframe. In this way authors can show code and its output together. Because the displayed code is the code used to generate the output, there's no risk of the code and the output getting out of sync.

To create a live sample, writers use the {{EmbedLiveSample}} KumaScript macro.

To find the code blocks that should belong to a live sample, we consider that page headings (H2-H6) implicitly divide up the document into nested sections. So, for example:

                H2#Syntax   H3#Params   H3#Return   H2#Examples   H3#Example1
H2 Syntax           |
                    |
Some content        |
                    |
H3 Params           |           |
                    |           |
More content        |           |
                    |           |
H3 Return           |                       |
                    |                       |
Yet more content    |                       |
                    |                       |
H2 Examples                                           |
                                                      |
H3 Example1                                           |            |
                                                      |            |
An example                                            |            |

In this example there are five sections, one for each heading, and the sections defined by the H3 headings are nested inside the section defined by the H2 headings above them.

Given this model, the rule for finding the code blocks to include in a live sample is:

  1. Define the "immediate section" as the most-nested section that directly contains the macro call itself.
  2. Look for any code blocks in the immediate section. If you find any, they are the code blocks to use. Stop looking.
  3. If you don't find any, go to the next level up, and look for any code blocks in that section. If you find any, they are the code blocks to use. Stop looking.
  4. If you don't find any, repeat (3) until you find code blocks or until you reach the top-level of the document.
  5. If you reached the top level and didn't find any code blocks, this is an error.
  6. If you did find code blocks, build the document and insert the iframe where the macro call was found.

For example, in the case above, if the macro call was in H3#Example1, we would look in: H3#Example1, then (if no code blocks were found there) in H2#Examples.
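As a sanity check on the rule, here is a minimal Python sketch of it. The `Heading`, `Code` and `Macro` types and the flat node list are hypothetical stand-ins for whatever Yari actually sees. It makes two assumptions the draft leaves open: a section includes everything nested inside it, and the top level of the document itself is not searched.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Heading:
    level: int
    title: str

@dataclass(frozen=True)
class Code:
    name: str

@dataclass(frozen=True)
class Macro:
    pass

def section_end(nodes, start):
    """Index just past the section opened by the heading at `start`."""
    level = nodes[start].level
    for i in range(start + 1, len(nodes)):
        if isinstance(nodes[i], Heading) and nodes[i].level <= level:
            return i
    return len(nodes)

def code_blocks_for(nodes, macro_index):
    # Walk backwards from the macro. Every heading whose section still
    # contains the macro is an enclosing section, innermost first.
    for start in range(macro_index - 1, -1, -1):
        if not isinstance(nodes[start], Heading):
            continue
        end = section_end(nodes, start)
        if end <= macro_index:
            continue  # a sibling section that closed before the macro
        found = [n.name for n in nodes[start:end] if isinstance(n, Code)]
        if found:
            return found  # steps 2 and 3: stop at the first section with code
    # Step 5: reached the top level without finding any code blocks.
    raise LookupError("no code blocks found for this EmbedLiveSample call")

page = [
    Heading(2, "Examples"),
    Heading(3, "Example1"),
    Heading(4, "JS"), Code("JS-code-block-1"),
    Heading(4, "CSS"), Code("CSS-code-block-1"),
    Heading(4, "Result"), Macro(),
    Heading(3, "Example2"),
    Heading(4, "Result"), Macro(),
    Heading(4, "JS"), Code("JS-code-block-2"),
]

first = code_blocks_for(page, 7)    # macro under Example1 > Result
second = code_blocks_for(page, 10)  # macro placed before its code block
```

Both calls find nothing in their own H4 section and fall back to the enclosing H3. The second one shows that code blocks appearing after the macro call are still collected under this reading of the rule.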

Examples

The simplest example is where the code blocks and the macro call all live in the same immediate section. For example:

H2 Examples

  H3 Example1

  JS-code-block-1
  CSS-code-block-1
  {{EmbedLiveSample}}

  H3 Example2

  JS-code-block-2
  CSS-code-block-2
  {{EmbedLiveSample}}

In this page we have 2 live samples:

  • the one under H3#Example1 contains JS-code-block-1 and CSS-code-block-1
  • the one under H3#Example2 contains JS-code-block-2 and CSS-code-block-2

Another common pattern is where the macro call wants to use code blocks in a sibling section:

H2 Examples

  H3 Example1

    H4 JS

    JS-code-block-1

    H4 CSS

    CSS-code-block-1

    H4 Result

    {{EmbedLiveSample}}

  H3 Example2

    H4 JS

    JS-code-block-2

    H4 CSS

    CSS-code-block-2

    H4 Result

    {{EmbedLiveSample}}

This also produces 2 live samples:

  • the one under H3#Example1>H4#Result contains JS-code-block-1 and CSS-code-block-1
  • the one under H3#Example2>H4#Result contains JS-code-block-2 and CSS-code-block-2

This model also means that certain arrangements are not possible. For example:

H2 Examples

JS-code-block-1
CSS-code-block-1
{{EmbedLiveSample}}

  H3 Example2

  JS-code-block-2
  CSS-code-block-2
  {{EmbedLiveSample}}

This will give unexpected results, because the {{EmbedLiveSample}} call directly under H2#Examples will also pick up the code blocks under H3#Example2, which is nested inside the H2#Examples section.

@Gregoor
Contributor

Gregoor commented Jun 4, 2021

This comes just-in-time as I started implementation on it today!

I was wondering if we would want to allow the macro to come before the code-blocks. I.e. should this work?

H2 Examples

  H3 Example1

    H4 Result

    {{EmbedLiveSample}}

    H4 JS

    JS-code-block-1

    H4 CSS

    CSS-code-block-1

I would assume yes, because it is still in the same "section", but just want to make sure.

@wbamberg
Collaborator Author

wbamberg commented Jun 4, 2021

In the spec as I've written it, yes, this will work. It might be worth calling out in an example.

I don't expect this is a very common use case. Choosing what to do here seems to be mostly about getting the least surprising experience for authors (and secondarily the easiest thing to implement). Allowing code blocks after the macro seems to make for a simpler specification.

@Rumyra Rumyra removed the needs triage Triage needed by staff and/or partners. Automatically applied when an issue is opened. label Jun 7, 2021
@ddbeck
Contributor

ddbeck commented Jun 7, 2021

> Does this make sense? Do we still like it?

Yes, I think so. Nicely done. But when there's a PR, I would like to have a go at suggesting some better ASCII art. 😄

@wbamberg
Collaborator Author

wbamberg commented Jun 7, 2021

Thanks! Filed as #5773. I'm looking forward to some mind-blowing ASCII art.

@Rumyra Rumyra added the MDN:Project Anything related to larger core projects on MDN label Jun 8, 2021
@sideshowbarker
Collaborator

I propose we move this to the Discussions tracker.

@wbamberg wbamberg closed this as completed Jun 8, 2021
@mdn mdn locked and limited conversation to collaborators Jun 8, 2021

8 participants