Add attribute support on fenced code blocks #75

wiltsecarpenter · 2023-10-26T19:15:53Z

This PR changes our fenced code block syntax to support attributes in { } sections, like

 ```js {run:false}
 const x = 2;
 \```

The markdown syntax for attributes includes singleton attributes, like show, key/value attributes like show: TRUE, quoted attributes like tip="Click here", class names like .toc-level1, and ids like #id. In this PR, we are parsing these different forms, but only the "show" and "no-run" attributes do anything.

An approximate grammar for the fenced code block info string is as follows:

    info: empty | language_id | attributes | language_id ows attributes
    language_id: [\w]+
    attributes: '{' attribute_list '}'
    attribute_list: attribute | attribute_list ',' attribute
    attribute: key | key ':' value | class_name | id_name
    key: ows identifier ows
    value: ows value_content ows
    value_content: quoted_string | identifier | true | false | "true" | "false"
    class_name: '.' identifier
    id_name: '#' identifier
    identifier: [a-zA-Z] [a-zA-Z0-9-_.]*
    quoted_string: '"' ([^"] | \")* '"' | "'" ([^'] | \')* "'"
    ows: [ \t]*

Notes from design review

Decisions: Going for a more JavaScript-y / json-esque syntax for attributes

Language outside of curly braces, after 3 backticks (keeps Prettier compatibility), e.g. ```js
Attributes are in curly braces, e.g. {echo: false}
Use : (instead of =) to define attributes (e.g. echo: false, not echo=false)
Separate attributes by comma (e.g. {echo: false, eval: false})
We need to formalize attribute handling (and do a bunch of tests), write down representation of grammar
Because of expected use cases, stick with echo: false as default
Undecided: What options do we really want to support? Now & in the future?

So, all together, our current decisions for code chunk and attribute styling stand at:

// awesome JavaScript code here

References

pandoc: fenced_code_attributes
r markdown: https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf
python markdown: https://python-markdown.github.io/extensions/fenced_code_blocks/#:~:text=Fenced%20Code%20Blocks%20are%20defined,on%20a%20line%20by%20themselves.
vitepress: https://vitepress.dev/guide/markdown

…e/attrs

mootari · 2023-10-28T10:06:33Z

Do we need to get to a decision in #28 before moving on here?

It seems to me that

```js {no-run}

would be equivalent to

```js

once we switch to only parsing

```{js}

wiltsecarpenter · 2023-10-30T15:09:58Z

Yes, in this pr js and {js} or ```js {attributes...} are equivalent. I believe that's consistent with pandoc and others, but we could make further restrictions if we want.

…e/attrs

mootari · 2023-10-30T17:33:03Z

To add to the above, the CommonMark spec calls it the "info string" and describes it as follows:

The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.

The only restriction is:

The info string may not contain any backtick characters. (The reason for this restriction is that otherwise some inline code would be incorrectly interpreted as the beginning of a fenced code block.)

wiltsecarpenter · 2023-10-30T22:23:51Z

I mis-stated the response before to #75 (comment),

With this change,

```js {no-run}

is equivalent to

```{js no-run}

meaning that the language can be before the { } attributes, or it can be the first attribute. This is intended to follow the syntax of pandoc, although it is not quite the same since pandoc allows the language to be specified as a class, like

```{.js}

which I didn't add here, but we could later if we wanted to.

mootari · 2023-10-30T22:37:22Z

Will we still be able to make

```js

behave like

```{js no-run}

at a later point, so that authors don't get drowned in errors when they add a chunk of text with static fenced code blocks to their notebooks?

mbostock · 2023-10-30T23:15:16Z

Let’s get together and discuss this in realtime. Everyone will have opinions on this!

If we want to follow Quarto, this is the same as in vanilla Markdown:

```js
1 + 2 // this code echoes, but doesn’t execute
```

You add curly braces if you want the code to run (“execute”), the code shows (“echoes”) by default, and also implicitly outputs in the case of an expression:

```{js}
1 + 2 // this code echoes, executes, and outputs a three
```

If you don’t want to echo the code, you can either use a top-level frontmatter:

---
execute:
  echo: false
---

Or you can use a comment option:

```{js}
//| echo: false
1 + 2 // this code runs and displays a three, but doesn’t show the code
```

I’m not entirely clear whether output: false means to disable only the implicit display, or if it disables explicit displays, too… that’s also something we should figure out.

I don’t personally like the comment options (slashpipe?) — they seem quite verbose — and it sounds like @allisonhorst doesn’t either. I do like the top-level YAML frontmatter for changing the behavior for the entire ~~notebook~~ page, though I wouldn’t try to follow Quarto 100% (e.g., I don’t know if we need to scope the echo option under execute, or if it’s fine for it to live at the top level like the title).

So, maybe we want to deviate from Quarto in favor of the R Markdown style, where the options are part of the info string? In that case, the echo = FALSE syntax that R Markdown uses feels like an R-ism; it’s how you would pass a named argument in R (or Python). The syntax used by Quarto feels more like YAML, and more similar to JavaScript for a named option. Maybe that means something like this?

```{js echo: false}
1 + 2 // this code runs and displays a three, but doesn’t show the code
```

Or perhaps:

```{js} echo: false
1 + 2 // this code runs and displays a three, but doesn’t show the code
```

But then I’m not sure how we might also mix-in arbitrary CSS classes, ids, or HTML attributes — I’m also not sure if we even need that capability, and feel the execution options are more pressing, but of course it would be nice to have a syntax that could potentially support both if we need it. I guess another option would be CSS selector syntax:

```{js} [echo=false] .toc #example
1 + 2 // this code runs and displays a three, but doesn’t show the code
```

And then we’d run a CSS selector parser on it, like parsel — or just extend the parser you’ve already written slightly to use brackets for attributes?

Quarto also supports double curlies for unexecuted blocks but I’m not sure if we need that. (That seems the same as no curlies, except that the triple backticks are also shown in the output?)

wiltsecarpenter · 2023-10-31T00:22:20Z

I added a few references to the PR comments to other implementations. This PR is based on the pandoc syntax (mostly), which is used in a number of places, sometimes with comma separators. I don't see any other places that use Json-style syntax though.

allisonhorst · 2023-10-31T01:21:56Z

A bit more context from Quarto docs re: comment options #| and //|: "Quarto uses this approach to both better accommodate longer options like fig-cap, fig-subcap, and fig-alt as well as to make it straightforward to edit chunk options within more structured editors that don’t have an easy way to edit chunk metadata (e.g. most traditional notebook UIs)."

I think both are useful considerations.

mbostock · 2023-10-31T01:31:09Z

Thanks @allisonhorst, that’s helpful context! If I understand correctly, those motivations don’t apply to us: we use Plot options for figure captions, titles, and alt text; and we don’t have a cell-based editor, but instead now use a file-based editor for code blocks.

Looking forward to discussing in real-time.

allisonhorst · 2023-11-03T19:35:22Z

Notes from 2023-11-03 Attributes design discussion

…e/attrs

mbostock · 2023-11-16T02:01:40Z

I added more tests, in part so I can understand the logic here. My current thinking is:

Remove support for specifying classes (not currently needed)
Remove support for specifying ids (not currently needed)
Think more about whether our options should mirror Quarto (execute, display, echo)
Think more about unquoted attribute values
Think more about degenerate cases

In particular, I think the current interpretation of these inputs are probably confusing:

{string: "true"}
{date: 2021/01/01}
{a b, c}

I’d like to lean on precedent here, but I haven’t decided yet if the attributes should behave more like YAML (but comma-separated?), HTML, or JavaScript/JSON. It feels a little none of the above right now, and which increases the burden on us to document, explain, and test. I’ll think more before deciding.

mbostock · 2023-11-16T02:10:14Z

I also don’t know yet if we want to support arbitrary attributes, or if we should instead parse for known attributes — for example knowing that the show attribute must be a boolean, and warning if you give it an invalid value. And for that matter, whether we want to issue warnings or errors for invalid inputs, as I think folks may be confused why the attributes aren’t having an effect if we silently ignore malformed input.

wiltsecarpenter · 2023-11-16T17:54:22Z

As you know, my initial assumption was that we would want to follow industry precedents for markdown syntax such as what is supported by pandoc and markdown-it and that this would be easiest for people to learn about and potentially fit in with existing editor extensions. Those other systems allow attributes to be specified on heading tags, lists, and inline code as well as fenced code blocks. The design of the code here is to separate the syntax parsing of attributes from their semantic application with the further assumption that there would be follow-on PRs that would add more options over time. It sounds though like you have some concerns that there could be some kind of un-intended consequences.

mbostock · 2023-11-16T20:45:18Z

I’ve rewritten this so that it now parses attributes using HTML-style syntax, borrowing code from Hypertext Literal to ensure that this matches the behavior of the HTML5 specification (as much as possible — obviously we’re not parsing HTML tags here).

The syntax looks like this:

```tag name1 name2=value
code
```

You can also use quoted values if desired, either single or double quotes. For example:

```tag name="value"
code
```

I agree there is a strong desire for precedent. But there are many, many subtleties when writing a parser, and while the last proposal was similar to Pandoc’s fenced code block options, it had many quirks that we would have had to document and test.

I don’t want to invent a new DSL here. And while I considered parsing the info string as JavaScript or JSON, that would be quite a bit more verbose and strict than we intend. So I’ve chosen to adopt HTML attribute syntax which supports shorthand attributes (with no values), doesn’t require commas between attributes, and allows quoted values if needed. This gives us support for arbitrary options, even though we only need show, run, and display (or equivalents) to start.

As for exposing the ability to set arbitrary CSS classes, identifier, etc., we can do that by saying:

```tag class=whatever id=foo
code
```

we could also support data attributes this way, if desired. But I want to be careful about arbitrary passthrough of attributes, since the special ones we’ve added (show, run, and display) don’t map to HTML attributes. So it’s better to think of this as a fixed set of attributes that we expose than a way to set arbitrary attributes to the generated HTML. If you really need the latter, you want an inline expression ${…} and write the tags yourself.

wiltsecarpenter added 2 commits October 26, 2023 12:10

Add { } attribute support on fenced code blocks

9a6f2a0

Merge branch 'main' of https://github.com/observablehq/cli into wilts…

2dafe82

…e/attrs

wiltsecarpenter linked an issue Oct 30, 2023 that may be closed by this pull request

Finalize syntax (e.g., ```js {show}) to denote live code blocks #28

Closed

Merge branch 'main' of https://github.com/observablehq/cli into wilts…

c9d7448

…e/attrs

wiltsecarpenter added 2 commits October 30, 2023 14:58

Add some tests and re-arrange

25a9a9e

Missed one conversion spot

3c797a3

wiltsecarpenter marked this pull request as ready for review October 30, 2023 22:23

wiltsecarpenter requested review from mbostock and cinxmo October 30, 2023 22:24

wiltsecarpenter added 4 commits November 7, 2023 09:37

Merge branch 'main' of https://github.com/observablehq/cli into wilts…

a2eb647

…e/attrs

Update to match new spec

24fe88a

Update comment

8b15610

Merge branch 'main' of https://github.com/observablehq/cli into wilts…

39af3e2

…e/attrs

wiltsecarpenter force-pushed the wiltse/attrs branch from 1984a4b to 39af3e2 Compare November 8, 2023 00:19

wiltsecarpenter and others added 3 commits November 7, 2023 16:20

Move comment

1e1ae28

Merge branch 'main' into wiltse/attrs

ac090be

lots more tests

caad30f

mbostock self-assigned this Nov 16, 2023

HTML-style info string

4b0e15e

mbostock changed the title ~~Add { } attribute support on fenced code blocks~~ Add attribute support on fenced code blocks Nov 16, 2023

fix .only

18cc36e

mbostock added 2 commits November 16, 2023 12:47

echo

50c95a6

he decode

dbec9e7

mbostock approved these changes Nov 16, 2023

View reviewed changes

mbostock added 4 commits November 16, 2023 12:55

Merge branch 'main' into wiltse/attrs

e131f5c

ambiguous ampersands!

602ef8e

push if up

94486b3

Merge branch 'main' into wiltse/attrs

c83ffad

mbostock merged commit 747e6ae into main Nov 17, 2023
1 check passed

mbostock deleted the wiltse/attrs branch November 17, 2023 05:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add attribute support on fenced code blocks #75

Add attribute support on fenced code blocks #75

wiltsecarpenter commented Oct 26, 2023 •

edited

Loading

mootari commented Oct 28, 2023

wiltsecarpenter commented Oct 30, 2023

mootari commented Oct 30, 2023

wiltsecarpenter commented Oct 30, 2023

mootari commented Oct 30, 2023

mbostock commented Oct 30, 2023 •

edited

Loading

wiltsecarpenter commented Oct 31, 2023

allisonhorst commented Oct 31, 2023

mbostock commented Oct 31, 2023

allisonhorst commented Nov 3, 2023

mbostock commented Nov 16, 2023 •

edited

Loading

mbostock commented Nov 16, 2023

wiltsecarpenter commented Nov 16, 2023

mbostock commented Nov 16, 2023 •

edited

Loading

Add attribute support on fenced code blocks #75

Add attribute support on fenced code blocks #75

Conversation

wiltsecarpenter commented Oct 26, 2023 • edited Loading

Notes from design review

References

mootari commented Oct 28, 2023

wiltsecarpenter commented Oct 30, 2023

mootari commented Oct 30, 2023

wiltsecarpenter commented Oct 30, 2023

mootari commented Oct 30, 2023

mbostock commented Oct 30, 2023 • edited Loading

wiltsecarpenter commented Oct 31, 2023

allisonhorst commented Oct 31, 2023

mbostock commented Oct 31, 2023

allisonhorst commented Nov 3, 2023

mbostock commented Nov 16, 2023 • edited Loading

mbostock commented Nov 16, 2023

wiltsecarpenter commented Nov 16, 2023

mbostock commented Nov 16, 2023 • edited Loading

wiltsecarpenter commented Oct 26, 2023 •

edited

Loading

mbostock commented Oct 30, 2023 •

edited

Loading

mbostock commented Nov 16, 2023 •

edited

Loading

mbostock commented Nov 16, 2023 •

edited

Loading