Highlight code in Markdown files using tree-sitter and remark. Powered by tree-sitter-hast.
npm install remark-tree-sitter
or
yarn add remark-tree-sitter
This plugin uses the same mechanism and data as Atom for syntax highlighting, so to highlight a particular language, you need to either:
- Install the APM (Atom) package for that language and tell remark-tree-sitter to import it, using the grammarPackages option. (See Atom language packages)
- Provide the tree-sitter grammar and scopeMappings manually, using the grammars option.
For more information on how this mechanism works, check out the documentation for tree-sitter-hast.
Code blocks whose language does not match any loaded grammar are ignored.
The following example is also in the examples directory and can be run directly from there.
It uses @atom-languages/language-typescript to provide the TypeScript grammar and scope mappings.
npm install to-vfile vfile-reporter remark remark-tree-sitter remark-html @atom-languages/language-typescript
const vfile = require('to-vfile')
const report = require('vfile-reporter')
const remark = require('remark')
const treeSitter = require('remark-tree-sitter')
const html = require('remark-html')

remark()
  // Load the TypeScript grammar from the installed Atom language package
  .use(treeSitter, {
    grammarPackages: ['@atom-languages/language-typescript']
  })
  // Serialize the highlighted tree to HTML
  .use(html)
  .process(vfile.readSync('example.md'), (err, file) => {
    // Report any error or warnings, then print the resulting HTML
    console.error(report(err || file))
    console.log(String(file))
  })
Output:
example.md: no issues found
<pre><code class="tree-sitter language-typescript"><span class="source ts"><span class="storage type function">function</span> <span class="entity name function">foo</span><span class="punctuation definition parameters begin bracket round">(</span><span class="punctuation definition parameters end bracket round">)</span> <span class="punctuation definition function body begin bracket curly">{</span>
<span class="keyword control">return</span> <span class="constant numeric">1</span><span class="punctuation terminator statement semicolon">;</span>
<span class="punctuation definition function body end bracket curly">}</span></span></code></pre>
To use an Atom language package, you first need to install it, like any other package, using npm install or yarn add.
Unfortunately, most APM packages are not available on NPM, so I've started to make some of them available under the NPM organization @atom-languages.
Here's a list of packages and the languages they provide highlighting for:
- @atom-languages/language-typescript: typescript, tsx (TypeScriptReact), flow
Note that options is required, and either grammarPackages or grammars needs to be provided. (Both can be provided, and grammars specified in grammars will override those loaded via grammarPackages.)
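The precedence between the two options can be pictured as a plain object merge. This is only a sketch of the documented behavior, not the plugin's actual implementation: resolveGrammars and the placeholder string values are hypothetical names used for illustration.

```javascript
// Hypothetical helper illustrating the documented precedence:
// entries from `grammars` win over entries loaded from `grammarPackages`.
function resolveGrammars(fromPackages, grammars = {}) {
  // Later spreads overwrite earlier keys, so explicit grammars take priority.
  return { ...fromPackages, ...grammars }
}

// Placeholder strings stand in for real { grammar, scopeMappings } objects.
const fromPackages = { typescript: 'packaged-typescript', tsx: 'packaged-tsx' }
const explicitGrammars = { typescript: 'custom-typescript' }

const resolved = resolveGrammars(fromPackages, explicitGrammars)
console.log(resolved.typescript) // custom-typescript
console.log(resolved.tsx) // packaged-tsx
```

Languages that only appear in one of the two sources are kept unchanged; only keys present in both are affected by the override.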
options.grammarPackages is an array of all Atom language packages that should be loaded.
Example:
remark().use(treeSitter, {
  grammarPackages: ['@atom-languages/language-typescript']
})
The language names that code blocks must then use to refer to a language are based on the filenames in the Atom package.
For example, the above package has the files tree-sitter-flow.cson, tree-sitter-tsx.cson, tree-sitter-typescript.cson, ...
so this makes the languages flow, tsx and typescript available for use within code blocks.
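The filename convention described above can be sketched as follows. This is an illustration under the assumption that names follow the pattern tree-sitter-<name>.cson; languageNameFromFile is a hypothetical helper, not a function exported by the plugin.

```javascript
// Hypothetical helper: derive a language name from a grammar file name,
// assuming the `tree-sitter-<name>.cson` convention.
function languageNameFromFile(fileName) {
  const match = /^tree-sitter-(.+)\.cson$/.exec(fileName)
  // Files that do not follow the convention yield no language name.
  return match ? match[1] : null
}

const grammarFiles = [
  'tree-sitter-flow.cson',
  'tree-sitter-tsx.cson',
  'tree-sitter-typescript.cson'
]

const languageNames = grammarFiles
  .map(languageNameFromFile)
  .filter(name => name !== null)

console.log(languageNames) // [ 'flow', 'tsx', 'typescript' ]
```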
If you want to make loaded languages available to use via different names, you can use options.languageAliases.
options.grammars is an object mapping language keys to objects containing grammar and scopeMappings.
Anything specified here will overwrite the languages loaded by options.grammarPackages.
For more information on scopeMappings, check out the documentation for tree-sitter-hast.
Example:
See a working example at examples/example-grammars.js.
remark().use(treeSitter, {
  grammars: {
    typescript: {
      grammar: typescriptGrammar,
      scopeMappings: typescriptScopeMappings
    },
    'custom-language': {
      grammar: customLanguageGrammar,
      scopeMappings: customLanguageScopeMappings
    }
  }
})
You can then use both the typescript and custom-language languages in code blocks:
```custom-language
some code
```
```typescript
let foo = 'bar';
```
If you want to make loaded languages available to use via different names, you can use options.languageAliases.
Sometimes the full list of classes applied by the scope mappings is more than you need, and you would like to include only those that you have stylesheets for.
To do this, you can pass in a whitelist of the classes that you actually care about, using options.classWhitelist.
Example: The following configuration...
remark().use(treeSitter, {
  grammarPackages: ['@atom-languages/language-typescript'],
  classWhitelist: ['storage', 'numeric']
})
...will convert the following markdown...
```typescript
function foo() {
return 1;
}
```
...to this:
<pre><code class="tree-sitter language-typescript"><span><span class="storage">function</span> foo() {
return <span class="numeric">1</span>;
}</span></code></pre>
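The effect of the whitelist on each element's class list can be sketched as a simple filter. This is an illustration consistent with the output above, not the plugin's actual code; filterClasses is a hypothetical helper.

```javascript
// Hypothetical helper: keep only the classes that appear in the whitelist.
function filterClasses(classNames, classWhitelist) {
  const allowed = new Set(classWhitelist)
  return classNames.filter(name => allowed.has(name))
}

const classWhitelist = ['storage', 'numeric']

// Class lists taken from the unfiltered TypeScript output shown earlier.
console.log(filterClasses(['storage', 'type', 'function'], classWhitelist)) // [ 'storage' ]
console.log(filterClasses(['constant', 'numeric'], classWhitelist)) // [ 'numeric' ]
console.log(filterClasses(['entity', 'name', 'function'], classWhitelist)) // []
```

An empty result corresponds to text such as foo in the output above, which appears without a class-bearing span once none of its classes are whitelisted.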
TODO: options.languageAliases is not implemented yet
TODO:
- Add unit tests for grammars option
- remark-rehype: Transform Markdown to HTML
- remark-midas: Highlight CSS code blocks with midas (rehype compatible)
- remark-highlight.js: Highlight code with highlight.js (via lowlight)
- remark-code-frontmatter: Extract frontmatter from markdown code blocks
- remark-code-extra: Add to or transform the HTML output of code blocks (rehype compatible)
- rehype-highlight: rehype plugin to highlight code (via lowlight)
- rehype-prism: rehype plugin to highlight code (via refractor)
- rehype-shiki: rehype plugin to highlight code with shiki