Commit

Add more expansive docs about algorithm
wooorm committed May 1, 2017
1 parent 10ec95a commit 6762764
Showing 1 changed file with 40 additions and 0 deletions.
40 changes: 40 additions & 0 deletions readme.md
@@ -4,6 +4,44 @@ Transform [HAST][] to [NLCST][].

> **Note** You probably want to use [rehype-retext][].

###### Implied paragraphs

The algorithm supports implicit and explicit paragraphs, such as:

```html
<article>
An implicit sentence.
<h1>An explicit sentence.</h1>
</article>
```

Overlapping paragraphs are also supported (see the tests or the HTML spec
for more info).

###### Ignored nodes

Some elements are ignored and their content will not be present in NLCST:
`<script>`, `<style>`, `<svg>`, `<math>`, `<del>`.

To ignore other elements, add a `data-nlcst` attribute with a value of `ignore`:

```html
<p>This is <span data-nlcst="ignore">hidden</span>.</p>
<p data-nlcst="ignore">Completely hidden.</p>
```
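
The ignore rules above can be sketched in plain JavaScript. This is a minimal illustration, not the library's implementation: `isIgnored` and `visibleText` are hypothetical helpers, and the sketch assumes HAST elements shaped like `{type, tagName, properties, children}`, with the `data-nlcst` attribute stored as the camel-cased `dataNlcst` property.

```javascript
// Hypothetical helpers for illustration; not part of the library's API.
// Elements that the section above lists as ignored by default.
const ignoredByDefault = new Set(['script', 'style', 'svg', 'math', 'del'])

// Returns true when an element and its content should be skipped.
function isIgnored(node) {
  if (node.type !== 'element') return false
  const properties = node.properties || {}
  // Assumption: `data-nlcst` is exposed as `dataNlcst` on HAST properties.
  return ignoredByDefault.has(node.tagName) || properties.dataNlcst === 'ignore'
}

// Collects the text that survives, skipping ignored subtrees entirely.
function visibleText(node) {
  if (node.type === 'text') return node.value
  if (isIgnored(node) || !node.children) return ''
  return node.children.map(visibleText).join('')
}

// HAST-like tree for: <p>This is <span data-nlcst="ignore">hidden</span>.</p>
const paragraph = {
  type: 'element',
  tagName: 'p',
  properties: {},
  children: [
    {type: 'text', value: 'This is '},
    {
      type: 'element',
      tagName: 'span',
      properties: {dataNlcst: 'ignore'},
      children: [{type: 'text', value: 'hidden'}]
    },
    {type: 'text', value: '.'}
  ]
}

console.log(visibleText(paragraph)) // prints "This is ."
```

Marking the `<p>` itself with `dataNlcst: 'ignore'` would make `visibleText` return an empty string, which mirrors the "Completely hidden." example.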

###### Source nodes

`<code>` elements are mapped to [Source][] nodes in NLCST.

To mark other elements as source, add a `data-nlcst` attribute with a value
of `source`:

```html
<p>This is <span data-nlcst="source">marked as source</span>.</p>
<p data-nlcst="source">Completely marked.</p>
```
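
The source rules can be sketched the same way. The helpers below (`isSource`, `textContent`, `toSegments`) are hypothetical and only illustrate the mapping: the sketch assumes `data-nlcst` is stored as `dataNlcst` on HAST properties, and it emits plain text runs as flat `TextNode`s rather than parsing them into paragraphs, sentences, and words as a real NLCST tree would.

```javascript
// Hypothetical helpers for illustration; not part of the library's API.
// Returns true when an element should be represented as a source node.
function isSource(node) {
  if (node.type !== 'element') return false
  const properties = node.properties || {}
  // Assumption: `data-nlcst` is exposed as `dataNlcst` on HAST properties.
  return node.tagName === 'code' || properties.dataNlcst === 'source'
}

// Flattens a subtree to its raw text content.
function textContent(node) {
  if (node.type === 'text') return node.value
  return (node.children || []).map(textContent).join('')
}

// Splits a subtree into text runs and NLCST-like source nodes.
function toSegments(node) {
  if (node.type === 'text') return [{type: 'TextNode', value: node.value}]
  if (isSource(node)) return [{type: 'SourceNode', value: textContent(node)}]
  return (node.children || []).flatMap(toSegments)
}

// HAST-like tree for:
// <p>This is <span data-nlcst="source">marked as source</span>.</p>
const paragraph = {
  type: 'element',
  tagName: 'p',
  properties: {},
  children: [
    {type: 'text', value: 'This is '},
    {
      type: 'element',
      tagName: 'span',
      properties: {dataNlcst: 'source'},
      children: [{type: 'text', value: 'marked as source'}]
    },
    {type: 'text', value: '.'}
  ]
}

const segments = toSegments(paragraph)
console.log(segments.map((segment) => segment.type).join(', '))
// prints "TextNode, SourceNode, TextNode"
```

A source node keeps its content as one opaque value, so the text inside it is not split into words or sentences.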

## Installation

[npm][]:
@@ -117,3 +155,5 @@ into an [NLCST][nlcst] tree.
[latin]: https://github.com/wooorm/parse-latin

[dutch]: https://github.com/wooorm/parse-dutch

[source]: https://github.com/syntax-tree/nlcst#source
