fix: Full Commonmark compliance for Lists #2112
This pull request is being automatically deployed with Vercel (learn more). 🔍 Inspect: https://vercel.com/markedjs/markedjs/4HWneXd69SCqp9wDXN2PgRHkLKHJ |
I believe |
Ok that makes sense. I could see that the I could use some help figuring out the rule for
becomes
Same with
becomes
|
For Or... after making either of those changes, this test might be redundant since I think it overlaps with existing tests everywhere... |
Ok, thanks for tracking that down. I have the logic from those issues working, but I am still having trouble with the alignment/indentation rules for those two test cases. Can you help me understand when a pedantic list considers something a sublist versus not? As far as I can see, the Daring Fireball spec does not say anything about when to nest sublists.
|
The original spec was very lax on details because it assumed well formatted markdown, so bad markdown should give unspecified results (garbage in, garbage out). Unfortunately people who write bad markdown still want consistency. The easiest way to get around it is to have |
That seems to be the case.
yes |
Right, right. So I think you're saying we should match the dingus as closely as possible even if it's garbage, right? My question, then, was a request for help reverse engineering the dingus logic so I can get the same garbage in, garbage out for consistency, as you say. Then I can work out where and what to put into the |
@UziTech Unfortunately it doesn't follow Pedantic rules either. The problem area is this:
The test wants it to become
However, an indented code block cannot interrupt a paragraph without a blank line before, both in CommonMark and in Pedantic. So both Commonmark and Pedantic dingus give this:
My vote is to just remove this test altogether, since it is a large, unwieldy test that doesn't seem to cover anything that isn't already covered by smaller tests. |
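The rule mentioned above (an indented code block cannot interrupt a paragraph unless a blank line comes first) can be illustrated with a small standalone sketch. This is a toy model, not marked's tokenizer: `classifyLines` is a hypothetical helper that only knows about paragraphs, blank lines, and four-space indented code.

```javascript
// Toy model (NOT marked's implementation): label each line as
// 'paragraph', 'blank', or 'code'. Only four-space indentation is
// considered; tabs, lists, and other block types are ignored.
function classifyLines(lines) {
  const labels = [];
  let inParagraph = false;
  for (const line of lines) {
    if (line.trim() === '') {
      // A blank line ends any open paragraph.
      labels.push('blank');
      inParagraph = false;
    } else if (/^ {4}/.test(line) && !inParagraph) {
      // Indented code may only start when no paragraph is open.
      labels.push('code');
    } else {
      // Otherwise the line starts or continues a paragraph, even if
      // it is indented (a "lazy continuation").
      labels.push('paragraph');
      inParagraph = true;
    }
  }
  return labels;
}

// No blank line: the indented line stays in the paragraph.
console.log(classifyLines(['foo', '    bar']));
// Blank line first: the indented line becomes code.
console.log(classifyLines(['foo', '', '    bar']));
```

With no blank line the second line is labelled as paragraph text; with a blank line in between it is labelled as code, which matches the behaviour both dinguses are described as producing.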
Sounds good. It looks like it was created well before my time, so most likely it is out of date. It would be better to keep the tests specific anyway. |
Tadaa! Passing all spec tests now, plus 3 more Commonmark examples. Unfortunately there are a bunch of Unit tests that are expecting tokens to look different now so that needs to be cleaned up. Essentially lists no longer consume blank newlines at the end, so you end up with |
And now unit tests are all passing. I'm going to see if I can knock out the last commonmark specs, but I think this is in a good spot to review. I also have a couple ideas to speed this up slightly but I want to see if I can get the last examples working first. |
src/Lexer.js
Outdated
    lastToken.raw += '\n' + token.raw;
    lastToken.text += '\n' + token.raw;
  } else {
    if (!this.tokens.links[token.tag]) {
Nitpick: this could be changed to `else if` instead of a nested `else` and `if`.
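A minimal sketch of the suggested shape, assuming the surrounding branch looks roughly like the snippet above. `handleDef` is a hypothetical standalone function, and the outer condition and the body of the inner branch are guesses, since the review excerpt does not show them.

```javascript
// Hypothetical stand-in for the reviewed branch; the outer condition and
// the link-registration body are assumptions, not taken from the diff.
function handleDef(lastToken, token, links) {
  if (lastToken && (lastToken.type === 'paragraph' || lastToken.type === 'text')) {
    // The definition text is folded into the preceding paragraph/text token.
    lastToken.raw += '\n' + token.raw;
    lastToken.text += '\n' + token.raw;
  } else if (!links[token.tag]) {
    // Flattened `else if`: register the link only if the tag is unused.
    links[token.tag] = { href: token.href, title: token.title };
  }
}

const links = {};
const def = { raw: '[a]: /b', tag: 'a', href: '/b', title: null };
handleDef(undefined, def, links);
console.log(links.a.href); // '/b'
```

The behaviour is identical to the nested form; the flattened version only removes one level of indentation.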
Great work, thanks!
Reminder to not merge this quite yet. Now that #2124 is approved, I want to look over this again because it needs to be converted to lex its own child tokens in that new format. |
I merged #2124 so you should be able to rebase this and make the changes for lexing the child tokens in the tokenizer. |
@UziTech .....sigh.... It is done.... I don't know why but that was a nightmare. |
# [3.0.0](v2.1.3...v3.0.0) (2021-08-16)

### Bug Fixes

* Add module field to package.json ([#2143](#2143)) ([edc2e6d](edc2e6d))
* drop node 10 support ([#2157](#2157)) ([433b16f](433b16f))
* Full Commonmark compliance for Lists ([#2112](#2112)) ([eb33d3b](eb33d3b))
* Refactor table tokens ([#2166](#2166)) ([bc400ac](bc400ac))

### BREAKING CHANGES

* - `table` tokens `header` property changed to contain an array of objects for each header cell with `text` and `tokens` properties.
  - `table` tokens `cells` property changed to `rows` and is an array of rows where each row contains an array of objects for each cell with `text` and `tokens` properties.

v2:

```json
{
  "type": "table",
  "align": [null, null],
  "raw": "| a | b |\n|---|---|\n| 1 | 2 |\n",
  "header": ["a", "b"],
  "cells": [["1", "2"]],
  "tokens": {
    "header": [
      [{ "type": "text", "raw": "a", "text": "a" }],
      [{ "type": "text", "raw": "b", "text": "b" }]
    ],
    "cells": [[
      [{ "type": "text", "raw": "1", "text": "1" }],
      [{ "type": "text", "raw": "2", "text": "2" }]
    ]]
  }
}
```

v3:

```json
{
  "type": "table",
  "align": [null, null],
  "raw": "| a | b |\n|---|---|\n| 1 | 2 |\n",
  "header": [
    { "text": "a", "tokens": [{ "type": "text", "raw": "a", "text": "a" }] },
    { "text": "b", "tokens": [{ "type": "text", "raw": "b", "text": "b" }] }
  ],
  "rows": [
    [
      { "text": "1", "tokens": [{ "type": "text", "raw": "1", "text": "1" }] },
      { "text": "2", "tokens": [{ "type": "text", "raw": "2", "text": "2" }] }
    ]
  ]
}
```

* Add module field to package.json
* drop node 10 support
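For code that still consumes the v2 table token shape, the change can be sketched as a small conversion. `tableTokenV2toV3` is a hypothetical helper, not part of marked; it follows the prose description above, where `rows` is an array of rows and each row is an array of cell objects with `text` and `tokens`.

```javascript
// Hypothetical migration helper (not a marked API): rebuild a v3-style
// table token from a v2-style one.
function tableTokenV2toV3(v2) {
  // Drop the v2-only fields and keep everything else (type, align, raw).
  const { header, cells, tokens, ...rest } = v2;
  return {
    ...rest,
    // Pair each header text with its inline tokens.
    header: header.map((text, i) => ({ text, tokens: tokens.header[i] })),
    // Pair each cell text with its inline tokens, row by row.
    rows: cells.map((row, r) =>
      row.map((text, c) => ({ text, tokens: tokens.cells[r][c] }))
    )
  };
}

const v2Token = {
  type: 'table',
  align: [null, null],
  raw: '| a | b |\n|---|---|\n| 1 | 2 |\n',
  header: ['a', 'b'],
  cells: [['1', '2']],
  tokens: {
    header: [
      [{ type: 'text', raw: 'a', text: 'a' }],
      [{ type: 'text', raw: 'b', text: 'b' }]
    ],
    cells: [[
      [{ type: 'text', raw: '1', text: '1' }],
      [{ type: 'text', raw: '2', text: '2' }]
    ]]
  }
};

console.log(JSON.stringify(tableTokenV2toV3(v2Token), null, 2));
```

Renderers or extensions written against v2 would read `token.rows[r][c].tokens` where they previously read `token.tokens.cells[r][c]`.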
🎉 This PR is included in version 3.0.0 🎉 The release is available on: Your semantic-release bot 📦🚀 |
Marked version: 2.1.1
Markdown flavor: CommonMark
Description
A reworking of the Block Lists tokenizer. Also adjusting some of the `New` unit tests because they are not accurate to the actual Commonmark spec and dingus results.

Fixes Commonmark Examples 232, 234, 243, 244, 248, 250, 254, 276, 277, 287, 288, 289. This now passes all Lists and all List Items Commonmark tests!
Probably fixes some GitHub issues as well. Need to dig through and see.
This seems to be about the same speed as Master or slightly slower, but it's always hard to tell.
Side note: should we move the `New` tests that feature the `pedantic` option into the `Original` folder to be with all the other pedantic tests?