Consider rewriting the markdown checker in rust #5405
Comments
@jodh-intel, I have some interest in this one.
Thanks @Christopher-C-Robinson! The first step is going to be to identify a good markdown parsing crate to use. A quick look suggests these are two popular ones:
If you look at the golang markdown checker, it uses the
$ git clone https://github.com/kata-containers/tests
$ cd tests/cmd/check-markdown
$ grep -Er 'errors.new|fmt.Errorf' main [0]
utils.go: return "", "", fmt.Errorf("invalid link %s: expected %d fields, found %d", linkName, expectedFields, foundFields)
utils.go: return "", fmt.Errorf("need heading name")
utils.go: return fmt.Errorf("expected %v node, found %v", expectedType, node.Type)
add.go: return "", fmt.Errorf("document root %q does not exist", docRoot)
heading.go: return Heading{}, fmt.Errorf("heading name cannot be blank")
heading.go: return Heading{}, fmt.Errorf("heading markdown name cannot be blank")
heading.go: return Heading{}, fmt.Errorf("level needs to be atleast 1")
main.go: return fmt.Errorf("no handler for format %q", format)
doc.go: return fmt.Errorf("file=%q: %s", d.Name, s)
display.go: return fmt.Errorf("unknown show option: %v", what)
parse.go: return fmt.Errorf("found %d parse error%s:\n%s",
A lot of what the checker does is checking markdown links between documents, as we want to guarantee that all markdown links are valid. That's harder than it sounds, as the single markdown file being checked may link to every other markdown file in the repo (and then potentially back to itself!).
Separate from markdown links are URLs in markdown docs. Validating URLs isn't the focus of the current golang tool as we actually handle that in the CI here using:
However, it would be good if eventually the rust markdown checker could handle checking URLs itself (extracting the URLs from the markdown, verifying that they are syntactically valid addresses, and then optionally connecting to each URL to determine whether it is still valid/alive). But that doesn't need to be done for this issue ;)
One last comment: it would be a good idea to be familiar with the GitHub Flavoured Markdown spec for this piece of work:
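To illustrate why links that loop back to the starting file need care, here is a minimal sketch of a cycle-safe traversal. It is not how the golang tool works. The function name `reachable_docs` and the in-memory `links` map are hypothetical stand-ins for whatever a real checker would build by parsing files on disk.

```rust
use std::collections::{HashMap, HashSet};

/// Return every document reachable from `start`, visiting each one once.
/// `links` maps a document path to the documents it links to
/// (assumption: a real tool would fill this in by parsing markdown files).
fn reachable_docs(start: &str, links: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut pending = vec![start.to_string()];
    let mut order = Vec::new();

    while let Some(doc) = pending.pop() {
        // `insert` returns false if the document was already visited,
        // which is what stops a link cycle from looping forever.
        if !seen.insert(doc.clone()) {
            continue;
        }
        if let Some(targets) = links.get(doc.as_str()) {
            for target in targets {
                pending.push(target.to_string());
            }
        }
        order.push(doc);
    }
    order
}

fn main() {
    let mut links: HashMap<&str, Vec<&str>> = HashMap::new();
    links.insert("README.md", vec!["a.md", "b.md"]);
    links.insert("a.md", vec!["README.md"]); // links back to the start
    links.insert("b.md", vec!["a.md"]);

    for doc in reachable_docs("README.md", &links) {
        println!("would check: {}", doc);
    }
}
```

The `seen` set is the important part: without it, the `README.md` to `a.md` to `README.md` cycle would never terminate.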
@Christopher-C-Robinson - Here's how I'd break this task down:
We use
Also, I believe @gabevenberg is using it for kata-containers/kata-containers#5350.
As shown, there are quite a few steps, so you might want to consider pairing up with someone else to work on this, or limiting the scope of what you plan to implement. We can talk more about this in the meeting on Friday or on Slack of course ;)
@jodh-intel I have joined @Christopher-C-Robinson and we are teaming up to attempt to tackle this issue.
@Toreseen - great! ;)
As discussed with @Christopher-C-Robinson and @Toreseen today, once you've got a basic program that can iterate through all the markdown node types, you'll need to ensure it can access "heading" links and markdown links (not URLs). You might need to save these nodes to a hash or a list (a
Checking markdown link in current file
## Section 1
foo
## Section 2
- This is a [valid link](#section-1).
- This is an [invalid link](#invalid-section).
Checking markdown link in another file
This is a link to a [different file](../docs/design/README.md#section-wibble).
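The idea of saving headings and then checking links against them could be sketched roughly as below. This is only an illustration using the standard library: it scans lines naively instead of using a markdown parsing crate, and the anchor rule in `heading_anchor` is a simplified assumption about how GitHub derives heading anchors, not the full algorithm. All function names are hypothetical.

```rust
use std::collections::HashSet;

/// Approximate a GitHub-style heading anchor: lowercase, spaces become
/// hyphens, other punctuation is dropped (assumption: simplified rule).
fn heading_anchor(text: &str) -> String {
    let lower = text.trim().to_lowercase();
    lower
        .chars()
        .filter_map(|c| match c {
            ' ' => Some('-'),
            c if c.is_alphanumeric() || c == '-' || c == '_' => Some(c),
            _ => None,
        })
        .collect()
}

/// Collect the anchors of all ATX headings ("# ..." up to "###### ...").
fn collect_anchors(doc: &str) -> HashSet<String> {
    doc.lines()
        .filter_map(|line| {
            let hashes = line.chars().take_while(|&c| c == '#').count();
            let rest = &line[hashes..];
            if (1..=6).contains(&hashes) && rest.starts_with(' ') {
                Some(heading_anchor(rest))
            } else {
                None
            }
        })
        .collect()
}

/// Return the raw target of every inline link "[text](target)".
fn link_targets(doc: &str) -> Vec<&str> {
    let mut targets = Vec::new();
    let mut rest = doc;
    while let Some(start) = rest.find("](") {
        let after = &rest[start + 2..];
        match after.find(')') {
            Some(end) => {
                targets.push(&after[..end]);
                rest = &after[end + 1..];
            }
            None => break,
        }
    }
    targets
}

/// Split "path#anchor" into its parts; the path is empty for same-file links.
fn split_target(target: &str) -> (&str, Option<&str>) {
    match target.split_once('#') {
        Some((path, anchor)) => (path, Some(anchor)),
        None => (target, None),
    }
}

/// Return the anchors of same-file links that match no heading.
fn broken_same_file_links(doc: &str) -> Vec<String> {
    let anchors = collect_anchors(doc);
    link_targets(doc)
        .into_iter()
        .filter_map(|target| match split_target(target) {
            ("", Some(anchor)) if !anchors.contains(anchor) => Some(anchor.to_string()),
            _ => None,
        })
        .collect()
}

fn main() {
    let doc = "## Section 1\nfoo\n## Section 2\n\
               - This is a [valid link](#section-1).\n\
               - This is an [invalid link](#invalid-section).\n";
    for anchor in broken_same_file_links(doc) {
        println!("broken link: #{}", anchor);
    }
    // A link into another file would need that file's headings loaded first.
    let (path, anchor) = split_target("../docs/design/README.md#section-wibble");
    println!("path={} anchor={:?}", path, anchor);
}
```

Links with a non-empty path are skipped here. Checking them would mean resolving the path relative to the current file and running `collect_anchors` on the target document.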
The check-markdown tool is used by the static-checks.sh script called by the CI.
Although the current golang implementation of check-markdown is ok, it isn't perfect as the blackfriday package it uses seems fragile (see https://github.com/kata-containers/tests/blob/main/cmd/check-markdown/hack.go).
Since we're moving everything else to rust, it might be worth considering rewriting check-markdown in rust too.
If anyone is interested, please add a comment here before starting work on it! 😄