Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lints to ensure link text for EIPs should match the EIP's number #99

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

KatyaRyazantseva
Copy link

Issue 67. Covered examples with tests.

@KatyaRyazantseva
Copy link
Author

Issue #67

Copy link
Contributor

@SamWilsn SamWilsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'd try to avoid hard coding regexes where possible. If they're hard coded, they can't be changed by the configuration file.

(
"markdown-link-other",
MarkdownLinkOther {
pattern: markdown::LinkOther(r"^(EIP|ERC)-(\d+)\s*\S*$")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be case insensitive ((?i))?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about it. EIP-1 says the text should be like EIP-N. Does it mean upper case or not? I can add case insensitive if it is so. Then eip-25 will work for links. Is it ok or it must be EIP-25 (upper case only)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have the other lint to check for incorrectly cased references. So even if [eip-1](./eip-1.md) passes markdown-link-other, it'll fail the other lint.

Or at least I think it will 🤔

Comment on lines 128 to 166
if self.link_depth > 0 {
self.link_depth = self.link_depth.checked_sub(1).unwrap();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If self.link_depth is greater than zero, the checked_sub is unnecessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checked_sub in the depart_link function returns self.link_depth to 0. I check link_depth in enter_text fn. If link_depth == 0, it means the text is not from the link, so, I skip it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that if self.link_depth > 0 then self.link_depth - 1 can never underflow. The checked_sub will never fail.

Comment on lines 135 to 174
if self.link_depth > 0 {
self.current_link.text = txt.to_owned();
self.check(ast)?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this behave for content like:

[**EIP-1**5678](./eip-1.md#rationale)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fails. I will add it to the tests. Should it fail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, perhaps I spoke too soon then 🤣

I was worried that because every text node in the link sets self.current_link.text, you could end up in situations where, for example, bold would create two text nodes, breaking the lint. Maybe something like [**EIP-1**EIP-1](./eip-1.md).

If my concerns are unfounded, ignore me!

|
4 | [EIP-1](./eip-2.md)
|
= info: link text should match `[EIP|ERC-2]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can, I'd rather the specific expected text in the message. Maybe something like:

error[markdown-link-eip]: link text does not match link destination
  |
4 | [EIP-1](./eip-2.md)
  |
  = help: `use [EIP-2](./eip-2.md)` instead

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working on the help hint. I check the link and build the correct version of the text based on the link. Is it correct? So, my hints will look like this:

[EIP-1](./eip-2.md) -> [EIP-2](./eip-2.md)
[EIP-1: Foo](./eip-1.md) -> [EIP-1](./eip-1.md)
[Another Proposal](./eip-1.md) -> [EIP-1](./eip-1.md)
[EIP-1](./eip-1.md#motivation) -> [EIP-1: Motivation](./eip-1.md#motivation)
[EIP-2: Hello](./eip-1.md#abstract) -> [EIP-1: Abstract](./eip-1.md#abstract)
[Another Proposal](./eip-1.md#rationale) -> [EIP-1: Rationale](./eip-1.md#rationale)

Example:

error[markdown-link-eip]: link text does not match link destination
  |
4 | [Another Proposal](./eip-1.md#rationale) 
  |
  = help: ` [EIP-1: Rationale](./eip-1.md#rationale)` instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks great!

|
4 | [Another Proposal](./eip-1.md#rationale)
|
= info: link text should match `[EIP|ERC-1<section-description>]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar kind of comment here. The help text should (if possible) include the correct text to use. Authors might not understand how to make something "match".

@KatyaRyazantseva
Copy link
Author

Pushed updates. I tried to fix everything according to comments. Haven't found EIP-1 example in existing eips. So, for now ignoring you)) Can we create a new issue for it in case it comes later?

Hardcoded regexes in lib.rs are kind of a style for the whole file, I followed the pattern. If we need to change regexes in config files, we need to refactor it globally.

@SamWilsn
Copy link
Contributor

SamWilsn commented Jul 6, 2024

Hardcoded regexes in lib.rs are kind of a style for the whole file, I followed the pattern. If we need to change regexes in config files, we need to refactor it globally.

The configuration in lib.rs (default_lints_enum is just the defaults. Hard coding there is fine because anything there can be overridden by the config file.

Other regexes, like this one cannot be overridden.

@SamWilsn
Copy link
Contributor

SamWilsn commented Jul 6, 2024

I've added a test in 09d7de7 that demonstrates the problem I brought up in #99 (comment).

@KatyaRyazantseva
Copy link
Author

Hardcoded regexes in lib.rs are kind of a style for the whole file, I followed the pattern. If we need to change regexes in config files, we need to refactor it globally.

The configuration in lib.rs (default_lints_enum is just the defaults. Hard coding there is fine because anything there can be overridden by the config file.

Other regexes, like this one cannot be overridden.

This one is dynamic. It depends on the lib.rs pattern. I extract the number of eip there and put into a new regex. Do you have any better ideas?

@KatyaRyazantseva
Copy link
Author

fixed bold text

@KatyaRyazantseva
Copy link
Author

@SamWilsn, I've done with the fixes. Could you please review it and merge if there are no any comments?

Comment on lines +64 to +73
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> {
if let Some(captures) = re.captures(text) {
Ok(captures
.get(index)
.map(|m| m.as_str().to_string())
.unwrap_or_default())
} else {
Ok(String::new())
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function never returns an error, so you can simplify it a bit:

Suggested change
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> {
if let Some(captures) = re.captures(text) {
Ok(captures
.get(index)
.map(|m| m.as_str().to_string())
.unwrap_or_default())
} else {
Ok(String::new())
}
}
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> String {
if let Some(captures) = re.captures(text) {
captures
.get(index)
.map(|m| m.as_str().to_string())
.unwrap_or_default()
} else {
String::new()
}
}

Ok(captures
.get(index)
.map(|m| m.as_str().to_string())
.unwrap_or_default())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that if the regular expression has the wrong number of capture groups, we should inform the user instead of silently returning the empty string.

You can ignore my previous comment about simplifying, and use Error::custom and something like:

#[derive(Debug, Snafu)]
struct BadRegex;

Comment on lines +85 to +87
let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?;
let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?;
let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This repeats the regex search each time, which is pretty inefficient.

fn check(&self, ast: &Ast) -> Result<Next, Error> {
let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?;
let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?;
let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eipw-lint already depends on url for parsing URLs. You could use it here to get the fragment (#...).

let section_description = Visitor::transform_section_description(&url_section);
format!(
"[{}{}: {}]({})",
url_eip_text.to_uppercase(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weirdly enough, ERCs are still stored in files named like eip-1234.md, so you can't use the filename to predict whether you need "EIP-..." or "ERC-..."

The correct way to solve this (reading the linked file) won't work for reasons outside eipw's scope, so... I guess the best we can do is something like:

help: use [EIP-1237](./eip-1237.md) or [ERC-1237](./eip-1237.md) instead

@KatyaRyazantseva
Copy link
Author

KatyaRyazantseva commented Aug 24, 2024

@SamWilsn I updated Rust (1.80.1 the latest stable version) and now have this issue with time crates rust-lang/rust#127343. Can't build the project anymore. Can we somehow fix it? Delete it from Cargo.lock? I haven't found it elsewhere in the project.

@SamWilsn
Copy link
Contributor

SamWilsn commented Sep 4, 2024

If you're using rustup, you can use, for example, cargo +1.69 build to pick a particular version.

I'll update the rust version in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants