Regex caret (^) not working properly for scope: raw when there are over 999 characters in a Markdown file. #869

Open
1 task done
michael-nok opened this issue Jul 10, 2024 · 0 comments

Check for existing issues

  • Completed

Environment

  • Windows 10
  • Direct download of Windows executable
  • Vale 2.29.1 or later

Describe the bug / provide steps to reproduce it

The change introduced by this commit causes the problem: f769fcd

	// NOTE: If the `ctx` document is large (as could be the case with
	// `scope: raw`) this is *slow*. Thus, the cap at 1k.
	//
	// TODO: Actually fix this.

I have a rule that looks for incorrectly indented content. It uses the following token:

extends: existence
message: 'Content must be indented using 4x spaces each time. "%s"'
level: error
nonword: true
scope: raw
tokens:
  - '^[ ]{1,3}\`'
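
To make the intent of the token concrete, here is a minimal Python sketch of what it is meant to flag: a backtick preceded by one to three leading spaces. It is only a stand-in for illustration. Python's `re.MULTILINE` is assumed here to mimic `^` anchoring at each line start, and the helper name `badly_indented_lines` is hypothetical; none of this is Vale's own code or regex engine.

```python
import re

# Stand-in for the rule's token. Assumption: re.MULTILINE reproduces the
# intended "start of each line" meaning of ^; Vale's engine may differ.
TOKEN = re.compile(r"^[ ]{1,3}\`", re.MULTILINE)


def badly_indented_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines where 1-3 spaces precede a backtick."""
    return [text.count("\n", 0, m.start()) + 1 for m in TOKEN.finditer(text)]


sample = "\n".join([
    "`no indent`",        # 0 spaces: not flagged
    "  `two spaces`",     # 2 spaces: flagged
    "    `four spaces`",  # 4 spaces: not flagged (correct indentation)
    "   `three spaces`",  # 3 spaces: flagged
])

print(badly_indented_lines(sample))  # [2, 4]
```

With `^` anchored correctly, the four-space line is never reported.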

When the Markdown file contains more than 999 characters (i.e. once ctx goes past the 1k cap mentioned in the comment above), the ^ part of the token no longer anchors to the true start of a line and instead matches at spurious starting positions.

The attached sample files show exactly where the behavior changes:
vale.zip
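
For anyone who wants to build similar test inputs by hand, the following Python sketch generates Markdown text of an exact length that ends with a correctly indented (four-space) line. It is not the content of vale.zip; the function name `make_markdown` and the filler text are hypothetical.

```python
def make_markdown(total_chars: int) -> str:
    """Build text of exactly total_chars characters that ends with a
    correctly indented (4-space) backtick line."""
    tail = "    `correctly indented`\n"
    filler_line = "Filler text to pad the file.\n"
    pad = total_chars - len(tail)
    if pad < 0:
        raise ValueError("total_chars is smaller than the indented tail line")
    # Repeat the filler and cut it to the exact padding length.
    body = (filler_line * (pad // len(filler_line) + 1))[:pad]
    # Make sure the tail starts on its own line without changing the length.
    if body and not body.endswith("\n"):
        body = body[:-1] + "\n"
    return body + tail


# Sizes on both sides of the 1k boundary described in this report.
print([len(make_markdown(n)) for n in (998, 999, 1000, 1001)])
# [998, 999, 1000, 1001]
```

Writing each string to its own `.md` file and running Vale on them should show at which size the four-space line starts being flagged.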

Consequently, text with four spaces before the first ` is flagged as incorrect, and the starting position for the ^ is column 1.
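
One way a size cap could produce this symptom is sketched below in Python. This is purely an assumption for illustration, not a confirmed description of Vale's implementation: if the text being searched is a slice that begins partway through a line's indentation, `^` anchors at the start of the slice rather than at the true start of the line. The document contents and the cut position are hypothetical.

```python
import re

# Stand-in for the rule's token (Python regex, not Vale's engine).
TOKEN = re.compile(r"^[ ]{1,3}\`", re.MULTILINE)

# Hypothetical document: some filler, then a correctly indented line.
doc = "x" * 20 + "\n" + "    `code`\n"

# Searching the whole document finds nothing: four spaces is correct.
assert TOKEN.search(doc) is None

# Hypothetical truncation: the searched window starts one character
# into the indentation, so only three spaces remain before the backtick.
cut = doc.index("    `") + 1
window = doc[cut:]

m = TOKEN.search(window)
print(m.start(), repr(m.group()))  # 0 '   `'
```

In this sketch the false match sits at offset 0 of the window, which would be consistent with a reported position at the start of the line, but whether Vale actually behaves this way is not established here.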

image

The vale.exe from release 2.29.0 does not have this problem.
