-
Notifications
You must be signed in to change notification settings - Fork 8
feat: PRCE-compliant RegEx updates for GitHub Linguist #57
Conversation
@randi274 Any updates on this? TY |
@@ -1914,15 +1914,15 @@ | |||
<dict> | |||
<key>begin</key> | |||
<string>(?x) | |||
(?<type-name> | |||
(?<type_name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb question: should everything between a <string></string>
be changed from foo-bar
to foo_bar
? (or is it only that which starts with (?x)
?)
If yes, what about line 1909, line 1899, etc...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@muenzpraeger can you help answer this, given the update needs for Github Linguist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no dumb questions. ;-) Especially not with this change. It took me a while to find this small statement in the PCRE docs for named capture groups.
When PCRE2_UTF is not set, they may contain only ASCII alphanumeric characters and underscores, but must start with a non-digit.
So yes, the change only applies to (?<the_name>)
.
Also the source of the changes is src/apex.tmLanguage.yml, which is the source definition for this TextMate grammar. Looking at that will make it clearer.
The grammar outputs in the grammars folder are auto-generated via gulp.
(?: | ||
(?: | ||
(?:(?<identifier>@?[_[:alpha:]][_[:alnum:]]*)\s*\:\:\s*)? # alias-qualification | ||
(?<name-and-type-args> # identifier + type arguments (if any) | ||
(?<name_and_type_args> # identifier + type arguments (if any) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another dumb questions: Should alias-qualification
on line 1920 be alias_qualification
?
(also on line 1993, and others)
Or is that a comment, and it doesn't matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@muenzpraeger same question here
@randi274 Thank you for doing this. It will make my port to a new version of highlightjs-apex much easier. There are some extraneous bits in the regex (or there were - I haven't checked lately) that we can't do in Apex; as I get reacquainted with this codebase I'll raise a PR or an issue to remove those... which may be a bad idea, but I figure why not see what you and the hive mind think? Thanks again. It's great to see how Apex is defined as a syntax. |
What does this PR do?
This PR updates certain RegEx elements to be PCRE RegEx compliant (specifically: changing
-
to_
in named capture group identifiers).The change is in this commit.
This is needed so that we can get the tmLanguage definition into GitHub's Linguist for language detection and syntax highlighting.
The other changes outside of the above linked commit are either dependency updates or reformatting of files (b/c of changed Prettier behavior).
What issues does this PR fix or reference?
#54