Add B4X #6965
LGTM. Thanks.
Note: this PR will not be merged until close to when the next release is made. See here for more details.
This commit moves the check for the BOM to the start of the file and fixes a potential compatibility problem with re2. Note that re2 interprets `{3}?` as matching the previous token exactly 3 times, while the Oniguruma engine interprets it as matching 3 or 0 times.
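To illustrate the ambiguity, here is a minimal sketch in Python, whose `re` module happens to read `{3}?` the re2 way (a lazy "exactly 3"). The patterns and the sample token `a` are made up for illustration and are not taken from the heuristic itself. Wrapping the counted repeat in a group states the "3 or 0 times" intent explicitly, so it no longer depends on how an engine parses `{n}?`.

```python
import re

# Python's `re` treats `{3}?` as a lazy "exactly 3 times" (the re2-style reading).
ambiguous = re.compile(r"\Aa{3}?b")

# Explicit form of "three times or not at all": the optional marker applies
# to the whole group, so there is nothing left to interpret.
optional_triple = re.compile(r"\A(?:a{3})?b")

for text in ("aaab", "b", "ab"):
    print(
        text,
        bool(ambiguous.match(text)),
        bool(optional_triple.match(text)),
    )
```

Under the re2-style reading, `ambiguous` rejects a bare `b`, whereas the grouped form accepts it, which is the behavior the Oniguruma reading would have given.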
Sorry for the many post-review edits; I'm done now. I just wanted to make the regex simpler, but then realized there might be a problem: Linguist matches against an ASCII_8BIT encoded string, but some other ports of Linguist might match in UTF-8 mode. Hence, it's safer to express the BOM as Related: |
Description
Closes #6944
Notes on the regex used:
- `(?:.*(?:\r?\n|\r)){0,9}` is used to limit our search to the first 10 lines.
- `\A\W{0,3}` is there in case the file has the UTF-8 BOM, which is represented by 3 non-alphanumeric characters. (More than 70% of B4X files have the BOM.)

Checklist:
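The two fragments above can be tried out with a small sketch. It uses Python's `re` in bytes mode as a stand-in for matching against a byte string. `MARKER=true` is a hypothetical placeholder for whatever keyword the heuristic looks for, and `marker_in_first_ten_lines` and `sample` are helper names invented for this example.

```python
import re

# Hypothetical marker: a placeholder for the keyword the real heuristic uses.
MARKER = rb"MARKER=true"

# \A\W{0,3}    : optional UTF-8 BOM (three non-word bytes in byte mode)
# (?:...){0,9} : skip at most 9 complete lines, so the marker must
#                start on line 10 or earlier
PATTERN = re.compile(rb"\A\W{0,3}(?:.*(?:\r?\n|\r)){0,9}" + MARKER)

BOM = b"\xef\xbb\xbf"


def marker_in_first_ten_lines(data: bytes) -> bool:
    """Return True if the marker starts one of the first 10 lines."""
    return PATTERN.match(data) is not None


def sample(marker_line: int, with_bom: bool = False) -> bytes:
    """Build a file whose marker sits on the given 1-based line."""
    lines = [b"filler"] * (marker_line - 1) + [MARKER]
    return (BOM if with_bom else b"") + b"\n".join(lines) + b"\n"


print(marker_in_first_ten_lines(sample(10, with_bom=True)))
print(marker_in_first_ten_lines(sample(11, with_bom=True)))
```

A marker on line 10 is found and one on line 11 is not, with or without the BOM. In UTF-8 mode the BOM is a single non-word code point, which `\W{0,3}` also covers.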
The extension of the new language is used in hundreds of repositories on github.com.
I have included a real-world usage sample for all extensions added in this PR:
[ ] I have included a syntax highlighting grammar: [URL to grammar repo]
Using existing VBA grammar
I have added a color: #00e4ff
I have updated the heuristics to distinguish my language from others using the same extension.