Add Erlang application resource file extension #6297
Due to the generic nature of its name we may need some heuristics to distinguish this particular usage from others.
Most definitely and it will need to be 100% accurate. There is no room for misclassification of generic extensions. There are over 27k files with this extension so there is massive scope for misclassification.
If you can come up with a heuristic that precisely matches these Erlang files and only these files, you'll also need to add the extension to the `generic.yml` file and add tests, as for other heuristics and generic extensions.
If you can't come up with a 100% precise heuristic, it's best not to pursue this PR and let users use an override if they really want these files to be classified as Erlang.
I have included a real-world usage sample for all extensions added in this PR:
* Sample source(s):
  * [`otp/bootstrap/lib/kernel/ebin/kernel.app`](https://github.com/erlang/otp/blob/master/bootstrap/lib/kernel/ebin/kernel.app)
  * [`cowboy/ebin/cowboy.app`](https://github.com/ninenines/cowboy/blob/master/ebin/cowboy.app)
No you haven't. 😁 You've included a very "hello world" looking sample. If you're going to go ahead with this PR, please remove that sample and add the two you've referenced.
fdda8e6
to
ea05918
Compare
lib/linguist/heuristics.yml
Outdated
- extensions: ['.app']
  rules:
  - language: Erlang
    pattern: '^{\s*(?:application|''application'')\s*,\s*(?:[a-z]+[a-z\d_@]*|''[\s\S]+'')\s*,\s*\[[\s\S]*\]\s*}\.$'
- The sequence `'[\s\S]+'` will match stuff like `'Foo's Bar'`, which I'm assuming isn't valid Erlang (I know nothing of the language). I'm assuming you meant `'.+'` where `.` includes newlines as well (i.e., the "dotall" modifier typical of most regex flavours). If so, I recommend using `'[^']+'` instead (which doesn't account for escape sequences, if such a thing is supported).
- Did you intend for `[a-z]` to match only lowercase characters? If not, `[a-z\d_@]` can be simplified into `[\w@]`, as (IIRC) GitHub's Linguist process runs without full CLDR support, meaning `\w` will match its traditional POSIX definition (`[[:alnum:]_]`, or `[A-Za-z0-9_]`). @lildude will correct me if I'm wrong, I'm sure. 😉
- `\.$` doesn't account for trailing whitespace. Unless that's a deliberate omission, I recommend using `\.[ \t]*$` to match EOL.
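The first and third points can be checked in isolation. The following is a minimal Python sketch, not Linguist code: the sample text and variable names are made up for illustration, and Python's `re` module stands in for the Ruby regex engine that Linguist actually uses.

```python
import re

# Quoted-atom alternative as written in the PR: greedy and unrestricted.
greedy = re.compile(r"'[\s\S]+'")
# Reviewer's suggestion: stop at the first closing quote.
bounded = re.compile(r"'[^']+'")

# Hypothetical input containing two quoted atoms.
text = "{application, 'my app', [{registered, ['my proc']}]}."

# The greedy form runs from the first quote to the LAST quote in the text.
greedy_match = greedy.search(text).group(0)
# The bounded form stops at the end of the first quoted atom.
bounded_match = bounded.search(text).group(0)

# End-of-line handling: the original tail versus the suggested one.
strict_eol = re.compile(r"\}\.$", re.MULTILINE)
lenient_eol = re.compile(r"\}\.[ \t]*$", re.MULTILINE)

# A closing line followed by two trailing spaces.
line = "]}.  \n"
strict_hit = strict_eol.search(line)    # no match: spaces sit before EOL
lenient_hit = lenient_eol.search(line)  # match: spaces are consumed
```

Here `greedy_match` swallows everything up to the closing quote of `'my proc'`, while `bounded_match` is just `'my app'`.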
Putting the changes proposed by all three points together, we get:
- pattern: '^{\s*(?:application|''application'')\s*,\s*(?:[a-z]+[a-z\d_@]*|''[\s\S]+'')\s*,\s*\[[\s\S]*\]\s*}\.$'
+ pattern: '(?m)^{\s*(?:application|''application'')\s*,\s*(?:[a-z]+[\w@]*|''[^'']+'')\s*,\s*\[.*?\]\s*}\.[ \t]*$'
Source (readable)
(?xm) ^
{ \s*
(?:
application
|
'application'
)
\s* , \s*
(?:
[a-z]+[\w@]*
|
'[^']+'
)
\s* , \s*
\[ .*? \]
\s* }
\. [ \t]* $

Source (YAML string)
'(?m)^{\s*(?:application|''application'')\s*,\s*(?:[a-z]+[\w@]*|''[^'']+'')\s*,\s*\[.*?\]\s*}\.[ \t]*$'
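To get a rough feel for how the revised pattern behaves, here is a hedged Python sketch that runs it against a small application resource file. Several things are assumptions rather than facts about Linguist: the sample contents and the `looks_like_erlang_app` helper are hypothetical, the braces are escaped for readability, and `re.MULTILINE | re.DOTALL | re.ASCII` is used to approximate Ruby's regex semantics (where, as far as I understand, `^` and `$` always anchor at line boundaries, `(?m)` lets `.` match newlines, and `\w` is ASCII-only for this input).

```python
import re

# Revised pattern from the review, split across lines for readability.
# Assumption: these flags approximate how the Ruby engine would treat it.
APP_PATTERN = re.compile(
    r"^\{\s*(?:application|'application')\s*,\s*"
    r"(?:[a-z]+[\w@]*|'[^']+')\s*,\s*"
    r"\[.*?\]\s*\}\.[ \t]*$",
    re.MULTILINE | re.DOTALL | re.ASCII,
)

# Hypothetical, minimal application resource file.
ERLANG_APP = """\
{application, my_app,
 [{description, "A hypothetical application"},
  {vsn, "1.0.0"},
  {modules, []},
  {registered, []},
  {applications, [kernel, stdlib]}]}.
"""

# Hypothetical non-Erlang file that might also carry the .app extension.
NOT_ERLANG = """\
<?xml version="1.0"?>
<application name="demo"/>
"""


def looks_like_erlang_app(source: str) -> bool:
    """Return True if the text contains an {application, Name, [...]}. term."""
    return APP_PATTERN.search(source) is not None
```

With these inputs, `looks_like_erlang_app(ERLANG_APP)` is true and `looks_like_erlang_app(NOT_ERLANG)` is false. Two samples say nothing about the 100% precision asked for earlier in the thread; that would need to be checked against the wider set of `.app` files.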
- Your assumption is correct.
- The first `[a-z]` occurrence should only match lowercase letters. However, `[a-z\d_@]` can be reduced to `[\w@]` since, although uncommon, that particular segment of an atom can include uppercase letters.
- Not deliberate.
Thank you for your time. I have updated the code to reflect your suggestions.
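The atom rule described in the second point (lowercase first letter, then letters of either case, digits, `_` or `@`) can be illustrated with a small Python sketch. The `ATOM` name and the sample strings are hypothetical, and `re.ASCII` is an assumption made so that `\w` means `[A-Za-z0-9_]` here.

```python
import re

# Unquoted-atom segment from the revised pattern.
ATOM = re.compile(r"[a-z]+[\w@]*", re.ASCII)

# Uppercase letters are accepted after the leading lowercase letter.
mixed_case = ATOM.fullmatch("myApp@node1")
# A leading uppercase letter is rejected (in Erlang that would be a variable).
leading_upper = ATOM.fullmatch("MyApp")
```

`mixed_case` is a match object and `leading_upper` is `None`.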
e107049
to
a63f467
Compare
Description
An application resource file (`*.app`) contains an application specification which defines an Erlang application. It is officially documented in the `app` Erlang manual page. The `*.app` file is similar to the `*.app.src` file (which is correctly categorized by the Linguist library; see #2964) in that an `*.app.src` file is used as input by various build systems to generate an `*.app` file. It is, however, also common to manually write an `*.app` file, in which case no `*.app.src` file is present in the source tree.

The syntax of an `*.app` file (as that of an `*.app.src` file) is standard Erlang syntax.

Due to the generic nature of its name we may need some heuristics to distinguish this particular usage from others.
Checklist:
path:*.app
path:*.app language:Erlang
otp/bootstrap/lib/compiler/ebin/compiler.app
otp/bootstrap/lib/kernel/ebin/kernel.app