Add colour to PR Template and tweak CONTRIBUTING and README files #6312

Merged
merged 9 commits into from
Mar 15, 2023
8 changes: 6 additions & 2 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -5,9 +5,9 @@

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- Feel free to remove whole sections, not points within the sections, that do not apply -->
<!--- Please remove whole sections, not points within the sections, that do not apply -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] **I am associating a language with a new file extension.**
- [ ] **I am adding a new extension to a language.**
- [ ] The new extension is used in hundreds of repositories on github.com
- Search results for each extension:
<!-- Replace FOOBAR with the new extension, and KEYWORDS with keywords unique to the language. Repeat for each extension added. -->
@@ -28,6 +28,10 @@
- [URL to each sample source, if applicable]
- Sample license(s):
- [ ] I have included a syntax highlighting grammar: [URL to grammar repo]
<!-- Setting a color is strongly recommended, but optional: `#cccccc` is used by default -->
- [ ] I have added a color
- Hex value: `#RRGGBB`
- Rationale: <!-- Please specify why you chose this color (if it was randomly selected, please say so); it helps arbitrate future requests to change a language's color -->
- [ ] I have updated the heuristics to distinguish my language from others using the same extension.

- [ ] **I am fixing a misclassified language**
46 changes: 24 additions & 22 deletions CONTRIBUTING.md
@@ -25,6 +25,7 @@ These components have their own dependencies - `icu4c`, and `cmake` and `pkg-con
On macOS with [Homebrew](http://brew.sh/) the instructions below under Getting started will install these dependencies for you.

On Ubuntu:

```bash
apt-get install cmake pkg-config libicu-dev docker.io ruby ruby-dev zlib1g-dev build-essential libssl-dev
```
@@ -59,20 +60,22 @@ To add support for a new extension:
1. Add your extension to the language entry in [`languages.yml`][languages].
Keep the extensions in alphabetical order, sorted case-sensitively (uppercase before lowercase).
The exception is the primary extension: it should always be first.
1. Add at least one sample for your extension to the [samples directory][samples] in the correct subdirectory.
2. Add at least one sample for your extension to the [samples directory][samples] in the correct subdirectory.
We prefer examples of real-world code showing common usage.
The more representative of the structure of the language, the better.
1. Open a pull request, linking to a [GitHub search result][search-example] showing in-the-wild usage.

**"Hello world" examples will not be accepted.**
3. Open a pull request, linking to a [GitHub search result][search-example] showing in-the-wild usage.
If you are adding a sample, please state clearly the license covering the code.
If possible, link to the original source of the sample.
If you wrote the sample specifically for the PR and are happy for it to be included under the MIT license that covers Linguist, you can state this instead.
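
The ordering rule in step 1 can be sketched in Ruby as follows. This is a minimal illustration, not code from Linguist: `order_extensions` is a hypothetical helper, and it assumes that "sorted case-sensitively" means Ruby's default byte-wise string ordering, in which uppercase letters sort before lowercase ones.

```ruby
# Hypothetical helper, not part of Linguist: returns the extension list in
# the order the guidelines describe, with the primary extension first and
# the remaining extensions sorted case-sensitively.
def order_extensions(primary, others)
  # String#<=> compares byte by byte, so "P" (0x50) sorts before "a" (0x61).
  rest = (others - [primary]).uniq.sort
  [primary, *rest]
end

puts order_extensions(".pl", [".t", ".PL", ".al", ".cgi", ".pl"]).inspect
```

The primary extension is removed from the rest before sorting so it appears only once, at the front.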

Additionally, if this extension is already listed in [`languages.yml`][languages] and associated with another language, then sometimes a few more steps will need to be taken:
Additionally, if this extension is already listed in [`languages.yml`][languages] and associated with another language, then a few more steps will need to be taken:

1. Make sure that at least two example `.yourextension` files are present in the [samples directory][samples] for each language that uses `.yourextension`.
2. If the two languages look vaguely similar, or one of the languages has uniquely identifiable characteristics, consider writing a [heuristic][] to help with the classification.

1. Make sure that example `.yourextension` files are present in the [samples directory][samples] for each language that uses `.yourextension`.
1. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.yourextension` files (ping **@lildude** to help with this).
This ensures we're not misclassifying files.
1. If the Bayesian classifier does a bad job with the sample `.yourextension` files then a [heuristic][] may need to be written to help.
Remember, the goal here is to try and avoid false positives!

See [My Linguist PR has been merged but GitHub doesn't reflect my changes][merged-pr] for details on when your changes will appear on GitHub after your PR has been merged.

@@ -86,26 +89,31 @@ To add support for a new language:

1. Add an entry for your language to [`languages.yml`][languages].
Omit the `language_id` field for now.
1. Add a syntax-highlighting grammar for your language using:
2. Add a syntax-highlighting grammar for your language using:

```bash
script/add-grammar https://github.com/JaneSmith/MyGrammar
```

This command will analyze the grammar and, if no problems are found, add it to the repository.
If problems are found, please report them to the grammar maintainer as you will otherwise be unable to add it.

**Please only add grammars that have [one of these licenses][licenses].**
1. Add samples for your language to the [samples directory][samples] in the correct subdirectory.
1. Generate a unique ID for your language by running `script/update-ids`.
1. Open a pull request, linking to [GitHub search results][search-example] showing in-the-wild usage.
3. Add samples for your language to the [samples directory][samples] in the correct subdirectory.
We prefer examples of real-world code showing common usage.
The more representative of the structure of the language, the better.

**"Hello world" examples will not be accepted.**
4. Generate a unique ID for your language by running `script/update-ids`.
5. Open a pull request, linking to [GitHub search results][search-example] showing in-the-wild usage.
Please state clearly the license covering the code in the samples.
Link directly to the original source if possible.
If you wrote the sample specifically for the PR and are happy for it to be included under the MIT license that covers Linguist, you can state this instead.

In addition, if your new language defines an extension that's already listed in [`languages.yml`][languages] (such as `.foo`) then sometimes a few more steps will need to be taken:
In addition, if your new language defines an extension that is already listed in [`languages.yml`][languages] and associated with another language, then a few more steps will need to be taken:

1. Make sure that example `.foo` files are present in the [samples directory][samples] for each language that uses `.foo`.
1. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.foo` files (ping **@lildude** to help with this).
This ensures we're not misclassifying files.
1. If the Bayesian classifier does a bad job with the sample `.foo` files, then a [heuristic][] may need to be written to help.
1. Make sure that at least two example `.foo` files are present in the [samples directory][samples] for each language that uses `.foo`.
2. If the two languages look vaguely similar, or one of the languages has uniquely identifiable characteristics, consider writing a [heuristic][] to help with the classification.

Remember, the goal here is to try and avoid false positives!
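
To make the idea of a heuristic concrete, here is a rough Ruby sketch of pattern-based disambiguation for a shared extension. It is an illustration under stated assumptions, not Linguist's implementation: the `RULES` table, its regular expressions, and `guess_language` are all hypothetical, and the real rules are defined in `heuristics.yml`.

```ruby
# Hypothetical rules for illustration only; they are deliberately narrow so
# that a match is strong evidence for the language.
RULES = {
  "Objective-C" => /^\s*@(interface|implementation|protocol|end)\b/,
  "C++"         => /^\s*(template\s*<|class\s+\w+|namespace\s+\w+|using\s+namespace\b)/
}.freeze

# Returns the first language whose pattern matches the source text, or nil
# when nothing matches, so the caller can defer to another strategy instead
# of guessing.
def guess_language(source, rules = RULES)
  match = rules.find { |_language, pattern| source.match?(pattern) }
  match && match.first
end

puts guess_language("@interface Foo : NSObject\n@end\n").inspect
puts guess_language("int add(int a, int b);\n").inspect
```

Returning `nil` for ambiguous input, rather than picking a default language, is what keeps a sketch like this from producing false positives.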

@@ -119,7 +127,6 @@ This process can help differentiate between, for example, `.h` files which could

Misclassifications can often be solved by either adding a new filename or extension for the language or adding more [samples][] to make the classifier smarter.


## Fixing syntax highlighting

Syntax highlighting in GitHub is performed using TextMate-compatible grammars.
@@ -198,23 +205,18 @@ Here's our current build status: [![Actions Status](https://github.com/github/li
Linguist is maintained with :heart: by:

- **@Alhadis**
- **@larsbrinkhoff**
- **@lildude** (GitHub staff)
- **@pchaigno**
lildude marked this conversation as resolved.

As Linguist is a production dependency for GitHub we have a couple of workflow restrictions:

- Anyone with commit rights can merge Pull Requests provided that there is a :+1: from a GitHub staff member.
- Releases are performed by GitHub staff so we can ensure github.com always stays up to date with the latest release of Linguist and there are no regressions in production.


[grammars]: /vendor/README.md
[heuristic]: https://github.com/github/linguist/blob/master/lib/linguist/heuristics.yml
[languages]: /lib/linguist/languages.yml
[licenses]: https://github.com/github/linguist/blob/9b1023ed5d308cb3363a882531dea1e272b59977/vendor/licenses/config.yml#L4-L15
[new-issue]: https://github.com/github/linguist/issues/new
[samples]: /samples
[search-example]: https://github.com/search?utf8=%E2%9C%93&q=extension%3Aboot+NOT+nothack&type=Code&ref=searchresults
[gpr]: https://docs.github.com/packages/using-github-packages-with-your-projects-ecosystem/configuring-rubygems-for-use-with-github-packages
[#5756]: https://github.com/github/linguist/issues/5756
[merged-pr]: /docs/troubleshooting.md#my-linguist-pr-has-been-merged-but-gitHub-doesnt-reflect-my-changes
10 changes: 6 additions & 4 deletions README.md
@@ -2,9 +2,6 @@

[![Actions Status](https://github.com/github/linguist/workflows/Run%20Tests/badge.svg)](https://github.com/github/linguist/actions)

[issues]: https://github.com/github/linguist/issues
[new-issue]: https://github.com/github/linguist/issues/new

This library is used on github.com to detect blob languages, ignore binary or vendored files, suppress generated files in diffs, and generate language breakdown graphs.

## Documentation
@@ -30,6 +27,7 @@ Accordingly, we highly recommend you install a version of Ruby using Homebrew, `

Linguist uses [`charlock_holmes`](https://github.com/brianmario/charlock_holmes) for character encoding and [`rugged`](https://github.com/libgit2/rugged) for libgit2 bindings for Ruby.
These components have their own dependencies.

1. charlock_holmes
* cmake
* pkg-config
@@ -95,6 +93,7 @@ $ github-linguist
#### Additional options

##### `--rev REV`

The `--rev REV` flag will change the git revision being analyzed to any [gitrevisions(7)](https://git-scm.com/docs/gitrevisions#_specifying_revisions) compatible revision you specify.

This is useful to analyze the makeup of a repo as of a certain tag, or in a certain branch.
@@ -118,12 +117,14 @@ $ github-linguist jekyll
```

And here is Jekyll's published website, from the gh-pages branch inside their repository.

```console
$ github-linguist jekyll --rev origin/gh-pages
100.00% 2568354 HTML
```

##### `--breakdown`

The `--breakdown` or `-b` flag will additionally show the breakdown of files by language.

You can try running `github-linguist` on the root directory in this repository itself:
@@ -149,6 +150,7 @@ lib/linguist.rb
```

##### `--json`

The `--json` or `-j` flag outputs the data in JSON format.

```console
@@ -157,6 +159,7 @@ $ github-linguist --json
```
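
As a rough sketch of how the JSON output might be consumed from Ruby, the snippet below ranks languages by size. It assumes the output maps each language name to an object with `size`, `percentage`, and `files` keys, as in the README's `--breakdown --json` sample. The `rank_by_size` helper is hypothetical, and the sample string is an abbreviated copy of that output with shortened file lists.

```ruby
require "json"

# Hypothetical helper: parses github-linguist JSON output and returns
# [language, info] pairs ordered from largest to smallest size.
def rank_by_size(json_text)
  JSON.parse(json_text).sort_by { |_language, info| -info["size"] }
end

# Abbreviated sample in the shape printed by `github-linguist --breakdown --json`.
sample = '{"Dockerfile":{"size":1212,"percentage":"0.31","files":["Dockerfile","tools/grammars/Dockerfile"]},' \
         '"Ruby":{"size":264519,"percentage":"66.84","files":["Gemfile","Rakefile"]}}'

rank_by_size(sample).each do |language, info|
  puts format("%-12s %8d bytes  %6s%%  (%d files)",
              language, info["size"], info["percentage"], info["files"].length)
end
```

Note that `percentage` arrives as a string in the sample, so convert it with `Float(...)` before doing arithmetic on it.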

This option can be used in conjunction with `--breakdown` to get a full list of files along with the size and percentage data.

```console
$ github-linguist --breakdown --json
{"Dockerfile":{"size":1212,"percentage":"0.31","files":["Dockerfile","tools/grammars/Dockerfile"]},"Ruby":{"size":264519,"percentage":"66.84","files":["Gemfile","Rakefile","bin/git-linguist","bin/github-linguist","ext/linguist/extconf.rb","github-linguist.gemspec","lib/linguist.rb",...]}}
@@ -213,7 +216,6 @@ lib/linguist.rb

Please check out our [contributing guidelines](CONTRIBUTING.md).


## License

The language grammars included in this gem are covered by their repositories' respective licenses.
10 changes: 5 additions & 5 deletions docs/troubleshooting.md
@@ -7,11 +7,11 @@ If the language stats bar is reporting a language that you don't expect:
1. Click on the name of the language in the stats bar to see a list of the files that are identified as that language.
Keep in mind this performs a search so the [code search restrictions][search-limits] may result in files identified in the language statistics not appearing in the search results.
[Installing Linguist locally](/README.md/#installation) and running it from the [command line](/README.md#command-line-usage) will give you accurate results.
1. If you see files that you didn't write in the search results, consider moving the files into one of the [paths for vendored code](/lib/linguist/vendor.yml), or use the [manual overrides](/docs/overrides.md) feature to ignore them.
1. If the files are misclassified, search for [open issues](https://github.com/github/linguist/issues) to see if anyone else has already reported the issue.
2. If you see files that you didn't write in the search results, consider moving the files into one of the [paths for vendored code](/lib/linguist/vendor.yml), or use the [manual overrides](/docs/overrides.md) feature to ignore them.
3. If the files are misclassified, search for [open issues](https://github.com/github/linguist/issues) to see if anyone else has already reported the issue.
Any information you can add, especially links to public repositories, is helpful.
You can also use the [manual overrides](/docs/overrides.md) feature to correctly classify them in your repository.
1. If there are no reported issues of this misclassification, [open an issue](https://github.com/github/linguist/issues/new) and include a link to the repository or a sample of the code that is being misclassified.
4. If there are no reported issues of this misclassification, [open an issue](https://github.com/github/linguist/issues/new) and include a link to the repository or a sample of the code that is being misclassified.

[search-limits]: https://docs.github.com/github/searching-for-information-on-github/searching-code#considerations-for-code-search

@@ -31,8 +31,8 @@ Linguist does not consider [vendored code](/docs/overrides.md#vendored-code), [g
If the language statistics bar is not showing your language at all, it could be for a few reasons:

1. Linguist doesn't know about your language.
1. The extension you have chosen is not associated with your language in [`languages.yml`](/lib/linguist/languages.yml).
1. All the files in your repository fall into one of the categories listed above that Linguist excludes by default.
2. The extension you have chosen is not associated with your language in [`languages.yml`](/lib/linguist/languages.yml).
3. All the files in your repository fall into one of the categories listed above that Linguist excludes by default.

If Linguist doesn't know about the language or the extension you're using, consider [contributing](/CONTRIBUTING.md) to Linguist by opening a pull request to add support for your language or extension.
For everything else, you can use the [manual overrides](/docs/overrides.md) feature to tell Linguist to include your files in the language statistics.