Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle file extension conflicts in --list-languages #1135

Merged
merged 6 commits into from
Sep 20, 2020

Conversation

Kienyew
Copy link
Contributor

@Kienyew Kienyew commented Aug 8, 2020

Fix #1076.

Now the each outputted extension should be unique to only one language. Below is the diff output of the old bat --list-languages and new bat --list-languages.

13c13
< C:c,h
---
> C:c
49c49
< GLSL:vs,fs,gs,vsh,fsh,gsh,vshader,fshader,gshader,vert,frag,geom,tesc,tese,comp,glsl
---
> GLSL:vs,gs,vsh,fsh,gsh,vshader,fshader,gshader,vert,frag,geom,tesc,tese,comp,glsl
70,71c70,71
< JavaScript:js,htc
< JavaScript (Babel):js,mjs,jsx,babel,es6,cjs
---
> JavaScript:htc
> JavaScript (Babel):js,mjs,jsx,babel,es6,cjs,*.js
92,93c92,93
< Objective-C:m,h
< Objective-C++:mm,M,h
---
> Objective-C:m
> Objective-C++:mm,M
119c119
< Ruby Haml:haml,sass
---
> Ruby Haml:haml
123c123
< Sass:sass
---
> Sass:sass,*.sass

But since #1035, some of the output is appearing in the form Lang ext, *.ext.
For instance in the C++ entry of bat --list-languages, containing both h and *.h:

...
C++                                 cpp, cc, cp, cxx, c++, C, h, hh, hpp, hxx, h++, inl, ipp, *.h
...

and likewise

...
Sass                                 sass, *.sass
...
F#                                   fs, fsi, fsx, *.fs
...
JavaScript (Babel)                   js, mjs, jsx, babel, es6, cjs, *.js
...

Should this be the desired behavior? @sharkdp

@@ -24,6 +24,12 @@ impl<'a> SyntaxMapping<'a> {
let mut mapping = Self::empty();
mapping.insert("*.h", MappingTarget::MapTo("C++")).unwrap();
mapping.insert("*.fs", MappingTarget::MapTo("F#")).unwrap();
mapping
.insert("*.js", MappingTarget::MapTo("JavaScript (Babel)"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we should be able to implement this feature without adding theses new syntax mappings. Instead of just querying the syntax_mapping, we would need to run the full syntax detection on the test.xx filename.

Copy link
Contributor Author

@Kienyew Kienyew Aug 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ops, I was just following the similar commits before and thought it would do 😅, should I revert it?
Also I doesn't quite understand what is the "full syntax detection" means, could you please do a little explanation for me?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please revert this.

By "full syntax detection", I mean: determine the syntax in the same way that bat usually does this for a normal file. This involves not just checking the syntax mapping, but also the full database of syntaxes. See src/assets.rs => HighlightingAssets::get_syntax.

let test_file = Path::new("test").with_extension(extension);
match config.syntax_mapping.get_syntax_for(test_file) {
Some(MappingTarget::MapTo(primary_lang)) => lang_name == primary_lang,
Some(MappingTarget::MapToUnknown) => true,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, it should be false.

@sharkdp
Copy link
Owner

sharkdp commented Aug 16, 2020

Thank you very much for your contribution!

Could you please add an entry to the "unreleased" section in CHANGELOG.md? The format for entries is:

- Description what has been changed, see #123 (@user)

where #123 links to the bug ticket and/or the PR and user is your username.

@sharkdp
Copy link
Owner

sharkdp commented Aug 16, 2020

But since #1035, some of the output is appearing in the form Lang ext, *.ext.
For instance in the C++ entry of bat --list-languages, containing both h and *.h:
Should this be the desired behavior?

I think we can leave this for now. The fact that the list contains h (and *.h) really means that files named h would be highlighted with the C++ syntax. This is a limitation of sublime syntax patterns (that we are partially addressing in bat through syntax_mapping.rs).

@@ -24,6 +24,12 @@ impl<'a> SyntaxMapping<'a> {
let mut mapping = Self::empty();
mapping.insert("*.h", MappingTarget::MapTo("C++")).unwrap();
mapping.insert("*.fs", MappingTarget::MapTo("F#")).unwrap();
mapping
.insert("*.js", MappingTarget::MapTo("JavaScript (Babel)"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please revert this.

By "full syntax detection", I mean: determine the syntax in the same way that bat usually does this for a normal file. This involves not just checking the syntax mapping, but also the full database of syntaxes. See src/assets.rs => HighlightingAssets::get_syntax.

@sharkdp
Copy link
Owner

sharkdp commented Sep 20, 2020

Thank you very much for the updates. How does the diff for bat --list-languages look like now?

@Kienyew
Copy link
Contributor Author

Kienyew commented Sep 20, 2020

Thank you very much for the updates. How does the diff for bat --list-languages look like now?

This is the new diff for the new bat --list-languages by diff old.txt new.txt so far.

13c13
< C:c,h
---
> C:c
49c49
< GLSL:vs,fs,gs,vsh,fsh,gsh,vshader,fshader,gshader,vert,frag,geom,tesc,tese,comp,glsl
---
> GLSL:vs,gs,vsh,fsh,gsh,vshader,fshader,gshader,vert,frag,geom,tesc,tese,comp,glsl
70c70
< JavaScript:js,htc
---
> JavaScript:htc
92,93c92,93
< Objective-C:m,h
< Objective-C++:mm,M,h
---
> Objective-C:m
> Objective-C++:mm,M
119c119
< Ruby Haml:haml,sass
---
> Ruby Haml:haml

@sharkdp sharkdp merged commit 83c7750 into sharkdp:master Sep 20, 2020
@sharkdp
Copy link
Owner

sharkdp commented Sep 20, 2020

Thank you very much!

@Kienyew Kienyew changed the title Handle file extension conflicts in --list-languages (#1076) Handle file extension conflicts in --list-languages Sep 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improved handling of file-extension conflicts
2 participants