Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bat language coloring detection seems to be case sensitive on Windows #2520

Closed
SlyFoxHayes opened this issue Mar 30, 2023 · 3 comments
Closed
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@SlyFoxHayes
Copy link

I have a nuget config file, named NuGet.Config.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <solution>
    <add key="disableSourceControlIntegration" value="true" />
  </solution>
</configuration>

I also have the .config extension mapped to XML in my bat config file:
--map-syntax='*.config:XML'

With the command bat NuGet.Config, no coloring is displayed.
With the command bat NuGet.config, coloring is displayed.

If I rename the file so that the extension is lowercase (NuGet.config), the color/nocolor behavior is the same:

With the command bat NuGet.Config, no coloring is displayed.
With the command bat NuGet.config, coloring is displayed.

What steps will reproduce the bug?

  1. use bat to display a file where the extension is either not all lowercase, or not all lowercase was typed on the command line.

What happens?

The file is displayed without syntax coloring.

What did you expect to happen instead?

The file is displayed with syntax coloring.

How did you install bat?

chocolatey


bat version and environment

Software version

bat 0.22.1 (e5d9579)

Operating system

Windows 6.2.9200

Command-line

C:\ProgramData\chocolatey\lib\Bat\tools\bat-v0.22.1-x86_64-pc-windows-msvc\bat.exe NuGet.Config --diagnostic

Environment variables

SHELL=<not set>
PAGER=<not set>
LESS=<not set>
LANG=<not set>
LC_ALL=<not set>
BAT_PAGER="less -R -F"
BAT_CACHE_PATH=<not set>
BAT_CONFIG_PATH=x:\bin\bat.shconfig
BAT_OPTS=<not set>
BAT_STYLE=<not set>
BAT_TABS=<not set>
BAT_THEME="Solarized (dark)"
XDG_CONFIG_HOME=<not set>
XDG_CACHE_HOME=<not set>
COLORTERM=truecolor
NO_COLOR=<not set>
MANPAGER=<not set>

System Config file

Could not read contents of 'C:\ProgramData\bat\config': The system cannot find the path specified. (os error 3).

Config file

# This is `bat`s configuration file. Each line either contains a comment or
# a command-line option that you want to pass to `bat` by default. You can
# run `bat --help` to get a list of all possible configuration options.

# Specify desired highlighting theme (e.g. "TwoDark"). Run `bat --list-themes`
# for a list of all available themes
## env BAT_THEME is used
##--theme='Solarized (dark)'

# Enable this to use italic text on the terminal. This is not supported on all
# terminal emulators (like tmux, by default):
--italic-text='always'

# Uncomment the following line to disable automatic paging: --paging=never

# Uncomment the following line if you are using less version >= 551 and want to
# enable mouse scrolling support in `bat` when running inside tmux. This might
# disable text selection, unless you press shift. #--pager="less --RAW-CONTROL-CHARS --quit-if-one-screen --mouse"
## env BAT_PAGER is used
##--pager "less -R -F"

# Syntax mappings: map a certain filename pattern to a language.
#   Example 1: use the C++ syntax for Arduino .ino files
#   Example 2: Use ".gitignore"-style highlighting for ".ignore" files
#--map-syntax "*.ino:C++"
--map-syntax '.ignore:Git Ignore'
--map-syntax '*.*ignore:Git Ignore'
--map-syntax '*.shconfig:Bourne Again Shell (bash)'
--map-syntax='*.conf:INI'
--map-syntax='*.config:XML'
--map-syntax='*.btm:Batch File'
## Visual Studio
--map-syntax='*.props:XML'
--map-syntax='*.targets:XML'
--map-syntax='*.csproj:XML'
--map-syntax='*.csproj.user:XML'
--map-syntax='*.vbproj:XML'
--map-syntax='*.vbproj.user:XML'
--map-syntax='*.vcxproj:XML'
--map-syntax='*.vcxproj.filters:XML'
--map-syntax='*.vcxproj.filters.user:XML'
## Data Comm
--map-syntax='*.soc:JSON'
--map-syntax='*.eipx:XML'


# https://github.com/sharkdp/bat
# https://github.com/sharkdp/bat/blob/master/doc/alternatives.md

### Integration with 4nt ###
# in 4start.btm, I documented the original aliases for 'list' and 'type'.
# both have been now aliased to use bat.
# There is no no difference between  'type' and 'list'

### Integration with code ###
# This file was made a .shconfig, for shell config.
# Generally, .config files are xml files, and code was complaining about formatting, and I always had to remap the syntax to shellscript
# Code now has a mapping for .shconfig to shell script.

Custom assets metadata

Could not read contents of 'C:\Users\thayes\AppData\Local\bat\metadata.yaml': The system cannot find the path specified. (os error 3).

Custom assets

'C:\Users\thayes\AppData\Local\bat' not found

Compile time information

  • Profile: release
  • Target triple: x86_64-pc-windows-msvc
  • Family: windows
  • OS: windows
  • Architecture: x86_64
  • Pointer width: 64
  • Endian: little
  • CPU features: fxsr,sse,sse2
  • Host: x86_64-pc-windows-msvc

Less version

> C:\ProgramData\chocolatey\bin\less.exe --version
less 608 (Spencer V8 regular expressions)
Copyright (C) 1984-2022  Mark Nudelman

less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Home page: https://greenwoodsoftware.com/less
@SlyFoxHayes SlyFoxHayes added the bug Something isn't working label Mar 30, 2023
@SlyFoxHayes SlyFoxHayes changed the title bat language coloring detection seems to be case sensitive on windows bat language coloring detection seems to be case sensitive on Windows Mar 30, 2023
@keith-hall
Copy link
Collaborator

Thanks for reporting. I wanted to say that I thought we had fixed this (#1807 / #2181) but as the fix was in syntect, it probably doesn't apply for bat's --map-syntax.

@keith-hall keith-hall added the good first issue Good for newcomers label Mar 30, 2023
AmbryN pushed a commit to AmbryN/bat that referenced this issue Aug 28, 2023
@AmbryN
Copy link
Contributor

AmbryN commented Aug 28, 2023

Hello !

I have reproduced the bug on Windows and I can confirm the --map-syntax is indeed case sensitive.

After digging through the code in syntax_mapping.rs, I could find that the inserted mapping are case sensitive by default:

pub fn insert(&mut self, from: &str, to: MappingTarget<'a>) -> Result<()> {
    let glob = GlobBuilder::new(from)
        .case_insensitive(false)
        .literal_separator(true)
        .build()?;
    self.mappings.push((glob.compile_matcher(), to));
    Ok(())
}

This toggles the boolean value in GlobOptions in the GlobBuilder which in turns produces a Glob with the same options and a regex pattern built with or without an added (?i) (case insensivity):

Ok(Glob {
    glob: self.glob.to_string(),
    re: tokens.to_regex_with(&self.opts),
    opts: self.opts,
    tokens: tokens,
})
fn to_regex_with(&self, options: &GlobOptions) -> String {
    let mut re = String::new();
    re.push_str("(?-u)");
    if options.case_insensitive {
        re.push_str("(?i)");
    }
    re.push('^');
    // Special case. If the entire glob is just `**`, then it should match
    // everything.
    if self.len() == 1 && self[0] == Token::RecursivePrefix {
        re.push_str(".*");
        re.push('$');
        return re;
    }
    self.tokens_to_regex(options, &self, &mut re);
    re.push('$');
    re
}

A first approach to fix this would be to simply change the default behavior to be .case_insensitive(true): would this have an impact on particular matching rules that absolutely need to be case sensitive ?

If this is not possible due to some matching needing case sensitivity, it would require to add a an option which needs to be picked up in the config file to tell if a matching rule needs to be case insensitive.

I have implemented the first one and tested it on Windows: it works as expected. I ran all tests and integration tests which all passed: I didn't see how I could add one for the specific change though.

I will open a pull request with that change if there is no need to catter to specific case sensitive mapping.

AmbryN pushed a commit to AmbryN/bat that referenced this issue Sep 14, 2023
@keith-hall
Copy link
Collaborator

the PR fixing this was merged a while ago, so I think this can be safely closed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants