Weird characters showing up #98

Closed
ghost opened this issue May 11, 2018 · 21 comments
Labels
bug Something isn't working help wanted Extra attention is needed

ghost commented May 11, 2018

See pic below. I guess it is the lines that are not being drawn properly.
How can I fix it?

thx

1526068458

Owner

sharkdp commented May 12, 2018

Oh dear 😄

It looks like your terminal cannot print the Unicode box-drawing characters that bat uses for its grid lines - yes. Which terminal (and which font) do you use?

I wonder if there is some way to detect this. bat could then maybe fall back to ASCII characters in this case.
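One heuristic for such a detection is to ask the locale which character set it uses and pick the separator accordingly. The sketch below is only an illustration and not how bat itself behaves: the helper name `charset_is_utf8` and the variable `separator` are made up for this example, and a check like this only inspects the locale, so it cannot notice a terminal font or a pager that mishandles the output.

```shell
# Hypothetical helper: succeed if the given charset name denotes UTF-8.
charset_is_utf8() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` reports the character set of the current locale.
# If the command is missing or fails, the empty result selects the fallback.
if charset_is_utf8 "$(locale charmap 2>/dev/null)"; then
    separator=$(printf '\342\224\202')   # U+2502, written as octal UTF-8 bytes
else
    separator='|'                        # plain ASCII fallback
fi

printf '%s\n' "$separator"
```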

@sharkdp sharkdp added the help wanted Extra attention is needed label May 12, 2018
Author

ghost commented May 13, 2018

I'm using termite.
I think the font I'm using is Hack. I'm on Arch Linux. How do I check that?

Author

ghost commented May 13, 2018

Never mind, it had to do with my locale, which was not set. Thx for the help!

@ghost ghost closed this as completed May 13, 2018
Owner

sharkdp commented May 13, 2018

Thank you for the feedback.

Berjou commented May 13, 2018

Hi, I got the same problem with urxvt, with DejaVu Sans Mono. However, my locale is set.

$ locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
$ bat --version  
bat 0.3.0

Any idea on how to fix this?
Thx.

@sharkdp sharkdp reopened this May 13, 2018
Owner

sharkdp commented May 16, 2018

Does this work for you?

printf "\xE2\x94\x82\n"

It should print this Unicode character: │ (U+2502).
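To separate what the shell emits from what the terminal and font render, the same three bytes can also be dumped with `od`. This is a small sketch rather than part of the original exchange: the variable name `bar` is illustrative, and octal escapes are used because `\x` escapes in `printf` are an extension that not every shell provides.

```shell
# U+2502 (BOX DRAWINGS LIGHT VERTICAL) encoded as UTF-8 is E2 94 82.
# The octal escapes \342 \224 \202 are the same three bytes.
bar=$(printf '\342\224\202')

# Show the character as the terminal renders it.
printf '%s\n' "$bar"

# Show the raw bytes, independent of font and locale.
printf '%s' "$bar" | od -An -tx1
```

If the second command lists the bytes e2 94 82 but the first line looks wrong, the bytes are fine and the problem lies in how they are displayed.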

Berjou commented May 16, 2018

Yes it works.

Owner

sharkdp commented May 18, 2018

Could you please show a screenshot of

printf "\xE2\x94\x82\n" | bat

I cannot reproduce this with urxvt (although I did not configure the font).

Berjou commented May 19, 2018

Here is the result with urxvt:
2018-05-19-063534_1363x667_scrot

And here is the same result with a default xterm:
2018-05-19-063624_1361x514_scrot

Owner

sharkdp commented May 19, 2018

Thank you for the screenshots. This is weird ... I am running out of ideas here. The font does not seem to be the problem, I think.

Could you try some methods suggested in the urxvt FAQ, i.e.:

  • Make sure that LC_CTYPE is set to a UTF-8 locale when starting urxvt:
    LC_CTYPE=en_US.UTF-8 urxvt
    
  • Try changing urxvt's encoding:
    printf '\33]701;%s\007' en_US.UTF-8
    

Berjou commented May 20, 2018

Thank you for your time. Sadly, none of these commands solved the problem.

ghuls added a commit to ghuls/bat that referenced this issue May 22, 2018
Fix launching of pager so all control characters are interpreted
instead of only ANSI "color" escape sequences.

This fixes issue sharkdp#98

ghuls commented May 22, 2018

I noticed the same issue with bat 0.3.0 from crates.io.

The following works fine:

bat file.txt

This doesn't (launching bat with the pager manually, with the same arguments as in the git version):

bat file.txt | less --RAW-CONTROL-CHARS --no-init --quit-if-one-screen

This works:

bat file.txt | less --raw-control-chars --no-init --quit-if-one-screen

Less manpage:

       -r or --raw-control-chars
              Causes "raw" control characters to be displayed. The default is to display control characters using the caret notation; for example, a control-A (octal 001) is displayed as "^A". Warning: when the -r option is used, less cannot keep track of the actual appearance of the screen (since this depends on how the screen responds to each type of control character). Thus, various display problems may result, such as long lines being split in the wrong place.

       -R or --RAW-CONTROL-CHARS
              Like -r, but only ANSI "color" escape sequences are output in "raw" form. Unlike -r, the screen appearance is maintained correctly in most cases. ANSI "color" escape sequences are sequences of the form:

                   ESC [ ... m

              where the "..." is zero or more color specification characters. For the purpose of keeping track of screen appearance, ANSI color escape sequences are assumed to not move the cursor. You can make less think that characters other than "m" can end ANSI color escape sequences by setting the environment variable LESSANSIENDCHARS to the list of characters which can end a color escape sequence. And you can make less think that characters other than the standard ones may appear between the ESC and the m by setting the environment variable LESSANSIMIDCHARS to the list of characters which can appear.
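For readers who have not seen one, the sketch below builds a sequence of the `ESC [ ... m` form described in the excerpt and makes its bytes visible. The helper name `colorize` is hypothetical and only serves this illustration; `31` is the standard parameter for a red foreground and `0` resets the attributes.

```shell
# Hypothetical helper: wrap text in an ANSI "color" sequence, ESC [ ... m.
# $1 = color parameters (for example 31 for a red foreground), $2 = text.
colorize() {
    printf '\033[%sm%s\033[0m' "$1" "$2"
}

# A terminal that interprets the sequence shows the text in red.
colorize 31 "red text"; printf '\n'

# Dump the bytes so the escape characters are visible instead of interpreted.
colorize 31 "x" | od -An -c
```

With `-R`, less passes sequences of this shape through unchanged, while other control characters are still shown in caret notation.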

Pull request: #143

ghuls commented May 22, 2018

For me, less --RAW-CONTROL-CHARS was not working on this version of less:

$ less -V
less 436
Copyright (C) 1984-2009 Mark Nudelman

less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Homepage: http://www.greenwoodsoftware.com/less

On another server, we have another version of less, and there less --RAW-CONTROL-CHARS works:

$ less -V
less 458 (GNU regular expressions)
Copyright (C) 1984-2012 Mark Nudelman

less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Homepage: http://www.greenwoodsoftware.com/less

ghuls commented May 22, 2018

I got the "less 436" version working with the following:

LESSCHARSET='UTF-8' less --RAW-CONTROL-CHARS --no-init --quit-if-one-screen

According to the man page of less:

If neither LESSCHARSET nor LESSCHARDEF is set, but any of the strings "UTF-8", "UTF8", "utf-8" or "utf8" is found in the LC_ALL, LC_TYPE or LANG environment variables, then the default character set is utf-8.

I have this setting in my env on that server (it should pick up the LANG setting to find out about UTF-8, but it seems to ignore it):

echo LC_ALL=$LC_ALL LC_TYPE=$LC_TYPE LANG=$LANG
LC_ALL=C LC_TYPE= LANG=en_US.UTF-8
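A possible explanation, which is an assumption and not something confirmed in this thread: under the POSIX locale rules a non-empty `LC_ALL` overrides every other locale variable, so `LC_ALL=C` would hide the UTF-8 setting in `LANG` from any program that asks the C library for the current character set. Note also that the standard variable is `LC_CTYPE`; `LC_TYPE` is not one of the locale variables, so it is expected to be empty. The helper `effective_ctype` below is hypothetical and only illustrates the precedence.

```shell
# Hypothetical helper: print the value that decides the character-type locale,
# following the POSIX precedence LC_ALL > LC_CTYPE > LANG.
# The three values are passed as arguments so the function is easy to test.
effective_ctype() {
    lc_all=$1 lc_ctype=$2 lang=$3
    if [ -n "$lc_all" ]; then
        printf '%s\n' "$lc_all"
    elif [ -n "$lc_ctype" ]; then
        printf '%s\n' "$lc_ctype"
    elif [ -n "$lang" ]; then
        printf '%s\n' "$lang"
    else
        printf 'C\n'   # nothing set: assume the C/POSIX default
    fi
}

# The values reported above: LC_ALL=C takes priority over LANG=en_US.UTF-8.
effective_ctype "C" "" "en_US.UTF-8"   # prints "C"
```

Under that reading, unsetting `LC_ALL` or setting `LESSCHARSET` explicitly would both be ways to get a UTF-8 character set in less.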

Owner

sharkdp commented May 22, 2018

@ghuls That is a good hint! I didn't think about the possibility that less could be the culprit.

@Berjou Could you please try bat with the --paging=never option? If that works, could you try to set LESSCHARSET, as @ghuls suggested?

@ghuls I'm not sure if #143 is really the fix that we need. It doesn't seem to me that we output any "raw control characters" except for the ANSI color sequences?

ghuls added a commit to ghuls/bat that referenced this issue May 22, 2018
ghuls added a commit to ghuls/bat that referenced this issue May 22, 2018

ghuls commented May 22, 2018

@sharkdp You are right about the #143 pull request. I force pushed the probably proper fix to that pull request now, so if you can check it again.

Owner

sharkdp commented May 22, 2018

> @sharkdp You are right about the #143 pull request. I force pushed the probably proper fix to that pull request now, so if you can check it again.

@ghuls Thanks! You seem to have a different issue than @Berjou though, right (since bat file.txt works for you)?

ghuls commented May 22, 2018

@sharkdp bat file.txt worked with the version of bat on crates.io (which did not use a pager yet), but not with the git version.

Berjou commented May 22, 2018

Setting the LESSCHARSET solved the issue for me. Thank you for your time.
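For anyone applying the same workaround to a release without the fix, one way to make it persistent is to export the variable from a shell startup file. This is a minimal sketch; which startup file to use (for example `~/.profile`) is an assumption that depends on the shell and system.

```shell
# Keep any value that is already set; otherwise default to UTF-8,
# the value used in the workaround above.
export LESSCHARSET="${LESSCHARSET:-UTF-8}"

# Confirm what child processes such as less will see.
printf 'LESSCHARSET=%s\n' "$LESSCHARSET"
```

Using a default instead of an unconditional assignment leaves an existing, deliberately chosen character set untouched.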

sharkdp pushed a commit that referenced this issue May 22, 2018
@sharkdp sharkdp added the bug Something isn't working label May 22, 2018
Owner

sharkdp commented May 22, 2018

@ghuls I see. Thank you for the clarification and for #143!

@Berjou Thank you for checking!

@sharkdp sharkdp closed this as completed May 22, 2018
Owner

sharkdp commented May 31, 2018

Fixed in v0.4
