Chinese character displaying wrong in 'less' command #2592

d270624 · 2019-12-03T07:10:06Z

jerch · 2019-12-03T12:11:30Z

This is not a sufficient issue description - we need at least the following:

What is wrong with the displayed characters? Wrong width? Wrong chars?
What is the expected output?
Minimal example to repro it?

As a bonus, give us some hints about the environment used (OS version and how the data ends up in xterm.js), as there are often hidden encoding issues at these steps.

d270624 · 2019-12-04T02:04:03Z

OS version: centos 7.2
xterm.js version: latest
browser: Chrome 78.0.3904.108 x86 on macOS 10.14.6

Wrong width.
The character is displayed twice, although only a single one was entered.
1. less xxxx.txt
2. Type / and then enter a Chinese character to search for something.

Tyriar · 2019-12-04T17:29:50Z

Works fine for me:

jerch · 2019-12-04T17:48:08Z

@d270624 Please make sure that both systems use a UTF-8 locale. Furthermore, please copy the characters in question here, either as Unicode chars or as their hex values, so we can check which Unicode plane they belong to (there are newer CJK code points that might be handled incorrectly by our current wcwidth).
If you are unsure how to grab the Unicode chars, run the following in Python, which gives the correct UTF-8 byte sequence:

print(repr(u'<your char>'.encode('utf-8')))

and copy the output into a comment.
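The one-liner above can be extended slightly to report the code point and the East Asian Width class along with the bytes. This is a minimal sketch, assuming Python 3; the helper name `describe` is made up for illustration, and the width class comes from the Unicode database bundled with the Python interpreter, which may differ from the wcwidth tables of the system's libc.

```python
import unicodedata

def describe(ch):
    """Collect the facts asked for above for a single character (hypothetical helper)."""
    return {
        "char": ch,
        # Code point in the usual U+XXXX notation
        "codepoint": "U+%04X" % ord(ch),
        # UTF-8 byte sequence as space-separated hex
        "utf8": " ".join("%02x" % b for b in ch.encode("utf-8")),
        # 'W' or 'F' means the character is expected to cover two cells
        "east_asian_width": unicodedata.east_asian_width(ch),
    }

print(describe("好"))
```

Replace '好' with the character that is displayed wrongly and paste the printed result into a comment.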

d270624 · 2019-12-05T02:35:09Z

@jerch @Tyriar The problem occurs when I access the remote CentOS server and run the 'less' command on it. The server is a UTF-8 environment, so I suspect the problem is caused by the remote transmission protocol. The same happens with remote servers in VS Code.

d270624 · 2019-12-05T02:46:02Z

jerch · 2019-12-05T10:21:40Z

@d270624 Does it output the correct chars with correct alignment if you do it locally (without ssh into centOS)?

Can you run the same (locally + with ssh into foreign machine) in the xterm.js demo and check if the issue remains? If so, plz switch logLevel to 'debug' and post the output of console.log here.

Edit: Just saw that this happens within less. Does the shell prompt input work correctly (locally and remote)?

d270624 · 2019-12-06T01:23:54Z

@jerch It works correctly when executed locally.
The problem only occurs when executing 'less' on a remote server.

jerch · 2019-12-06T11:39:43Z

@d270624 Then this is most likely an encoding issue affecting the data on its way to xterm.js (either less itself, the ssh tunnel, docker?). Since you don't show the debug input/output, I cannot help you track this down. For general encoding issues plz have a look at https://xtermjs.org/docs/guides/encoding/.

jerch · 2019-12-06T11:53:32Z

@d270624 Found a way to repro it, stay tuned...

jerch · 2019-12-06T12:45:11Z

Ok, here we go. When '好' is entered at the search line of less, the following is sent to the terminal:

local (Ubuntu 18, Unicode 10): 'CSI K 好 \b \b 好'
remote (Ubuntu 16, Unicode 8): 'CSI K 好 \b 好'

Meaning:

erase everything in the line to the right of the cursor (CSI K)
write char '好'
move the cursor one cell back (\b)
move the cursor one cell back (\b) - only on Ubuntu 18
write char '好' again

So the only difference here is how far the cursor moves back: one cell remotely, two cells locally. This pretty much looks like the systems don't agree on the wcwidth of that char, and indeed, if I run less locally on that remote machine, any emulator shows the same issue with messed-up input at the search line in less.

The bottom line here is that those are incompatibilities between the different Unicode/wcwidth versions used by the systems. While one system thinks this is a half-width char covering one cell, the other sees it as a wide char covering two cells. Since the terminal interface currently has no way to reconcile this, it cannot be fixed. Not sure why less does this weird print + move back + print in the first place, but the differing cell widths are the reason why we see one vs. two \b commands.

Still, there is one bug in xterm.js linked to this issue: we do not erase full-width chars properly, as described in #1779. That's the reason why the char ends up doubled and partly overdrawn.
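The cursor arithmetic behind the two sequences can be replayed in a few lines. This is a minimal sketch, not xterm.js code: `cell_width` and `start_columns` are made-up names, the width rule is a simplification of wcwidth based on Python's `unicodedata`, and CSI K is left out because it does not move the cursor.

```python
import unicodedata

def cell_width(ch):
    """Terminal-side width: 2 cells for East Asian Wide/Fullwidth chars, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def start_columns(stream):
    """Replay printable chars and \\b; return the column where each char is drawn."""
    col = 0
    columns = []
    for ch in stream:
        if ch == "\b":
            col = max(0, col - 1)   # backspace moves the cursor one cell left
        else:
            columns.append(col)     # the char is drawn starting at the cursor
            col += cell_width(ch)   # a wide char advances the cursor by two cells
    return columns

local = "好\b\b好"    # less assumes width 2 and steps back two cells
remote = "好\b好"     # less assumes width 1 and steps back one cell

print(start_columns(local))    # [0, 0] -> second char exactly overwrites the first
print(start_columns(remote))   # [0, 1] -> second char starts inside the first
```

In the remote case the second '好' starts in the right half of the first one, which matches the doubled, partly overdrawn output described above.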

jerch added the needs more info label Dec 3, 2019

d270624 closed this as completed Dec 5, 2019

d270624 reopened this Dec 5, 2019

jerch mentioned this issue Dec 6, 2019

Better handling of fullwidth chars #1779

Closed

jerch mentioned this issue Dec 21, 2019

Fullwidth handling in buffer writes #2644

Merged

Tyriar added this to the 4.4.0 milestone Dec 25, 2019

jerch closed this as completed in #2644 Dec 26, 2019

Tyriar added type/bug Something is misbehaving and removed needs more info labels Jan 6, 2020
