Spurious unicode replacement character when reading emoji from pseudo console #3421

Closed
wez opened this issue Nov 3, 2019 · 4 comments
Labels: Needs-Tag-Fix (doesn't match tag requirements), Resolution-Answered (related to questions that have been answered)

@wez

wez commented Nov 3, 2019

Environment

Windows build number: Microsoft Windows [Version 10.0.18362.418]
Windows Terminal version (if applicable):  Windows Terminal (Preview)
Version: 0.6.2951.0

Any other software? https://github.com/wez/wezterm terminal emulator

Steps to reproduce

Launch cmd.exe in the terminal (eg: just start wezterm.exe) and execute:

bash -c "printf a\\\\360\\\\237\\\\220\\\\267b"

Expected behavior

This should display:

a🐷b
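
As a cross-check on the expected output, the four octal escapes in the printf argument are the UTF-8 encoding of U+1F437 (PIG FACE). A minimal Python sketch using only the standard library; the variable names are illustrative:

```python
# The octal escapes \360 \237 \220 \267 from the printf command, framed by "a" and "b".
payload = b"a" + bytes([0o360, 0o237, 0o220, 0o267]) + b"b"

# Strict decoding raises on malformed input, so success means the bytes are valid UTF-8.
text = payload.decode("utf-8")

# Format the code points as bare hex, the same way the reader log further down does.
codepoints = [format(ord(ch), "x") for ch in text]

print(payload.hex())   # raw bytes as hex
print(codepoints)      # one entry per decoded character
```

A correct terminal pipeline should therefore deliver exactly three code points for this output, with no U+FFFD among them.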

Actual behavior

In wezterm when using conpty this renders as follows, with a spurious unicode replacement character present before the pig face emoji:

a�🐷b

(the replacement character may render as a ? in wezterm depending on font configuration)

wezterm using a direct ssh connection to localhost, spawned via wezterm ssh localhost (conpty is not involved on the client side, but is on the server side), also produces the spurious replacement character.

wezterm using an ssh connection to a unix machine, or via a unix domain socket to WSL doesn't have the replacement character.

The windows terminal preview renders correctly when run locally, but if you:

ssh localhost
bash -c "printf a\\\\360\\\\237\\\\220\\\\267b"

you'll see the offending replacement character in the windows terminal too.

Discussion

I can't explain why the replacement character is present in the data that wezterm reads from the pty; it feels environmental somehow and I'd love to know how to force this to work correctly in code!

For the curious, the pty is set up here and its reader thread is here
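
One general thing worth ruling out on the reader side is a multi-byte sequence being split across two reads, with each chunk decoded on its own. The sketch below is not wezterm's reader code and is not a diagnosis of this issue; it is a hypothetical Python illustration (the function names are invented here) of how an incremental decoder avoids introducing U+FFFD by itself.

```python
import codecs

def decode_chunks_naive(chunks):
    """Decode each chunk independently; a sequence split across chunks is damaged."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)

def decode_chunks_incremental(chunks):
    """Carry incomplete trailing bytes over to the next chunk before decoding."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    parts = [decoder.decode(chunk) for chunk in chunks]
    # Flush: any bytes still pending at end of input are genuinely malformed.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

# The pig face bytes, split in the middle of the four-byte sequence.
chunks = [b"a\xf0\x9f", b"\x90\xb7b"]
naive = decode_chunks_naive(chunks)
incremental = decode_chunks_incremental(chunks)

print([format(ord(ch), "x") for ch in naive])
print([format(ord(ch), "x") for ch in incremental])
```

If a reader already decodes incrementally like the second function, any U+FFFD it reports was present in the byte stream it was handed.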

If I instrument the reader code to print the codepoints and then type:

cls
<up-arrow>
<up-arrow>
<enter>  # this re-runs `bash -c "printf a\\\\360\\\\237\\\\220\\\\267b"`

I get the following data out from the pty, starting with the c, l, s being printed in those first three lines (all codepoints are shown in hex):

 ERROR wezterm::mux                        > read chars: [63]
 ERROR wezterm::mux                        > read chars: [6c]
 ERROR wezterm::mux                        > read chars: [73]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, d, a, 1b, 5b,
3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 48, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 77, 65, 7a, 40, 47, 4c, 4f, 57, 59, 20, 43, 3a, 5c,
3e, 1b, 5b, 36, 36, 58, 1b, 5b, 36, 36, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a,
1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b,
5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b,
38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38,
30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30,
58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58,
1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b,
5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b,
38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38,
30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30,
43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43,
1b, 5b, 32, 3b, 31, 35, 48, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 63, 6c, 73, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 32, 3b, 31, 35, 48, 62, 61, 73, 68, 20, 2d, 63, 20, 22, 70, 72, 69, 6e, 74, 66, 20, 61, 5c, 5c, 5c, 5c, 33, 36, 30, 5c, 5c, 5c, 5c, 32, 33, 37, 5c, 5c, 5c, 5c, 32, 32, 30, 5c, 5c, 5c, 5c, 32, 36, 37, 62, 22, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, d, a, 1b, 5b,
3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 48, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 77, 65, 7a, 40, 47, 4c, 4f, 57, 59, 20, 43, 3a, 5c,
3e, 62, 61, 73, 68, 20, 2d, 63, 20, 22, 70, 72, 69, 6e, 74, 66, 20, 61, 5c, 5c, 5c, 5c, 33, 36,
30, 5c, 5c, 5c, 5c, 32, 33, 37, 5c, 5c, 5c, 5c, 32, 32, 30, 5c, 5c, 5c, 5c, 32, 36, 37, 62, 22,
1b, 5b, 31, 39, 58, 1b, 5b, 31, 39, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b,
5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b,
38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38,
30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30,
58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58,
1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b,
5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b,
38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38,
30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30,
43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43,
d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, 1b,
5b, 33, 3b, 31, 48, 1b, 5d, 30, 3b, 41, 64, 6d, 69, 6e, 69, 73, 74, 72, 61, 74, 6f, 72, 3a, 20,
63, 3a, 5c, 77, 69, 6e, 64, 6f, 77, 73, 5c, 73, 79, 73, 74, 65, 6d, 33, 32, 5c, 63, 6d, 64, 2e,
65, 78, 65, 20, 2d, 20, 62, 61, 73, 68, 20, 20, 2d, 63, 20, 22, 70, 72, 69, 6e, 74, 66, 20, 61,
5c, 5c, 5c, 5c, 33, 36, 30, 5c, 5c, 5c, 5c, 32, 33, 37, 5c, 5c, 5c, 5c, 32, 32, 30, 5c, 5c, 5c,
5c, 32, 36, 37, 62, 22, 7, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 48, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 77, 65, 7a, 40, 47, 4c, 4f, 57, 59, 20, 43, 3a, 5c,
3e, 62, 61, 73, 68, 20, 2d, 63, 20, 22, 70, 72, 69, 6e, 74, 66, 20, 61, 5c, 5c, 5c, 5c, 33, 36,
30, 5c, 5c, 5c, 5c, 32, 33, 37, 5c, 5c, 5c, 5c, 32, 32, 30, 5c, 5c, 5c, 5c, 32, 36, 37, 62, 22,
1b, 5b, 31, 39, 58, 1b, 5b, 31, 39, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b,
5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b,
38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38,
30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30,
58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58,
1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b,
5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b,
38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38,
30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30,
43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43,
d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, 1b,
5b, 33, 3b, 31, 48, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 61, fffd, fffd, 1b, 5b, 3f, 32, 35, 68]
 ERROR wezterm::mux                        > read chars: [1b, 5b, 3f, 32, 35, 6c, 1b, 5b, 48, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 77, 65, 7a, 40, 47, 4c, 4f, 57, 59, 20, 43, 3a, 5c,
3e, 62, 61, 73, 68, 20, 2d, 63, 20, 22, 70, 72, 69, 6e, 74, 66, 20, 61, 5c, 5c, 5c, 5c, 33, 36,
30, 5c, 5c, 5c, 5c, 32, 33, 37, 5c, 5c, 5c, 5c, 32, 32, 30, 5c, 5c, 5c, 5c, 32, 36, 37, 62, 22,
1b, 5b, 31, 39, 58, 1b, 5b, 31, 39, 43, d, a, 61, fffd, 1f437, 62, 1b, 5b, 37, 34, 58, 1b, 5b, 37, 34, 43, d, a, 77, 65, 7a, 40, 47, 4c, 4f, 57, 59, 20, 43, 3a, 5c, 3e, 1b, 5b, 36, 36, 58, 1b, 5b, 36, 36, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d,
a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, d, a, 1b, 5b, 38, 30, 58, 1b, 5b, 38, 30, 43, 1b, 5b, 34, 3b, 31, 35, 48, 1b, 5d, 30, 3b, 41, 64, 6d, 69, 6e, 69, 73, 74, 72, 61, 74, 6f, 72, 3a, 20, 63, 3a, 5c, 77, 69, 6e, 64, 6f, 77, 73, 5c, 73, 79, 73, 74, 65, 6d, 33, 32, 5c, 63, 6d, 64, 2e, 65, 78, 65, 7,
1b, 5b, 3f, 32, 35, 68]

The second-to-last read emits a pair of fffd codepoints for some reason, before the larger final read, which contains the sequence fffd (replacement character), 1f437 (pig face).
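
Spotting the fffd entries by eye in a dump this size is error-prone. A small helper along these lines can locate them; this is a sketch that assumes the log format shown above (a "read chars: [...]" list of bare hex codepoints, possibly wrapped across lines), and the function names are hypothetical:

```python
import re

REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER

def parse_read_chars(entry):
    """Extract the hex codepoints from one 'read chars: [...]' log entry."""
    match = re.search(r"read chars: \[([^\]]*)\]", entry)
    if match is None:
        return []
    # Tokens may carry spaces or newlines from wrapped lines; strip before parsing.
    return [int(tok.strip(), 16) for tok in match.group(1).split(",") if tok.strip()]

def replacement_positions(codepoints):
    """Return the indexes at which U+FFFD occurs."""
    return [i for i, cp in enumerate(codepoints) if cp == REPLACEMENT]

# The second-to-last read from the log above.
entry = (" ERROR wezterm::mux > read chars: "
         "[1b, 5b, 3f, 32, 35, 6c, 61, fffd, fffd, 1b, 5b, 3f, 32, 35, 68]")
codepoints = parse_read_chars(entry)
positions = replacement_positions(codepoints)
print(positions)
```

For that entry the helper reports two hits, immediately after the 61 ("a") and before the cursor-show sequence.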

Another interesting datapoint is that if I run this in cmd.exe:

bash -c "printf a\\\\360\\\\237\\\\220\\\\267b" > c:\users\wez\gah.txt

and then inspect the file from wsl:

$ xxd gah.txt
00000000: 61f0 9f90 b762                           a....b

The file holds exactly the expected bytes (61, f0 9f 90 b7, 62), which shows that it isn't bash inserting the data.

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Nov 3, 2019
@DHowett-MSFT
Contributor

I think this is something we actually fixed in conhost for 20H1. This might sound crazy, but if you replace ...\system32\conhost.exe with a release build of OpenConsole from this repository, does your problem persist?

We had an internal bug tracking this, and we landed it before we went open-source, so there's no commit I can point to that fixed it. 😄

FWIW: I have a locally-built version of PuTTY (pwincon) that uses ConPTY to host a local shell, and it'll fall back to an OpenConsole.exe if one is present in the same folder as pwincon.

Here's what I get:

pwincon with system conhost/ConPTY

image

pwincon with sxs openconsole.exe from master

image
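
The selection step described above (prefer an OpenConsole.exe that sits next to the application, otherwise use the system conhost/ConPTY) can be sketched as a plain file check. This is only an illustration in Python with a hypothetical function name; it is not pwincon's code, and it deliberately does not cover how the found host would be launched, which this comment does not describe.

```python
import tempfile
from pathlib import Path
from typing import Optional

def find_side_by_side_host(app_dir: Path, name: str = "OpenConsole.exe") -> Optional[Path]:
    """Return the console host next to the application, or None to use the system one."""
    candidate = app_dir / name
    return candidate if candidate.is_file() else None

# Demonstration against a throwaway directory standing in for the application folder.
with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp)
    before = find_side_by_side_host(app_dir)        # nothing shipped alongside yet
    (app_dir / "OpenConsole.exe").write_bytes(b"")  # placeholder file, not a real binary
    after = find_side_by_side_host(app_dir)
    found_name = after.name if after else None

print(before, found_name)
```

Returning None rather than raising keeps the system-provided pseudoconsole as the default path when no side-by-side host is present.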

@wez
Author

wez commented Nov 3, 2019

I can confirm that installing a current OpenConsole.exe over conhost.exe (a little scary!) resolves this; thanks!

@wez
Author

wez commented Nov 3, 2019

Can you share how putty falls back to openconsole.exe in code? I'd like to undo the conhost.exe change and build that logic into my pty code :)

@DHowett-MSFT DHowett-MSFT self-assigned this Nov 4, 2019
@DHowett-MSFT DHowett-MSFT removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Nov 4, 2019
@DHowett-MSFT
Contributor

Actually, we'll probably have something in the middle-term future that'll help applications ship against newer versions of the pseudoconsole host 😄

There's an interface for launching a console host (openconsole, conhost) directly into PTY mode, but it's really not intended as a permanent supported API surface. There are a lot of annoying responsibilities that a pty host consumer takes on, which the API and the system-shipped version of conhost take care of. Chief among those is that the pty host needs to match the architecture of the kernel, not of the application requesting a pty 😄
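
To make the architecture point concrete: a 32-bit application on 64-bit Windows cannot simply use its own bitness to pick a host binary. The sketch below is an assumption-laden illustration, not part of any ConPTY API. It relies on the standard Windows environment variables PROCESSOR_ARCHITEW6432 (set only for a 32-bit process running under WOW64) and PROCESSOR_ARCHITECTURE; the function name is hypothetical and other emulation scenarios are not covered.

```python
import os
from typing import Mapping, Optional

def native_architecture(env: Mapping[str, str]) -> Optional[str]:
    """Best-effort OS architecture, as opposed to the architecture of this process."""
    # Under WOW64, PROCESSOR_ARCHITECTURE reports the emulated architecture (x86),
    # while PROCESSOR_ARCHITEW6432 reports the real one (for example AMD64).
    return env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE")

# A 32-bit process on 64-bit Windows versus a native 64-bit process.
wow64_env = {"PROCESSOR_ARCHITECTURE": "x86", "PROCESSOR_ARCHITEW6432": "AMD64"}
native_env = {"PROCESSOR_ARCHITECTURE": "AMD64"}

from_wow64 = native_architecture(wow64_env)
from_native = native_architecture(native_env)
print(from_wow64, from_native)

# In real use the live environment would be passed instead.
current = native_architecture(os.environ)
```

Taking the environment as a parameter keeps the function testable on any platform; on non-Windows systems the final call simply returns None.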

I'm gonna close this one for now and stick it on to #3577, with the caveat that we'll get something not for internal use at some point.

@DHowett-MSFT DHowett-MSFT added the Resolution-Answered Related to questions that have been answered label Nov 23, 2019

2 participants