Spurious unicode replacement character when reading emoji from pseudo console #3421
Labels: Needs-Tag-Fix, Resolution-Answered
Environment
Steps to reproduce
Launch cmd.exe in the terminal (eg: just start wezterm.exe) and execute:
Expected behavior
This should display:
Actual behavior
In wezterm when using conpty this renders as follows, with a spurious unicode replacement character present before the pig face emoji:
(the replacement character may render as a ? in wezterm depending on font configuration)
wezterm using a direct ssh connection to localhost spawned via wezterm ssh localhost (conpty is not involved on the client side, but is on the server side) also produces the spurious replacement character.
wezterm using an ssh connection to a unix machine, or via a unix domain socket to WSL, doesn't have the replacement character.
The Windows Terminal preview renders correctly when run locally, but if you:
you'll see the offending replacement character in Windows Terminal too.
Discussion
I can't explain why the replacement character is present in the data that wezterm reads from the pty; it feels environmental somehow and I'd love to know how to force this to work correctly in code!
For the curious, the pty is set up here and its reader thread is here.
If I instrument the reader code to print the codepoints and then type:
I get the following data out from the pty, starting with the c, l, s being printed in those first three lines (all codepoints are shown in hex):
The second-from-last line emits a pair of fffd codepoints for some reason before the bigger final line which has the fffd (replacement character), 1f437 (pig face) sequence.
Another interesting datapoint is that if I run this in cmd.exe:
and then inspect the file from wsl:
this shows that it isn't bash inserting the data.
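For anyone wanting to reproduce the codepoint dump described above outside of wezterm, here is a minimal Python sketch. It is not wezterm's reader code (that is Rust); the names make_dumper and spurious_replacements are hypothetical helpers made up for this illustration. The first prints the codepoints of each chunk in hex, the second counts U+FFFD characters that are literally present in a byte buffer, such as the contents of a redirected file.

```python
import codecs


def make_dumper():
    """Return a function mapping each chunk read from a pty to hex codepoints.

    Hypothetical helper: an illustration, not wezterm's actual reader.
    """
    # Incremental decoding keeps a multi-byte UTF-8 sequence intact when it is
    # split across two reads; invalid bytes decode to U+FFFD.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def dump(chunk: bytes) -> list[str]:
        return [format(ord(ch), "x") for ch in decoder.decode(chunk)]

    return dump


def spurious_replacements(data: bytes) -> int:
    """Count U+FFFD characters literally encoded in the bytes (ef bf bd)."""
    return data.count("\ufffd".encode("utf-8"))


if __name__ == "__main__":
    dump = make_dumper()
    # Simulated pty output: a replacement character followed by the pig face.
    sample = "\ufffd\U0001f437".encode("utf-8")
    print(dump(b"c"))     # ['63']
    print(dump(sample))   # ['fffd', '1f437']
    print(spurious_replacements(sample))  # 1
```

Because the sketch decodes incrementally, an fffd in its output means the bytes ef bf bd (or an invalid sequence) really arrived from the source, rather than being a side effect of where one read happened to end.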