Pasting a single certain Unicode character results in two garbage characters being pasted instead #9879

Closed
snaar opened this issue Apr 17, 2021 · 7 comments
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Resolution-Duplicate There's another issue on the tracker that's pretty much the same thing.

Comments

@snaar

snaar commented Apr 17, 2021

Windows Terminal version (or Windows build number)

10.0.18363.1440, 1.7.1033.0

Other Software

N/A

Steps to reproduce

  1. Copy a "high" Unicode character (one outside the Basic Multilingual Plane) to the clipboard, for example U+1FB37 '🬷'. You can also copy it from https://www.fileformat.info/info/unicode/char/1fb37/browsertest.htm
  2. Paste it into the terminal.
  3. Optionally, paste it into Notepad for comparison.
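
For context on why this particular character is a useful test case, here is a minimal Python sketch showing that U+1FB37 takes two UTF-16 code units (a surrogate pair). The helper name `to_surrogate_pair` is hypothetical and only for illustration. Whether the two boxes in the terminal correspond to the two halves of the pair being handled separately is an assumption, not something confirmed in this report.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary-plane code point into UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogate_pair(0x1FB37)
print(f"U+1FB37 -> {high:04X} {low:04X}")  # prints "U+1FB37 -> D83E DF37"

# Cross-check against Python's own UTF-16 encoder.
encoded = "\U0001FB37".encode("utf-16-le")
units = [int.from_bytes(encoded[i:i + 2], "little") for i in range(0, len(encoded), 2)]
assert units == [high, low]
```

A consumer that treats each 16-bit unit as its own character would see two unpaired surrogates here, neither of which is displayable on its own.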

Expected Behavior

The actual character or, if the font doesn't support it, a single box character.

Actual Behavior

After pasting, the terminal shows two box characters:
image

If you paste the same thing into Notepad, it shows up correctly:
image

Incidentally, if you paste it into the old cmd.exe, it shows up as a box, but still only a single box:
image

This is possibly related to #9308.

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Apr 17, 2021
@snaar snaar changed the title Pasting certain Unicode characters result in duplicate garbage characters being pasted instead Pasting a single certain Unicode character results in two garbage characters being pasted instead Apr 17, 2021
@zadjii-msft
Member

I think this is actually more broadly #190 and #1472. cmd.exe and Windows PowerShell are notoriously bad at handling Unicode characters.

Er, after reading more carefully, this might even be #9052

@zadjii-msft zadjii-msft added the Area-Input Related to input processing (key presses, mouse, etc.) label Apr 29, 2021
@sba923

sba923 commented May 5, 2021

That's really annoying when pasting file/folder paths copied from File Explorer.

Every day I face paths like:

C:\Users\<REDACTED>\<REDACTED>\Innovation Labs – Projects - Documents
          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 43 3A 5C 55 73 65 72 73 5C 3C 52 45 44 41 43 54 C:\Users\<REDACT
0000000000000010 45 44 3E 5C 3C 52 45 44 41 43 54 45 44 3E 5C 49 ED>\<REDACTED>\I
0000000000000020 6E 6E 6F 76 61 74 69 6F 6E 20 4C 61 62 73 20 E2 nnovation Labs â
0000000000000030 80 93 20 50 72 6F 6A 65 63 74 73 20 2D 20 44 6F �� Projects - Do
0000000000000040 63 75 6D 65 6E 74 73                            cuments

and I can't paste them into PowerShell commands running within Windows Terminal.
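
To make the dump above easier to read, the following Python sketch decodes the relevant bytes. It assumes the dump is UTF-8, which is consistent with the `E2 80 93` sequence. It shows that the problem character is an en dash (U+2013), which occupies a single UTF-16 code unit, unlike the U+1FB37 case in the original report.

```python
# Bytes copied from the dump above, covering "Labs <dash> Projects".
dumped = bytes.fromhex("4C 61 62 73 20 E2 80 93 20 50 72 6F 6A 65 63 74 73")

# Assumption: the dump is UTF-8 encoded text.
text = dumped.decode("utf-8")
dash = text[5]

print(f"U+{ord(dash):04X}")                # prints "U+2013" (EN DASH)
print(len(dash.encode("utf-8")))           # prints 3 (bytes E2 80 93)
print(len(dash.encode("utf-16-le")) // 2)  # prints 1 (a single UTF-16 code unit)

# Reading the same bytes one at a time as Latin-1 gives the "â" seen in the
# Ascii column of the dump, followed by two non-printable control characters.
print(dumped.decode("latin-1")[5])         # prints "â"
```

Because the en dash fits in one code unit, this path problem may have a different cause than the surrogate-pair case, although the thread does not settle that.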

@zadjii-msft
Member

Is this still an issue? If my theory was right that this was #9052, that was fixed in 66b9b9d which shipped in v1.9.1445.0, back in late May 2021.

@zadjii-msft zadjii-msft added the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Feb 23, 2022
@snaar
Author

snaar commented Feb 23, 2022

Still reproducible in 1.11.3471.0.

@ghost ghost added Needs-Attention The core contributors need to come back around and look at this ASAP. and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Feb 23, 2022
@snaar
Author

snaar commented Feb 23, 2022

Actually, it looks like there is a regression in cmd.exe now too: it shows two box characters instead of one like it used to:
image

@zadjii-msft
Member

Okay then this is probably just /dup #1503 which covers the much broader issue here. "emoji" is being used as a synonym for "anything that isn't just a single WCHAR". Thanks for following up!
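
As a rough illustration of the "anything that isn't just a single WCHAR" category, this Python sketch lists the code points in a string that need more than one 16-bit unit in UTF-16. The function name `wide_code_points` is hypothetical, and the check only covers supplementary-plane characters, not multi-code-point sequences such as combining marks or ZWJ emoji.

```python
def wide_code_points(text: str) -> list[str]:
    """Return the code points in text that need a UTF-16 surrogate pair."""
    # Anything above U+FFFF lies outside the Basic Multilingual Plane and
    # therefore cannot be stored in a single 16-bit WCHAR.
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0xFFFF]


print(wide_code_points("A\u2013\U0001FB37"))  # prints ['U+1FB37']
```

A check like this can help decide whether a given paste failure belongs under the surrogate-pair issue or has some other cause.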

@ghost

ghost commented Feb 23, 2022

Hi! We've identified this issue as a duplicate of another one that already exists on this Issue Tracker. This specific instance is being closed in favor of tracking the concern over on the referenced thread. Thanks for your report!

@ghost ghost added Resolution-Duplicate There's another issue on the tracker that's pretty much the same thing. and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements Needs-Attention The core contributors need to come back around and look at this ASAP. labels Feb 23, 2022