Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't crash when reading emoji input in utf8 #12342

Closed
wants to merge 7 commits into from

Conversation

zadjii-msft
Copy link
Member

Summary of the Pull Request

In the two places we read input from the console input buffer, we were not accounting for the fact that unicode input might turn into more than two input records.

InputBuffer::Read tries to account for how big unicode characters will be when read as input events, and only returns as much input as will fit in the caller's buffer. However, we were using IsGlyphFullWidth to check if a glyph was wide. However, that's not always right. SplitToOem might make three chars out of a single key, like in the case of emoji in utf8.

PR Checklist

Validation Steps Performed

@ghost ghost added Area-Input Related to input processing (key presses, mouse, etc.) Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Conhost For issues in the Console codebase Severity-Crash Crashes are real bad news. labels Feb 3, 2022
@github-actions

This comment has been minimized.

Comment on lines +431 to 434
if (virtualReadCount + sizeOfThisKey > readCount)
{
++virtualReadCount;
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop now has two bounds checks instead of one, which might be a bit confusing. We can simplify it by changing the loop to be while (!_storage.empty()) and replacing this if condition with the following:

if (!unicode)
{
    ...
}

virtualReadCount += sizeOfThisKey;
if (virtualReadCount > readCount)
{
    break;
}

if (performNormalRead)
{
    ...
}

Copy link
Member

@lhecker lhecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd approve this code, but I don't get what virtualReadCount is supposed to be. The comment for it implies it counts the number of columns ("DBCS records". Right?), which matches the old code, but the new code counts code points. What if the CP is 65001?
Finally the comment for the readCount parameter implies it counts the number of events, but if that was true we'd only increment virtualReadCount whenever we push into the outEvents vector and we don't do that either.

In other words: I don't really get the code. 😅 I can't judge if this code would break anything else in the project. But it's syntactically correct, so... I'd approve it.

@miniksa
Copy link
Member

miniksa commented Feb 8, 2022

The test is now stuck in an infinite loop so something's wrong here.

@carlos-zamora
Copy link
Member

Circling back on this PR. Assigning @zadjii-msft since the test is stuck on an infinite loop.

@github-actions
Copy link

@check-spelling-bot Report

🔴 Please review

See the 📂 files view or the 📜action log for details.

Unrecognized words (52)
AOn
apimswincoresynchl
AStomps
BGRA
Childitem
Consolescreen
CPPARM
cppm
CPPx
DCompile
DECRST
eachother
FARPROC
fmix
FNV
forground
guiddef
halfwidth
HPAINTBUFFER
HPROPSHEETPAGE
icch
intrin
iwch
ixx
LPCHARSETINFO
ned
newcursor
nlength
nodefaultlib
notmatch
osfhandle
outout
PENDTASKMSG
ppci
PPORT
precendence
Progman
QWERTYUIOP
redistributable
SHANDLE
SHGFP
umd
UWA
WClass
wddmconrenderer
webclient
wfdopen
winsdk
wline
wlinestream
WResult
WSpace
Previously acknowledged words that are now absent aabbcc abbcc ABCDEFGHIJKLMNOPQRSTUVWXY ABORTIFHUNG appcontainer bgra chaof COLORMATRIX CTRLFREQUENCY CTRLVOLUME CUsers dcompile DECARM DECBKM DECCARA DECCRA DECCTR DECERA DECFRA DECPS DECRARA DECRQTSR decrst DECSACE DECSERA DECXCPR DSBCAPS DSBLOCK DSBPLAY DSBUFFERDESC DSBVOLUME dsound DSSCL dxguid ect ENTIREBUFFER eplace Geddy GGI GLOBALFOCUS hicon hrottled icket IGNORELANGUAGE IMPEXP IWIC IXP KOK lpwfx luma NOTIMEOUTIFNOTHUNG nto nugetversions otepad ploc ploca plocm Qaabbcc QUERYOPEN QUZ Qxxxxxxxxxxxxxxx Requiresx sapi SFGAO SFGAOF SFolder SHOWDEFAULT SHOWNA ssa Tdd TOOLWINDOW Tribool umul umulh usp uwa vtapi WAVEFORMATEX wic wincodec windowsterminal wyhash wymix wyr xin xinchaof xinxinchaof XTWINOPS xwwyzz xxyyzz Zabcdefghijklmnopqrstuvwxyz ZYXWVU ZYXWVUTd :arrow_right:
To accept ✔️ these unrecognized words as correct and remove the previously acknowledged and now absent words, run the following commands

... in a clone of the git@github.com:microsoft/terminal.git repository
on the dev/migrie/b/8663-input-to-oem-crash branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.21/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/3568345734/attempts/1'
Errors (1)

See the 📂 files view or the 📜action log for details.

❌ Errors Count
❌ forbidden-pattern 33

See ❌ Event descriptions for more information.

✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. If it doesn't work for you, you can manually add (one word per line) / remove items to expect.txt and the excludes.txt files.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

@lhecker
Copy link
Member

lhecker commented Feb 1, 2023

#14745 is an alternative approach. I've tested it with the repros in #8663 and it works great.

@DHowett
Copy link
Member

DHowett commented Feb 1, 2023

@zadjii-msft, mind if we close this one in favor of Leonard's rewrite?

@zadjii-msft
Copy link
Member Author

please yes

@zadjii-msft zadjii-msft closed this Feb 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Conhost For issues in the Console codebase Severity-Crash Crashes are real bad news.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenConsole crash when pasting emoji
5 participants