-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
terminal/parser: determine what to do when split escape sequence writes show up #4037
Comments
I don't think there's ever such a guarantee. Traffic might go over channels which don't even have this concept (e.g. the serial line). I don't know if the standard OS methods for reading from a serial line wait for more data before returning if data is currently flowing in, or return the already available chunk immediately. This might depend on some setting (e.g. O_NONBLOCK), and obviously on the OS. Traffic might go over layers and channels (e.g. TCP, ssh, tmux...) that are allowed to split or join fragments. That being said, if you press a key which generates a 3 byte sequence, and your terminal emulator writes that in a single step, you can be reasonably certain that it will arrive at the reading app in a single step. I recommend the terminal to write them in a single step. Some applications rely on heuristics (i.e. timing) to figure out what happened, e.g. to distingusing between Alt+[ followed by A, vs. the Up arrow. The most notable example is probably vim's handling of the Esc keypress. It's good to help them.
I can't see how this is relevant there, though. Doesn't this comment mix up the two directions? It's in |
Good point on the possibility of using timing heuristics.
Actually, the new ConHost does need to parse keypresses from VT sequences for backwards compatibility with the old Console API. See the architecture chart from the introducing blog post. |
If, as I understand it, this is about forwarding keypresses over the conpty connection, then I don't think it's worth trying to fix, because it's never going to satisfy everyone, and there are many other ways in which it can fail. In the long term I think the solution is to use a more robust protocol for keypresses that doesn't have these problems (see #879 for discussions along these lines). My personal preference would be the |
On localhost and fast networks you mostly get away with it but I think that is partly because the consequences of it going wrong are pretty mild - the user just presses the key again. There is no atomicity guarantee here, applications need to handle partial reads to be correct. |
I have used arrow keys over VT terminals (mostly with the mentioned terminal emulators) for years with all sorts of hardware and software, including embedded devices, without a problem. It is the de facto standard and only partially supporting it would force many applications to find workarounds.
For clarification: In my case, the issue described above is 100% reproducible, i.e. the arrow keys never work because they are always received as two reads from the (virtual) serial port. I added this to my issue description |
The backend in this case is not a VT application though. It's potentially a legacy Windows app that knows nothing about VT escape sequences, and might expect to be able to differentiate between simple keystrokes like Ctrl+2 and Ctrl+Space, which VT terminals treat as identical escape sequences. And that's not even considering more complicated scenarios, like say a piano playing app that needs to know exactly when each key is pressed and released, and be able to handle multiple keys held down at the same time. If Windows Terminal is going to be a viable replacement for the existing Windows console, it has to able to run existing Windows console apps, and that means the conpty layer needs something more powerful than basic VT sequences for forwarding keystrokes. Any actual VT applications on the server side wouldn't need to care, though. They'll still just receive a simple escape sequence if that's all they want. |
Testing on Windows Server 2019 (Version 10.0.17763.737) SAC / EMS with the same terminal emulator and serial port driver, the console channel type is @j4james |
I don't think you can really claim there is one "correct" way to handle split VT sequences though. Every choice involves some kind of compromise. Typically it would be up to receiving application to decide how to handle things based on their particular needs. And if conpty used a more robust protocol for passing on the keystrokes then it wouldn't need to worry about any of this - the control would be back in the hands of the receiving application. |
@j4james Merriam-Webster, Oxford Dictionary
The "conventional standard" here being the behavior shown (at least) by Linux VTE & other terminals, most embedded devices with serial connections and the Windows Server SAC/EMS itself. |
Searching for relevant standards covering this topic, I found an old manual stating that
Although there seem to be extensions and specification of edge cases making it desirable to implement DEC compatible terminal emulation (see here), the standard that introduced (among others) CSI escape sequences is: ANSI X3.64-1979 (Additional Controls for Use with American National Standard Code for Information Interchange) - Google Books I think the (informational purpose) Appendix section supports the viewpoint that the structure of the input stream should not affect the operation and that subsequent characters after control sequence introductions should be processed as parameters as long as they don't result in an invalid sequence. Excerpt from Appendix B (in our case the "device" would be ConPTY):
The current standard should be ISO/IEC 6429:1992, but it is not available for free and I do not have a copy to check. |
Any chance to move this issue from backlog or increase priority? After further research, the current open standard seems to be ECMA-48:
Relevant quote:
Source: https://www.ecma-international.org/publications-and-standards/standards/ecma-48/ (Section 6.2 - The data stream) ConPTY does treat input buffers as messages, breaking control sequences apart. This is invalid behaviour according to the standard. |
Thanks for following up. In the year since this issue was filed, it's gained very little support. I know that it's a standards compliance issue, and I value standards compliance, but I can't justify prioritizing it over some of the other stuff we've got going on. It's a complicated problem space where everyone has a different solution (a timeout if you receive Unfortunately, while the letter of the standard specifies that the data stream is continuous it does not appear to specify the behavior of a conforming application in the face of ambiguous stream contents. |
In practice the standard can specify all it likes, there is no way to ensure escape sequences are not broken up somewhere between terminal and application. Not to mention if the user for example presses It is a little unfortunate that you do this so aggressively because it will break naive applications like showkey and any application where the user disables the timeout, but it is something that applications have to handle. |
@DHowett Thanks for your reply and clarification. It also seems to be possible to work around this with solutions approved by Microsoft engineering, since there is a working implementation for Windows Server SAC, which I tested in the same environment (see #4037 (comment)). The standard defines that action in error recovery is left to the implementation. However, if a control sequence is started in one input and not finished (but also not invalid), I think the correct course of action is to wait for the next input since otherwise ConPTY does not work on an input stream but on individual input buffers (violating the standard). The reasoning for the current implementation seems flawed to me (see code comments or #2823 (comment)), since it says e.g.
But testing with different Linux terminal emulators, it is exactly the observed behaviour:
This cannot be guaranteed for many applications and the assumption effectively violates the standard (see above). |
This comment was marked as off-topic.
This comment was marked as off-topic.
8-bit response mode solves that problem, any 8-bit control is a native continuator, so the parser could delay the parsing, 7-bit control wont go anywhere 's futile, because 1B has a meaning by itself. also the protocol, the world needs a nice key protocol |
@determin1st You're probably more interested in #4999, where we literally made a protocol for transmitting Win32 console input sequences (nearly 4 years ago now) |
@zadjii-msft oke, ill look through.. for now can say that INPUT_RECORD is probably overkill, KEY_RECORD yes, mouse tracking protocol exists already, would be more compatibility.. |
ooke, as im progressing in construction of my own parser, i see that 8-bit responses dont fully solve the partial input problem (some edge cases). there must be a partial input terminator. Putty supports ENQ request-response, but it is not wildly supported. i think another basic request - the DA1 is appropriate, because its result/answer is already known to the terminal user (any library does it upon initiation) so after the parser returns the partial input state, stores the remaining bytes for the next attempt, it also should do the DA1 request to have a clarity whether no more bytes would follow the partial. this nicely closes all edge cases. waiting for known input is better than waiting for unknown input. also i was browsing through some of docs here and found this https://github.com/microsoft/terminal/blob/main/doc/specs/%23976%20-%20VT52%20escape%20sequences.md 's like with INPUT_RECORD miswording, ANSI/VT52 is a single mode it's not leaving VT52 to ANSI or entering ANSI leaving VT52 vice versa, its the same thing. those ancient docs didnt clearly say what mode activates, "previous" kind, probably DEC thing or ASCII thing. dont understand why any modern terminal user need that. should be always off |
Environment
Steps to reproduce
Example for arrow-up key (as seen during development with serial ports):
Write 1: 0x1b
Write 2: 0x5b 0x41
Expected behavior
Example for arrow-up key:
Actual behavior
Due to
FlushAtEndOfString
being true, the escape sequence is split up and rendered wrong. E.g. arrow-up becomes:which only renders "[A" to the PTY without the desired effect.
This is 100% reproducible, i.e. the arrow keys never work because they are always received as two reads from the (virtual) serial port.
Note: A custom build of conhost.exe (OpenConsole.exe) with
FlushAtEndOfString
changed to return false does correctly process split escape sequence writes.Comments
The comment in the caller in stateMachine.cpp says the following:
I think both statements here are wrong, or at least not universally true. Typing Alt+[ and A on a VTE based Linux terminal emulator (e.g. gnome-terminal or xfce4-terminal) will result in the same reaction as pressing the arrow-up key.
Is there any standard that guarantees VT input sequences to happen in single writes? I tested Linux support of split escape sequence writes - it worked fine on Linux 5.4 framebuffer getty TTY and on all tested PTY terminal emulators (xterm, VTE, etc.).
Related: #2823
The text was updated successfully, but these errors were encountered: