Fixes #3767 - Adds Ansi parser and scheduler. #3791

Merged
merged 116 commits into from
Dec 7, 2024

tznind
Collaborator

@tznind tznind commented Oct 11, 2024

Adds ability to send and respond to ANSI escape sequences as they come in via event callbacks.

Also adds detection of Sixel support, see SixelSupportDetector.cs

The purpose is to stream text from console driver as read and 'transparently' pluck out the expected responses live.

Explanation of Ansi Parser

AnsiParser processes events from an input stream. Each time it will take 1 char and either add it to Held or Release it. Critically it sometimes releases more than 1 event.

Each event is a char which can have optional metadata attached (See T header below).

Consider the following use case:

The user presses the Esc key, types "Hi", then presses Esc again. While the user is typing, a DAR (Device Attributes Request) response arrives: <esc>[0c

image

Here is a breakdown of how AnsiParser would deal with this:

Stage 1 - Consume first Esc

The first call to ProcessInput is with the Esc char. This causes the parser to shift into expecting an escape code (i.e. a bracket). Because we are assuming the Esc is a response we hold it and return empty.

image

Stage 2 - Consume H

The next call to ProcessInput is with H. We are expecting a bracket which would indicate that we are indeed entering an escape sequence. Since we did not find one ([), we instead release both the Esc and the H for routine processing.

image

Stage 3 - Consume next escape sequence

This process repeats, this time for the actual response we are expecting (i.e. ending with c)

image

image

image

When we reach the first character that is one of the _knownTerminators (e.g. c) or that matches the terminator of one of our expectedResponses, we leave the InResponse state.

image

If the response built up in Held matches one of the expectedResponses, we swallow it (Release None) and raise an event.
If the response built up does not match (it only matches _knownTerminators), we release the whole thing back to be processed downstream.

Stage 4 - Consume last Esc

Finally we consume the last Esc which leaves us in state Expecting Bracket. Now this may be the start of a new escape sequence or it could be that the user has just pressed Esc - we will not know which till later.

BUT we don't want to sit around waiting for the rest of the escape sequence forever. Pressing Esc could be the last thing the user does, and there could be no further events while the user sits waiting for the app to respond.

For this reason we put a timestamp on state changes (StateChangedAt). This lets the caller (e.g. the driver main loop) force the parser to release the Esc after a short period of time if no new input is coming:

image

if (Parser.State == ParserState.ExpectingBracket &&
    DateTime.Now - Parser.StateChangedAt > _escTimeout)
{
    return Parser.Release ().Select (o => o.Item2);
}
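
The four stages above can be sketched as a small state machine. This is only an illustration under stated assumptions, not the parser from this PR: the names SketchParser, SketchState and LastResponse are hypothetical, a single expected terminator stands in for expectedResponses, any letter stands in for _knownTerminators, and Esc followed by Esc is simply released as two ordinary keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Feed the example from the description: Esc, "Hi", the response Esc [ 0 c, then a lone Esc.
var parser = new SketchParser<int> (expectedTerminator: 'c');
var released = new List<char> ();
var index = 0;

foreach (char ch in "\u001bHi\u001b[0c\u001b")
{
    // The int metadata here is just the position in the stream.
    released.AddRange (parser.ProcessInput (ch, index++).Select (e => e.Ch));
}

// The trailing Esc is still held; a driver would force this release after a timeout.
released.AddRange (parser.Release ().Select (e => e.Ch));

Console.WriteLine (string.Join (",", released.Select (c => (int)c))); // prints 27,72,105,27
Console.WriteLine (parser.LastResponse.Replace ("\u001b", "<esc>"));  // prints <esc>[0c

// Hypothetical names, for illustration only.
public enum SketchState { Normal, ExpectingBracket, InResponse }

public sealed class SketchParser<T>
{
    private readonly List<(char Ch, T Meta)> _held = new ();
    private readonly char _expectedTerminator;

    public SketchParser (char expectedTerminator) { _expectedTerminator = expectedTerminator; }

    public SketchState State { get; private set; } = SketchState.Normal;

    // Last swallowed response (assumption: the real parser raises an event instead).
    public string LastResponse { get; private set; } = "";

    public IEnumerable<(char Ch, T Meta)> ProcessInput (char ch, T meta)
    {
        if (State == SketchState.Normal && ch != '\u001b')
        {
            return new [] { (Ch: ch, Meta: meta) }; // ordinary input passes straight through
        }

        _held.Add ((ch, meta));

        switch (State)
        {
            case SketchState.Normal: // saw Esc: hold it and wait for '['
                State = SketchState.ExpectingBracket;
                return Enumerable.Empty<(char Ch, T Meta)> ();

            case SketchState.ExpectingBracket:
                if (ch == '[')
                {
                    State = SketchState.InResponse;
                    return Enumerable.Empty<(char Ch, T Meta)> ();
                }
                return Release (); // not an escape sequence: hand everything back

            default: // InResponse
                if (!char.IsLetter (ch))
                {
                    return Enumerable.Empty<(char Ch, T Meta)> (); // keep accumulating
                }
                if (ch != _expectedTerminator)
                {
                    return Release (); // a sequence nobody asked for: release it whole
                }
                LastResponse = new string (_held.Select (h => h.Ch).ToArray ());
                _held.Clear ();
                State = SketchState.Normal;
                return Enumerable.Empty<(char Ch, T Meta)> (); // expected response: swallow it
        }
    }

    // Hands back everything held and returns to Normal.
    public List<(char Ch, T Meta)> Release ()
    {
        List<(char Ch, T Meta)> result = _held.ToList ();
        _held.Clear ();
        State = SketchState.Normal;
        return result;
    }
}
```

Because each held item keeps its metadata alongside the char, releasing returns the original events unchanged, which is the property the generic T is there to preserve.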

Why <T>?

I realized working exclusively in char and string made it difficult for WindowsDriver to integrate - so I have changed to AnsiParser<T> . The class now deals with sequences of char each of which has any metadata you want (type T). For WindowsDriver this means AnsiResponseParser<WindowsConsole.InputRecord>.

The parser can pull things out of the stream and later return them, and we don't lose the metadata.

I will look at the other drivers. If they deal only in char with no other metadata, I can see about adding a non-generic one too that just wraps the generic, like TreeView vs TreeView<T>.

Outstanding Issues

  • Key down and key up means double key presses
  • Esc on its own as an isolated keypress needs a release timeout (i.e. if we don't see '[' within 50 ms then we need to release it back to the input stream, as nothing else is coming).
  • deal/prevent parallel outstanding requests executing at once
  • Hitting Esc key twice in quick succession or users Esc+start escape sequence

Fixes

Pull Request checklist:

  • I've named my PR in the form of "Fixes #issue. Terse description."
  • My code follows the style guidelines of Terminal.Gui - if you use Visual Studio, hit CTRL-K-D to automatically reformat your files before committing.
  • My code follows the Terminal.Gui library design guidelines
  • I ran dotnet test before commit
  • I have made corresponding changes to the API documentation (using /// style comments)
  • My changes generate no new warnings
  • I have checked my code and corrected any poor grammar or misspellings
  • I conducted basic QA to assure all features are working

@tznind
Collaborator Author

tznind commented Oct 12, 2024

Key down and Key up events are a pain. Here is what response looks like when sending a Device Attributes Request on startup in WindowsDriver.

We get keydown and then keyup for each letter.
esc[?1;0c

image

So we would need another 'pre step' to the parser that paired up down/up events and only passed them around in pairs (i.e. swallow both or release both). Currently I am using T as InputRecord but now maybe it needs to be a Pair (down and up).

Any idea if this is universal behaviour for WindowsDriver? or are there cases where it only sends key downs for example?

@BDisp
Collaborator

BDisp commented Oct 12, 2024

Any idea if this is universal behaviour for WindowsDriver? or are there cases where it only sends key downs for example?

I think that is the behavior of the Win32 API, but it would be interesting to know how that works with the other drivers, because you are sending any keystroke with the prefix 0x1b, right? If the other drivers respond with the same output, then there is a possibility that all drivers can deal with key-down and key-up.

@tznind
Collaborator Author

tznind commented Oct 12, 2024

Ok I have added WindowsDriverKeyPairer to match up input key down/up pairs - good news is no changes to the parser we just make the 'metadata' for each char the pair and return that to the main loop.
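
As a rough illustration of the pairing idea (not the WindowsDriverKeyPairer from the commit below), a pairer can remember each key-down until the matching key-up arrives and only then hand the two on together. KeyEventSketch and KeyPairerSketch are hypothetical stand-ins for InputRecord and the real class, and matching on the character alone is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Interleaved input: 'a' down, 'b' down, 'a' up, 'b' up.
var pairer = new KeyPairerSketch ();
KeyEventSketch [] input = { new ('a', true), new ('b', true), new ('a', false), new ('b', false) };

foreach (KeyEventSketch e in input)
{
    if (pairer.TryPair (e, out (KeyEventSketch Down, KeyEventSketch Up) pair))
    {
        Console.WriteLine ($"paired '{pair.Down.Ch}'"); // prints paired 'a' then paired 'b'
    }
}

// Hypothetical stand-in for a console input record.
public readonly record struct KeyEventSketch (char Ch, bool IsDown);

public sealed class KeyPairerSketch
{
    // Key-downs still waiting for their key-up, keyed by character.
    private readonly Dictionary<char, KeyEventSketch> _pendingDowns = new ();

    public bool TryPair (KeyEventSketch e, out (KeyEventSketch Down, KeyEventSketch Up) pair)
    {
        pair = default;

        if (e.IsDown)
        {
            _pendingDowns [e.Ch] = e; // remember it until the matching key-up arrives
            return false;
        }

        if (_pendingDowns.Remove (e.Ch, out KeyEventSketch down))
        {
            pair = (down, e);
            return true;
        }

        return false; // key-up with no matching key-down: this sketch simply drops it
    }
}
```

Keying the pending set by character tolerates interleaved downs and ups, but a key-up that arrives before its key-down is still lost in this sketch.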

d642fb6

@tznind
Collaborator Author

tznind commented Oct 12, 2024

Key down/up is very unreliable. Just by typing fast you get random order of down/ups (this is on v2_develop):

image

@tznind
Collaborator Author

tznind commented Oct 12, 2024

Maybe we can ditch the key up event for v2? It seems almost everything goes off key down anyway. And 2 of the 3 drivers are just making up down/up as pairs anyway - for example NetDriver:

                KeyCode map = MapKey (consoleKeyInfo);

                if (map == KeyCode.Null)
                {
                    break;
                }

                OnKeyDown (new Key (map));
                OnKeyUp (new Key (map));

                break;

In other news though, here is the current WIP, we can send DAR request on demand. It has issues though - crashes a lot. Issues are:

  • key down/up pair processing
  • multi threading/instance (enumeration modified...)
  • not implemented for curses/netdriver/fakedriver

send-dar

@tznind
Collaborator Author

tznind commented Nov 27, 2024

Seems like this test failure situation is non deterministic, it failed in b0e921a but passed in next commit (1480f13) which was just a rename.

@tig
Collaborator

tig commented Nov 27, 2024

image

Yeah, non-deterministic. Drives me nuts.

@BDisp
Collaborator

BDisp commented Nov 27, 2024

Seems like this test failure situation is non deterministic, it failed in b0e921a but passed in next commit (1480f13) which was just a rename.

The failing method is ClearScreenNextIteration_Resets_To_False_After_LayoutAndDraw. Probably a race condition where some values aren't yet dispatched when another unit test is run. It is difficult to find what is causing these failures.

@BDisp
Collaborator

BDisp commented Nov 27, 2024

I have also faced these failures when calling Application.Init and Application.Shutdown in a unit test without using an attribute that ensures Application.Init is not called without a previous call to Application.Shutdown, and that Application.Shutdown is not called without a previous call to Application.Init.

Collaborator

@tig tig left a comment

Submitted a PR instead of doing a full review inline...

@tig
Collaborator

tig commented Nov 27, 2024

tznind#173

Code review comments and cleanup
@tznind
Collaborator Author

tznind commented Nov 27, 2024

Great thanks, I will look through the BUGBUG when I get a chance and fix

@tznind
Collaborator Author

tznind commented Nov 27, 2024

Ok I think I have addressed most of the TODO / BUGBUG

The only ones I have left were around EscSeqUtils which I have not modified in this PR and which are quite intimidating - code wise.

I looked at using Pos align in the scenario, but I think the reason for the gap is the combo box's tendency not to pop over properly. I couldn't see how pos align would help, especially since there are multiple label/text fields per row - we don't want to just stack the views on top of each other vertically.

@BDisp
Collaborator

BDisp commented Nov 28, 2024

Ok I think I have addressed most of the TODO / BUGBUG

The only ones I have left were around EscSeqUtils which I have not modified in this PR and which are quite intimidating - code wise.

The changes I've made to EscSeqUtils allow a driver to handle all keys, mouse events and ANSI request responses. When I finished the changes to the CursesDriver in the closed PR, it used only those methods, mainly DecodeEscSeq. But of course it can be improved.

I looked at using Pos align in the scenario, but I think the reason for the gap is the combo box's tendency not to pop over properly. I couldn't see how pos align would help, especially since there are multiple label/text fields per row - we don't want to just stack the views on top of each other vertically.

I made some changes related to the scenario in my closed PR. I don't know if they will help you.

private View BuildSingleTab ()
{
    var w = new View
    {
        Width = Dim.Fill (),
        Height = Dim.Fill (),
        CanFocus = true
    };

    w.Padding.Thickness = new (1);

    // TODO: This hackery is why I think the EscSeqUtils class should be refactored and the CSI's made type safe.
    List<string> scrRequests = new ()
    {
        "CSI_SendDeviceAttributes",
        "CSI_ReportTerminalSizeInChars",
        "CSI_RequestCursorPositionReport",
        "CSI_SendDeviceAttributes2"
    };

    var cbRequests = new ComboBox { Width = 40, Height = 5, ReadOnly = true, Source = new ListWrapper<string> (new (scrRequests)) };
    w.Add (cbRequests);

    // TODO: Use Pos.Align and Dim.Func so these hardcoded widths aren't needed.
    var label = new Label { Y = Pos.Bottom (cbRequests) + 1, Text = "_Request:" };
    var tfRequest = new TextField { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 20 };
    w.Add (label, tfRequest);

    label = new () { X = Pos.Right (tfRequest) + 1, Y = Pos.Top (tfRequest) - 1, Text = "E_xpectedResponseValue:" };
    var tfValue = new TextField { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 6 };
    w.Add (label, tfValue);

    label = new () { X = Pos.Left (tfValue) + label.Text.Length, Y = Pos.Top (tfValue) - 1, Text = "_Terminator:" };
    var tfTerminator = new TextField { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 4 };
    w.Add (label, tfTerminator);

    cbRequests.SelectedItemChanged += (s, e) =>
                                      {
                                          if (cbRequests.SelectedItem == -1)
                                          {
                                              return;
                                          }

                                          string selAnsiEscapeSequenceRequestName = scrRequests [cbRequests.SelectedItem];
                                          AnsiEscapeSequenceRequest selAnsiEscapeSequenceRequest = null;

                                          switch (selAnsiEscapeSequenceRequestName)
                                          {
                                              case "CSI_SendDeviceAttributes":
                                                  selAnsiEscapeSequenceRequest = AnsiEscapeSequenceRequestUtils.CSI_SendDeviceAttributes;

                                                  break;
                                              case "CSI_ReportTerminalSizeInChars":
                                                  selAnsiEscapeSequenceRequest = AnsiEscapeSequenceRequestUtils.CSI_ReportTerminalSizeInChars;

                                                  break;
                                              case "CSI_RequestCursorPositionReport":
                                                  selAnsiEscapeSequenceRequest = AnsiEscapeSequenceRequestUtils.CSI_RequestCursorPositionReport;

                                                  break;
                                              case "CSI_SendDeviceAttributes2":
                                                  selAnsiEscapeSequenceRequest = AnsiEscapeSequenceRequestUtils.CSI_SendDeviceAttributes2;

                                                  break;
                                          }

                                          tfRequest.Text = selAnsiEscapeSequenceRequest is { } ? selAnsiEscapeSequenceRequest.Request : "";

                                          tfValue.Text = selAnsiEscapeSequenceRequest is { }
                                                             ? selAnsiEscapeSequenceRequest.ExpectedResponseValue ?? ""
                                                             : "";
                                          tfTerminator.Text = selAnsiEscapeSequenceRequest is { } ? selAnsiEscapeSequenceRequest.Terminator : "";
                                      };

    // Forces raise cbRequests.SelectedItemChanged to update TextFields
    cbRequests.SelectedItem = 0;

    label = new () { Y = Pos.Bottom (tfRequest) + 2, Text = "_Response:" };
    var tvResponse = new TextView { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 40, Height = 4, ReadOnly = true };
    w.Add (label, tvResponse);

    label = new () { X = Pos.Right (tvResponse) + 1, Y = Pos.Top (tvResponse) - 1, Text = "_Error:" };
    var tvError = new TextView { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 40, Height = 4, ReadOnly = true };
    w.Add (label, tvError);

    label = new () { X = Pos.Right (tvError) + 1, Y = Pos.Top (tvError) - 1, Text = "_Value:" };
    var tvValue = new TextView { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 6, Height = 4, ReadOnly = true };
    w.Add (label, tvValue);

    label = new () { X = Pos.Right (tvValue) + 1, Y = Pos.Top (tvValue) - 1, Text = "_Terminator:" };
    var tvTerminator = new TextView { X = Pos.Left (label), Y = Pos.Bottom (label), Width = 4, Height = 4, ReadOnly = true };
    w.Add (label, tvTerminator);

    var btnResponse = new Button { X = Pos.Center (), Y = Pos.Bottom (tvResponse) + 2, Text = "_Send Request", IsDefault = true };

    var lblSuccess = new Label { X = Pos.Center (), Y = Pos.Bottom (btnResponse) + 1 };
    w.Add (lblSuccess);

    btnResponse.Accepting += (s, e) =>
                             {
                                 var ansiEscapeSequenceRequest = new AnsiEscapeSequenceRequest
                                 {
                                     Request = tfRequest.Text,
                                     Terminator = tfTerminator.Text,
                                     ExpectedResponseValue = string.IsNullOrEmpty (tfValue.Text) ? null : tfValue.Text
                                 };

                                 bool success = Application.Driver!.TryWriteAnsiRequest (
                                                                                         Application.MainLoop!.MainLoopDriver,
                                                                                         ref ansiEscapeSequenceRequest
                                                                                        );

                                 tvResponse.Text = ansiEscapeSequenceRequest.AnsiEscapeSequenceResponse?.Response ?? "";
                                 tvError.Text = ansiEscapeSequenceRequest.AnsiEscapeSequenceResponse?.Error ?? "";
                                 tvValue.Text = ansiEscapeSequenceRequest.AnsiEscapeSequenceResponse?.ExpectedResponseValue ?? "";
                                 tvTerminator.Text = ansiEscapeSequenceRequest.AnsiEscapeSequenceResponse?.Terminator ?? "";

                                 if (success)
                                 {
                                     lblSuccess.ColorScheme = Colors.ColorSchemes ["Base"];
                                     lblSuccess.Text = "Success";
                                 }
                                 else
                                 {
                                     lblSuccess.ColorScheme = Colors.ColorSchemes ["Error"];
                                     lblSuccess.Text = "Error";
                                 }
                             };
    w.Add (btnResponse);

    w.Add (new Label { Y = Pos.Bottom (lblSuccess) + 2, Text = "Send other requests by editing the TextFields." });

    return w;
}

@tznind
Collaborator Author

tznind commented Nov 28, 2024

If EscSeqUtils.DecodeEscSeq cannot yet recognize the ANSI escape sequence as valid, then the ConsoleKeyInfo []? IncompleteCkInfos will store it and insert it at the beginning of the next key read from the console

This is also how AnsiParser works. It uses a state machine such that no matter how split the input order is, the resolution remains the same.

I feel like we are duplicating work here, after this is merged we should work to empower the ansi parser and retire DecodeEscSeq and some of the other stuff in EscSeqUtils.

This diagram may help explain how it works (see below). The diagram is for v2 but the parser is basically implemented in the same way for the normal ConsoleDriver classes

  • Parser filters input stream - everything read from console goes straight to parser
    • Sometimes parser holds onto keys (see IHeld)
    • Sometimes parser releases multiple keys at once
    • Always parser has 'state' i.e. in response etc
  • Parser operates in the native type that each input stream is provided e.g. InputRecord for win and ConsoleKeyInfo for net
  • Parser can be managed in main loop to force release at any time
    • for example, if we see Esc then [ but it is now 50 ms since we saw anything, it is probably just the user typing this sequence.
  • When parser releases keys they get processed downstream as normal e.g. Esc, [ would be released and processed exactly as if parser were not in the pipeline
    • This 'stream filtering' approach means the parser doesn't have to raise key events etc.; it just realizes its mistake and drops the events back into the input stream at the next chance

image

This is the class I wrote that replaces all the complex mouse handling cki stuff with a single tiny class. Because AnsiParser has already isolated the response we can just operate on the complete string

public class AnsiMouseParser
{
    // Regex patterns for button press/release, wheel scroll, and mouse position reporting
    private readonly Regex _mouseEventPattern = new (@"\u001b\[<(\d+);(\d+);(\d+)(M|m)", RegexOptions.Compiled);

    public MouseEventArgs? ProcessMouseInput (string input)
    {
        // Match any SGR mouse report (button press/release, wheel, position)
        Match match = _mouseEventPattern.Match (input);

        if (match.Success)
        {
            int buttonCode = int.Parse (match.Groups [1].Value);

            // The top-left corner of the terminal corresponds to (1, 1) for both X (column) and Y (row) coordinates.
            // ANSI standards and terminal conventions historically treat screen positions as 1 - based.

            int x = int.Parse (match.Groups [2].Value) - 1;
            int y = int.Parse (match.Groups [3].Value) - 1;
            char terminator = match.Groups [4].Value.Single ();

            return new()
            {
                Position = new (x, y),
                Flags = GetFlags (buttonCode, terminator)
            };
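        }

        // Not an SGR mouse report.
        return null;
    }

    // GetFlags (buttonCode, terminator) is not shown in this excerpt.
}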
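
To make the regex above concrete, here is a small standalone sketch that decodes two sample SGR mouse reports. It reuses the same pattern, but SgrMouseSketch and TryDecode are hypothetical names and it returns plain values rather than MouseEventArgs. In SGR mouse reporting the final character is M for a press and m for a release, and the coordinates are 1-based column then row.

```csharp
using System;
using System.Text.RegularExpressions;

// Left button (code 0) pressed and then released at column 10, row 5.
foreach (string report in new [] { "\u001b[<0;10;5M", "\u001b[<0;10;5m" })
{
    if (SgrMouseSketch.TryDecode (report, out int button, out int x, out int y, out bool pressed))
    {
        // prints: button=0 x=9 y=4 pressed=True
        // then:   button=0 x=9 y=4 pressed=False
        Console.WriteLine ($"button={button} x={x} y={y} pressed={pressed}");
    }
}

// Hypothetical helper, for illustration only.
public static class SgrMouseSketch
{
    private static readonly Regex Pattern = new (@"\u001b\[<(\d+);(\d+);(\d+)(M|m)", RegexOptions.Compiled);

    public static bool TryDecode (string input, out int button, out int x, out int y, out bool pressed)
    {
        button = x = y = 0;
        pressed = false;

        Match match = Pattern.Match (input);

        if (!match.Success)
        {
            return false;
        }

        button = int.Parse (match.Groups [1].Value);

        // Terminal coordinates are 1-based; convert to 0-based.
        x = int.Parse (match.Groups [2].Value) - 1;
        y = int.Parse (match.Groups [3].Value) - 1;

        // 'M' terminates a press report, 'm' a release report.
        pressed = match.Groups [4].Value == "M";

        return true;
    }
}
```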

@BDisp
Collaborator

BDisp commented Nov 28, 2024

I feel like we are duplicating work here, after this is merged we should work to empower the ansi parser and retire DecodeEscSeq and some of the other stuff in EscSeqUtils.

I agree, but make sure that the EscSeqUtils unit tests won't fail with the AnsiParser.

  • for example, if we see Esc then [ but it is now 50 ms since we saw anything, it is probably just the user typing this sequence.

  • When parser releases keys they get processed downstream as normal e.g. Esc, [ would be released and processed exactly as if parser were not in the pipeline

    • This 'stream filtering' approach means the parser doesn't have to raise key events etc.; it just realizes its mistake and drops the events back into the input stream at the next chance

Normally I check whether the console has a key available: if it has none, then it is certainly the user typing the sequence; otherwise it really is an escape sequence.

This is the class I wrote that replaces all the complex mouse handling cki stuff with a single tiny class. Because AnsiParser has already isolated the response we can just operate on the complete string

Great.

@tznind
Collaborator Author

tznind commented Nov 28, 2024

Normally check if the console has no key available [...].

We are broadly agreed :) , however this is actually one of the things I did away with.

What matters is the time the key was read at, not the available/not available state of the input pipe. Because a pipe could get momentarily blocked or anything at any time.

So this is why we have public DateTime StateChangedAt { get; private set; } so we can use realtime as a measure for bailing out of a response not the input pipeline state.

This setting is also controlled outside of the parser, i.e. it is up to the user of the class (the driver) to decide when to order a release. For example:

    public IEnumerable<WindowsConsole.InputRecord> ShouldReleaseParserHeldKeys ()
    {
        if (_parser.State == AnsiResponseParserState.ExpectingBracket &&
            DateTime.Now - _parser.StateChangedAt > EscTimeout)
        {
            return _parser.Release ().Select (o => o.Item2);
        }

        return [];
    }

@BDisp
Collaborator

BDisp commented Nov 28, 2024

We are broadly agreed :) , however this is actually one of the things I did away with.

What matters is the time the key was read at, not the available/not available state of the input pipe. Because a pipe could get momentarily blocked or anything at any time.

But the user typing would not be lost, right?

So this is why we have public DateTime StateChangedAt { get; private set; } so we can use realtime as a measure for bailing out of a response not the input pipeline state.

But what gets released when bailing out of a response will be what the user was really typing, right? Sorry for these silly questions, but I'm still learning about this new processing.

@tznind
Collaborator Author

tznind commented Nov 28, 2024

You are exactly right, because parser holds the full input event object (T), when asked to release then those events can be fed into the downstream process as normal. So they surface as normal.

For example the returned T objects from the above method are passed to downstream processing here:

foreach (var k in ShouldReleaseParserHeldKeys ())
{
    ProcessMapConsoleKeyInfo (k);
}

@tznind tznind requested a review from tig November 30, 2024 11:19
@tznind tznind changed the title Fixes #2610 - Adds Ansi parser and scheduler. Fixes #3767 - Adds Ansi parser and scheduler. Dec 7, 2024
@tig tig merged commit 62641c8 into gui-cs:v2_develop Dec 7, 2024
4 checks passed
Successfully merging this pull request may close these issues.

Allowing any driver to request ANSI escape sequence with immediate response.
4 participants