Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dotnet] #10161 Allow new targets to wait until debugger is attached #10603

Closed
wants to merge 8 commits into from
Closed

Conversation

EdwinVanVliet
Copy link
Contributor

@EdwinVanVliet EdwinVanVliet commented May 3, 2022

Thanks for contributing to Selenium!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

-Added TargetAttached event to Target domain in order to allow consumers of DevTools session to be informed about new targets.
-Added SendCommands function to DevToolsSession. This function is needed when using the Runtime.runIfWaitingForDebugger command in conjuntion with other commands. Using the singular SendCommand function will result in blocking issues.
-Added waitForDebugger argument to InitializeSession function DevToolsSession to enable/disable waiting for the debugger to attach. By default it's disabled because that's current behavior.

Motivation and Context

When a new target (new tab for instance) is created DevTools continues execution based on the WaitForDebugger property set in the SetAutoAttach feature. When the WaitForDebugger property is set to false DevTools continues without waiting the consumer to properly enable all domains on the target. For instance when being interested in network events from the network domain we need to invoke Network.enable for that specific target. The current behavior causes some of the network events to be missing due to Selenium enabling the domains after the target already performed network requests.

By allowing consumers to enable the WaitForDebugger property we can enable all domains and tell DevTools whenever we are ready by using the Run.runIfWaitingForDebugger command.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@CLAassistant
Copy link

CLAassistant commented May 3, 2022

CLA assistant check
All committers have signed the CLA.

@titusfortner
Copy link
Member

titusfortner commented May 9, 2022

Thanks for the contribution.

I'm not very experienced in either C# or in DevTools, but I'm trying to help out more in this part of the codebase, so a few things:

  1. Can you add tests to demonstrate behavior, or is this all race condition avoidance?
  2. When would you not want to wait for debugger?
  3. Is there a performance impact to waiting?
  4. Is there another downside to setting it to always be true?
  5. If we are toggling it, is false the correct default, or is that just consistent with current behavior?
  6. Why wouldn't we use a property here instead of having it in the constructor?
  7. The new data structures look nice, but this PR would be easier for me to merge if I don't also have to figure out the larger changes. Otherwise it might be a longer wait for our .NET expert to chime in.

@EdwinVanVliet
Copy link
Contributor Author

Hi @titusfortner

Thanks for your response.

The changes I made are mimicking the behavior of Puppeteer (https://github.com/puppeteer/puppeteer) and (https://github.com/microsoft/playwright).
Both those tools have have are waiting on new targets by default in order to setup all the domains correctly.

  1. Can you add tests to demonstrate behavior, or is this all race condition avoidance?
    => Can you elaborate on the race condition? All changes are related to be able to wait on the new targets before continueing execution.

  2. When would you not want to wait for debugger?
    => Strictly speaking it doesn't offer any advantages when only using a single tab for instance.

However in the current implementation invoking the runtime.runIfWaitingForDebugger should be done by the one implementing Selenium into his code.
This is done by subscribing to the Target.attachedToTarget event and invoking the runtime.runIfWaitingForDebugger command.

For instance this is the code I use for that in my own codebase;

    private void TargetAttachedToTarget(object sender, AttachedToTargetEventArgs e)
    {
        if (e != null && e.TargetInfo != null)
        {
            var commands = new List<DevToolsCommandSettings>();

            //enable domains for page targets to enable generating CDP events.
            if (string.Equals("page", e.TargetInfo.Type, StringComparison.InvariantCultureIgnoreCase))
            {
                commands.Add(new DevToolsCommandSettings("Page.enable") { SessionId = e.SessionId, CommandParameters = CreateCommand("Page.enable") });
                commands.Add(new DevToolsCommandSettings("Network.enable") { SessionId = e.SessionId, CommandParameters = CreateCommand("Network.enable") });
                commands.Add(new DevToolsCommandSettings("Runtime.enable") { SessionId = e.SessionId, CommandParameters = CreateCommand("Runtime.enable") });
            }

            if (e.WaitingForDebugger)
            {
                commands.Add(new DevToolsCommandSettings("Runtime.runIfWaitingForDebugger") { SessionId = e.SessionId, CommandParameters = CreateCommand("Runtime.runIfWaitingForDebugger") });
            }

            if (commands.Any())
            {
                //we use Task.Run to prevent blocking other events from Devtools
                Task.Run(() => _session.SendCommands(commands, millisecondsTimeout: (int)TimeSpan.FromSeconds(5).TotalMilliseconds, throwExceptionIfResponseNotReceived: false)));
            }
        }
    }
  1. Is there a performance impact to waiting?
    => Handling the target.AttachedToTarget event and invoking the runtime.runIfWaitingForDebugger usually takes around 25 ms.

  2. Is there another downside to setting it to always be true?
    => By doing this we force everyone to implements Selemium into handling the target.AttachedToTarget event himself.
    Because ServiceWorkers or PDF files are also targets, those will be halted when setting it to true.

This might not be ideal for everyone just wanting an out of the box solution when consuming the nuget package.

We could also look towards having a default implementation for this, but allowing an override to support for advanced scenarios.

  1. If we are toggling it, is false the correct default, or is that just consistent with current behavior?
    => I made this default behavior because it is current behavior.

  2. Why wouldn't we use a property here instead of having it in the constructor?
    => I couldn't think of any usecase in which you want to switch this behavior after initialization.
    However I'm also fine with if we extend the DevtoolsSession class with a new function in order to set this behavior

  3. The new data structures look nice, but this PR would be easier for me to merge if I don't also have to figure out the larger changes. Otherwise it might be a longer wait for our .NET expert to chime in.
    => Can you elaborate what changes you mean exactly?
    The SendCommands function is a big chunk and is primarily needed for enabling domains on the new target as shown in the above code.

@EdwinVanVliet
Copy link
Contributor Author

@titusfortner

Hi,

Did you have any change to look at my reply?
#10603 (comment)

With kind regards

Copy link
Member

@titusfortner titusfortner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for answering my questions, I understand this (a bit) better now.

Ideally there would be exhaustive tests showing how existing code works, and then you could add some tests showing how new code works. As it is, the tests we have that I can get to run aren't exercising the code you're changing, and I can't tell if the public tasks changing parameters is breaking existing functionality that users might currently use be relying on. Do we need to add a task instead of changing it for backwards compatibility?

Also, this needs to be rebased on the latest trunk for the versioned targets. Thanks.

/// <summary>
/// Initializes a new instance of the DevToolsSession class, using the specified WebSocket endpoint.
/// </summary>
/// <param name="endpointAddress"></param>
public DevToolsSession(string endpointAddress)
/// <param name="waitForDebuggerOnStart">When enabled new targets will be waiting until runtime.runIfWaitingForDebugger is invoked.</param>
public DevToolsSession(string endpointAddress, bool waitForDebuggerOnStart)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should waitForDebuggerOnStart remain an optional parameter for backwards compatibility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right on this one, I will make a commit for this.

@@ -194,7 +201,7 @@ public async Task<ICommandResponse<TCommand>> SendCommand<TCommand>(TCommand com
/// <param name="throwExceptionIfResponseNotReceived"><see langword="true"/> to throw an exception if a response is not received; otherwise, <see langword="false"/>.</param>
/// <returns>The command response object implementing the <see cref="ICommandResponse{T}"/> interface.</returns>
//[DebuggerStepThrough]
public async Task<JToken> SendCommand(string commandName, JToken commandParameters, CancellationToken cancellationToken = default(CancellationToken), int? millisecondsTimeout = null, bool throwExceptionIfResponseNotReceived = true)
public async Task<DevToolsCommandResponse> SendCommand(DevToolsCommandSettings devToolsCommandSettings, CancellationToken cancellationToken = default(CancellationToken), int? millisecondsTimeout = null, bool throwExceptionIfResponseNotReceived = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change backwards compatible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've re-introduced this public function to be backwards compatible.

/// <param name="cancellationToken">A CancellationToken object to allow for cancellation of the command.</param>
/// <param name="millisecondsTimeout">The execution timeout of the command in milliseconds.</param>
/// <param name="throwExceptionIfResponseNotReceived"><see langword="true"/> to throw an exception if a response is not received; otherwise, <see langword="false"/>.</param>
/// <returns>The command response object implementing the <see cref="ICommandResponse{T}"/> interface.</returns>
Task<JToken> SendCommand(string commandName, JToken @params, CancellationToken cancellationToken, int? millisecondsTimeout, bool throwExceptionIfResponseNotReceived);
Task<DevToolsCommandResponse> SendCommand(DevToolsCommandSettings devToolsCommandSettings, CancellationToken cancellationToken, int? millisecondsTimeout, bool throwExceptionIfResponseNotReceived);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backwards compatible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've re-introduced this public function to be backwards compatible.

@diemol
Copy link
Member

diemol commented Oct 24, 2023

I think this is not needed anymore given the DevTools revamp in .NET, right @nvborisenko?

@nvborisenko
Copy link
Member

@diemol, I quickly walked through this and, seems, this is still actual, and the issue is more global in Selenium across all bindings. I am not expert in CDP, but it looks like Puppeteer and Playwright really use Runtime.runIfWaitingForDebugger command at some point in communication with CDP, while we don't use it at all.

As I understood we can reproduce it via:

- Start devtools session with network monitoring
- Click link which opens new tab in browser

Result: selenium is not able to monitor network in new tab.

@diemol
Copy link
Member

diemol commented Oct 25, 2023

This is CDP only, I guess we need to decide if we want to work on this or focus on BiDi.

@EdwinVanVliet
Copy link
Contributor Author

@nvborisenko

You are right about this issue still being actual. Like you said, Selenium is not able to monitor the network in time when opening a new tab in it's current setup. Usually the first navigate after opening a tab will be missing because the network domain is not enabled yet.

To accomplish this a couple of things needs to happen;

  • We need to enable Target.setAutoAttach and Target.setDiscoverTargets at the end of the InitializeSession function of DevToolsSession.cs.
    Note that we do this with an empty session id.
            // The Target domain needs to send Sessionless commands! Else the waitFordebugger setting in setAutoAttach wont work!
            await SendCommand(new DevToolsCommandSettings("Target.setAutoAttach") { SessionId = string.Empty, CommandParameters = CreateSetAutoAttachCommand() }).ConfigureAwait(false);
            await SendCommand(new DevToolsCommandSettings("Target.setDiscoverTargets") { SessionId = string.Empty, CommandParameters = CreateSetDiscoverTargets() }).ConfigureAwait(false);
  • On the Target.setAutoAttach command we need to enable waitForDebuggerOnStart. This causes Chromium to halt new targets until the Runtime.runIfWaitingForDebugger is invoked.
    In my PR I injected the waitForDebuggerOnStart property using the constructor of DevToolsSession.cs.
        private JObject CreateSetAutoAttachCommand()
        {
            var jobject = new JObject();
            jobject["CommandName"] = "Target.setAutoAttach";
            jobject["autoAttach"] = true;
            jobject["waitForDebuggerOnStart"] = waitForDebuggerOnStart;
            jobject["flatten"] = true;
            return jobject;
        } 

One tricky thing is that Chromium also halts all commands for a specific target until Runtime.runIfWaitingForDebugger is invoked. Because of this I introduced the SendCommands function which is able to send multiple commands without them blocking eachother. We might be able to get rid of the SendCommands function because of your changes, but I haven't tested this yet.

  • Note that I'm struggling with getting the testsuite to build/work. Could you help out with this?

What do you think about this? I can rebase the PR and try to get it to work without the SendCommands function.
That would greatly reduce the changeset of this PR.

@titusfortner
Copy link
Member

Note that I'm struggling with getting the testsuite to build/work. Could you help out with this?

We have an in-person working session in 2 weeks and an important piece of that will be to improve the .NET/Windows build experience.

@nvborisenko
Copy link
Member

@EdwinVanVliet if you are interested in the implementation, please go ahead. I see you do :)

From my perspective we can split this work to smaller slices:

  • Low level api
    It is strange to see code here like jobject["CommandName"] = "Target.setAutoAttach"; I believe there is strongly typed interface how to communicate with browser.
    As I understand you resolved somehow that command might sessionless. It is another problem (separate PR?).

  • Middle level (DevToolsSession.cs)
    Here we should think about what is API should be to support your case (handling something in different target). How dev tools session should work with targets, how it is should be initiated. Because, I guess, we cannot just add new parameter in constructor. It is risky and it opens a door to the hell with big number of parameters.

  • High level
    For end user how it will work.. In case of network monitoring case, should selenium "expose" network events from all tabs? Or user should explicitly switch to a tab which is under interest.

It would be easier for all us to land the fix step by step (separate PRs?).

Note: current .net setup is not "friendly" for me, and I also (hopefully temporary) cannot execute tests in a way how I do it for years. Locally I created new test project outside this git repo, and added reference to WebDriver project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants