Using Interlocked in InputFlowControl and Http2Stream #57968

Open: wants to merge 2 commits into main

Contributor

@ladeak ladeak commented Sep 19, 2024

Using Interlocked in InputFlowControl and Http2Stream

Based on the comments on #57236, this PR implements Interlocked-based synchronization in InputFlowControl.

Description

In the InputFlowControl class, the state is encoded by FlowControlState, which packs it into a single long value that can be updated with Interlocked.CompareExchange.
The operations of the previous FlowControl type are now implemented by the InputFlowControl type itself.
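
To illustrate the idea of keeping the whole state in one long, here is a minimal sketch. The bit layout (available window in the low 32 bits, an "aborted" flag in bit 32) and the names PackedFlowState, TryConsume and Abort are assumptions made up for this example; they are not necessarily what FlowControlState does in this PR.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: a 32-bit window and an "aborted" flag share one long,
// so both can be observed and changed by a single Interlocked operation.
public sealed class PackedFlowState
{
    private const long AbortedBit = 1L << 32;      // assumed layout, illustration only
    private const long WindowMask = 0xFFFF_FFFFL;  // low 32 bits hold the window
    private long _state;

    public PackedFlowState(uint initialWindow) => _state = initialWindow;

    public uint Available => (uint)(Volatile.Read(ref _state) & WindowMask);
    public bool IsAborted => (Volatile.Read(ref _state) & AbortedBit) != 0;

    // Consumes `bytes` from the window. Returns false if the state is aborted
    // or the window is too small (a real implementation might throw instead).
    public bool TryConsume(uint bytes)
    {
        long current = Volatile.Read(ref _state);
        while (true)
        {
            bool aborted = (current & AbortedBit) != 0;
            uint available = (uint)(current & WindowMask);
            if (aborted || bytes > available)
            {
                return false;
            }

            // bytes <= available, so the subtraction cannot borrow from the flag bit.
            long desired = current - bytes;
            long observed = Interlocked.CompareExchange(ref _state, desired, current);
            if (observed == current)
            {
                return true;
            }

            current = observed; // another thread changed the state; retry with its value
        }
    }

    // Sets the aborted flag. Returns true only for the caller that actually set it.
    public bool Abort()
    {
        long previous = Interlocked.Or(ref _state, AbortedBit);
        return (previous & AbortedBit) == 0;
    }
}
```

Because the flag and the window live in the same word, a consumer can never subtract from the window after observing a stale "not aborted" value: the compare-exchange fails if either part changed in the meantime.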

I ran two kinds of performance tests: a standard BenchmarkDotNet (BDN) benchmark, which is part of this PR, and a test using the following Tester class, which is based on the earlier discussions of this PR.

public class Tester()
{
    internal readonly InputFlowControl _flowControl = new(10000, 10);
    private const int Spin = 50;
    private int total = 0;
    private bool counterEnabled = false;

    internal async Task Run(InputFlowControl control)
    {
        for (int i = 0; i < 8; i++)
        {
            total = 0;
            bool abort = false;
            control.Reset();
            var t1 = Task.Factory.StartNew(() =>
            {
                int counter = 0;
                while (!abort)
                {
                    if (counterEnabled) counter++;
                    control.TryUpdateWindow(16, out _);
                }
                Interlocked.Add(ref total, counter);
            }, TaskCreationOptions.LongRunning);

            void Consumer()
            {
                int counter = 0;
                while (!abort)
                {
                    if (counterEnabled) counter++;
                    if (control.TryAdvance(1))
                    {
                        for (int j = 0; j < Spin; j++)
                        {
                        }
                    }
                }
                Interlocked.Add(ref total, counter);
            }

            var t2 = Task.Factory.StartNew(Consumer, TaskCreationOptions.LongRunning);
            var t3 = Task.Factory.StartNew(Consumer, TaskCreationOptions.LongRunning);
            var t4 = Task.Factory.StartNew(Consumer, TaskCreationOptions.LongRunning);

            Thread.Sleep(300);

            control.Reset();
            Interlocked.Exchange(ref counterEnabled, true);
            Thread.Sleep(1000);
            Interlocked.Exchange(ref abort, true);
            await Task.WhenAll(t1, t2, t3, t4);
            Console.WriteLine($"{control.GetType().Name}: {total},");
        }
    }
}

In the results below:

  • InputFlowControlState uses Interlocked.
  • InputFlowControl uses lock.
  • Spin (50) can be increased or reduced; it is a placeholder for any other work that might happen between the locked operations (i.e. sending data on the H2 streams).
  • Note that the testing can be a bit tricky: if the flow control runs out of available window, it throws an exception (regardless of the type of lock being used). On my machine the provided initial input window and update size (for TryUpdateWindow) were 'just' enough to avoid this state, but on a different CPU these values might need adjusting.
  • In the Tester class, a higher counter value means more operations were performed.
InputFlowControl: 8921766,
InputFlowControl: 11849858,
InputFlowControl: 11833087,
InputFlowControl: 11729804,
InputFlowControl: 11639123,
InputFlowControl: 11758042,
InputFlowControl: 11570816,
InputFlowControl: 11501363,
InputFlowControlState: 18889513,
InputFlowControlState: 21042510,
InputFlowControlState: 21810964,
InputFlowControlState: 22845625,
InputFlowControlState: 22820625,
InputFlowControlState: 22374220,
InputFlowControlState: 22050317,
InputFlowControlState: 24744676,

Executing the benchmarks with BDN (part of the PR):

BEFORE:

|                          Method |     Mean |    Error |   StdDev |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 75.01 ms | 1.471 ms | 2.972 ms | 13.33 |     - |     - |     - |      4 KB |
|                          Method |     Mean |    Error |   StdDev |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 73.27 ms | 1.465 ms | 3.594 ms | 13.65 |     - |     - |     - |      3 KB |
|                          Method |     Mean |    Error |   StdDev |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 86.33 ms | 1.682 ms | 2.667 ms | 11.58 |     - |     - |     - |      3 KB |

AFTER:

|                          Method |     Mean |    Error |   StdDev |   Median |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 23.74 ms | 2.717 ms | 8.011 ms | 19.71 ms | 42.13 |     - |     - |     - |      4 KB |
|                          Method |     Mean |    Error |   StdDev |   Median |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 23.73 ms | 2.831 ms | 8.346 ms | 18.65 ms | 42.13 |     - |     - |     - |      3 KB |
|                          Method |     Mean |    Error |   StdDev |  Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |---------:|---------:|---------:|------:|------:|------:|------:|----------:|
| ThreadsAdvanceWithWindowUpdates | 26.97 ms | 3.037 ms | 8.954 ms | 37.08 |     - |     - |     - |      3 KB |
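
For a quick summary of the numbers above, the following sketch averages the Tester counts and the BDN mean times and reports the ratios. The values are copied from the runs listed above; the ResultSummary class and its member names are only illustrative.

```csharp
using System;
using System.Linq;

// Illustrative helper: summarizes the results quoted in this PR description.
public static class ResultSummary
{
    // Tester loop counts per run (lock-based InputFlowControl).
    public static readonly long[] LockCounts =
    {
        8921766, 11849858, 11833087, 11729804,
        11639123, 11758042, 11570816, 11501363,
    };

    // Tester loop counts per run (Interlocked-based InputFlowControlState).
    public static readonly long[] InterlockedCounts =
    {
        18889513, 21042510, 21810964, 22845625,
        22820625, 22374220, 22050317, 24744676,
    };

    // BDN mean times in milliseconds for ThreadsAdvanceWithWindowUpdates.
    public static readonly double[] BeforeMeansMs = { 75.01, 73.27, 86.33 };
    public static readonly double[] AfterMeansMs = { 23.74, 23.73, 26.97 };

    // How many times more operations the Interlocked version completed on average.
    public static double TesterThroughputRatio() =>
        InterlockedCounts.Average() / LockCounts.Average();

    // How many times shorter the average BDN mean time is after the change.
    public static double BdnSpeedup() =>
        BeforeMeansMs.Average() / AfterMeansMs.Average();
}
```

Printing `ResultSummary.TesterThroughputRatio()` and `ResultSummary.BdnSpeedup()` gives roughly 1.9x and 3.2x respectively. The AFTER runs report a much larger StdDev than the BEFORE runs, so the BDN ratio should be read as approximate.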

Fixes #56794

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Sep 19, 2024
Contributor

Thanks for your PR, @ladeak. Someone from the team will get assigned to your PR shortly and we'll get it reviewed.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 19, 2024
Member

@amcasey amcasey left a comment

I have to admit I'm curious whether this has any effect on a real benchmark.

Available -= bytes;
}

// bytes can be negative when SETTINGS_INITIAL_WINDOW_SIZE decreases mid-connection.
Member

If I'm reading this comment correctly, allowing it to go negative is important. How can it be a uint now?

Contributor Author

@ladeak ladeak Sep 26, 2024

At first I thought this was wrong, but it actually works because: https://sharplab.io/#v2:EYLgtghglgdgNAFxAJwK4wD4AEBMBGAWAChjYEACCcgXnIGYAGAbmIgFpqAWZ4rPATgAUgsgEpBAGwD2MAObjUYiKKZA

I am looking into making it more explicit.
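
The contents of the SharpLab snippet are not reproduced here. A plausible reading, stated as an assumption, is that it relies on wrapping (modulo 2^32) arithmetic: adding a negative int that has been cast to uint yields the same bit pattern as signed addition. A minimal sketch with made-up helper names:

```csharp
using System;

// Hypothetical helpers illustrating wrapping arithmetic on an unsigned window.
public static class WindowMath
{
    // Adds a signed delta to an unsigned window using modulo 2^32 arithmetic.
    public static uint AddSigned(uint available, int delta) =>
        unchecked(available + (uint)delta);

    // Reinterprets the unsigned bit pattern as a signed value.
    public static int AsSigned(uint available) => unchecked((int)available);
}
```

With these helpers, `WindowMath.AddSigned(100, -30)` is 70. If the window drops below zero, for example `WindowMath.AddSigned(10, -30)`, the stored uint is a large value whose signed interpretation via `AsSigned` is -20, and a later positive update brings it back into range. Any comparison against the window has to use the signed interpretation for this to behave correctly.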

Contributor Author

I updated the code and added tests to cover this use-case.

{
currentFlow = _flow; // Copy
Member

This doesn't need to be a volatile read or something?

Contributor Author

I have based the implementation on the documentation: https://learn.microsoft.com/en-us/dotnet/api/system.threading.interlocked.compareexchange?view=net-8.0 (see the examples)

I know there is another implementation, which honestly 'reads' safer to me. I am happy to update if that is preferred:

do
{
    startValue = currentValue;
    desiredValue = f(startValue);
    currentValue = Interlocked.CompareExchange(ref target, desiredValue, startValue);
}
while (startValue != currentValue);
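
For reference, a self-contained version of this retry pattern could look like the sketch below. The AtomicUpdate class and Apply method are hypothetical names, and the initial Volatile.Read is just one conservative choice; the thread does not settle whether it is required. Since CompareExchange returns the value it actually found, a stale initial read only costs one extra iteration.

```csharp
using System;
using System.Threading;

// Hypothetical helper wrapping the compare-exchange retry loop shown above.
public static class AtomicUpdate
{
    // Atomically replaces `target` with f(target) and returns the stored value.
    // `f` may run more than once under contention, so it should have no side effects.
    public static long Apply(ref long target, Func<long, long> f)
    {
        long currentValue = Volatile.Read(ref target); // initial snapshot
        long startValue;
        long desiredValue;
        do
        {
            startValue = currentValue;
            desiredValue = f(startValue);
            // Returns the value found in `target`; equals startValue only if the swap happened.
            currentValue = Interlocked.CompareExchange(ref target, desiredValue, startValue);
        }
        while (startValue != currentValue);

        return desiredValue;
    }
}
```

For example, `AtomicUpdate.Apply(ref counter, v => v + 1)` increments a shared `long counter` without taking a lock.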

}
} while (currentPendingSize != Interlocked.CompareExchange(ref _pendingUpdateSize, computedPendingUpdateSize, currentPendingSize));
Member

In the locking version, the same thread that updated _flow updated _pendingUpdateSize; now it seems like different threads could update them. It's not obvious to me whether that matters.

Contributor Author

As far as I can tell it does not, but I am not sure if I missed any use-case.

Member

amcasey commented Sep 25, 2024

This seems to have been @halter73's proposal

@ladeak ladeak force-pushed the ladeak-56794-interlocked branch 2 times, most recently from c4580f7 to 0c39914 Compare September 26, 2024 18:35
@ladeak ladeak requested a review from amcasey September 26, 2024 18:38
Contributor

Looks like this PR hasn't been active for some time and the codebase could have been changed in the meantime.
To make sure no conflicting changes have occurred, please rerun validation before merging. You can do this by leaving an /azp run comment here (requires commit rights), or by simply closing and reopening.

@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Oct 3, 2024
Development

Successfully merging this pull request may close these issues.

Use System.Threading.Lock throughout ASP.NET Core