Reduce the size of the pipe #49270

Merged
merged 2 commits into from
Mar 8, 2021

Conversation

davidfowl
Member

@davidfowl davidfowl commented Mar 7, 2021

  • Use the pipe itself as the synchronization object
  • Store the options instance as a way to reference shared settings
  • Added a field to PipeOptions for storing if the Pool is the ArrayPool implementation of the MemoryPool
  • Reduce the size of PipeCompletion since triggering callbacks is deprecated
  • Change the default size of the segment stack to 4 segments (16K of memory, versus the 64K it is today). This also matches the pause writer threshold.

This makes it possible to share common options across multiple pipe instances and not pay that extra size cost.
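
As a rough illustration of the sharing described above, the sketch below builds two pipes from one `PipeOptions` instance. It uses only the public `System.IO.Pipelines` API and assumes that package (or a framework that includes it) is referenced. Whether a pipe keeps a reference to the options object or copies its values is an internal detail of the library, so the example only shows the calling pattern, not the layout change itself.

```csharp
using System;
using System.IO.Pipelines;

// One options instance, created once and handed to every pipe.
var sharedOptions = new PipeOptions(useSynchronizationContext: false);

var transportToApp = new Pipe(sharedOptions);
var appToTransport = new Pipe(sharedOptions);

// The pipes stay independent even though they were built from the same settings:
// writing to one does not make data appear in the other.
await transportToApp.Writer.WriteAsync(new byte[] { 1, 2, 3 });
ReadResult result = await transportToApp.Reader.ReadAsync();
long bytesRead = result.Buffer.Length;
transportToApp.Reader.AdvanceTo(result.Buffer.End);

bool otherHasData = appToTransport.Reader.TryRead(out _);

Console.WriteLine($"Read {bytesRead} bytes from the first pipe");
Console.WriteLine($"Second pipe has data: {otherHasData}");
Console.WriteLine($"Shared pause threshold: {sharedOptions.PauseWriterThreshold}");
```

Creating the options once outside the connection loop, as the simulation below does, is what lets every pipe point at the same settings object.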

Using the ObjectLayoutInspector:

Before
Size: 368 bytes. Paddings: 27 bytes (%7 of empty space)

After
Size: 256 bytes. Paddings: 22 bytes (%8 of empty space)
Size: 264 bytes. Paddings: 22 bytes (%8 of empty space)

Simulation of 1M connections using pipes:

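// Assumed declaration (not shown in the original): holds every pipe pair so none can be collected.
var _connections = new List<(Pipe, Pipe)>();
var options = new PipeOptions(useSynchronizationContext: false);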

for (int i = 0; i < 1_000_000; i++)
{
    var transportToApp = new Pipe(options);
    var appToTransport = new Pipe(options);

    static async Task DoRead1(PipeReader reader)
    {
        var result = await reader.ReadAsync();
        reader.AdvanceTo(result.Buffer.End);
    }

    static async Task DoRead2(PipeReader reader)
    {
        var result = await reader.ReadAsync();
        reader.AdvanceTo(result.Buffer.End);
    }

    _ = DoRead1(transportToApp.Reader);
    _ = DoRead2(appToTransport.Reader);

    _connections.Add((transportToApp, appToTransport));
}

Before:

Name Working Set
ConsoleApp28.exe 1,590,136 K

After:

Name Working Set
ConsoleApp28.exe 1,197,036 K
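
To put the two working-set figures side by side, here is a small back-of-the-envelope calculation. It assumes the "K" column means units of 1,024 bytes and that the whole difference can be attributed to the 2,000,000 pipes in the simulation; both are assumptions, not statements from the measurements.

```csharp
using System;

const long beforeK = 1_590_136;      // working set before the change
const long afterK = 1_197_036;       // working set after the change
const int connections = 1_000_000;
const int pipesPerConnection = 2;

long savedK = beforeK - afterK;
double savedPercent = 100.0 * savedK / beforeK;
double savedBytesPerConnection = savedK * 1024.0 / connections;
double savedBytesPerPipe = savedBytesPerConnection / pipesPerConnection;

// Object-layout saving reported by ObjectLayoutInspector (368 -> 256 bytes).
int layoutBytesSavedPerPipe = 368 - 256;

Console.WriteLine($"Working set saved: {savedK:N0} K ({savedPercent:F1}%)");
Console.WriteLine($"Per connection:    {savedBytesPerConnection:F1} bytes");
Console.WriteLine($"Per pipe:          {savedBytesPerPipe:F1} bytes");
Console.WriteLine($"Layout saving:     {layoutBytesSavedPerPipe} bytes per pipe");
```

Under those assumptions the drop is about 393,100 K, or roughly 200 bytes per pipe. That is more than the 112 bytes saved in the `Pipe` object layout alone, so part of the gain presumably comes from allocations outside the object itself, such as the smaller initial segment stack.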

@davidfowl
Member Author

Something is broken but I'm having a hard time running tests locally 😢

Member

@benaadams benaadams left a comment

Before 368 bytes. After 256 bytes

Nice! 30% saving, and options is likely an object shared by all the pipes, so going via the ref should mean it's in cache for all pipelines rather than being a different 48-byte block for each Pipe; and since PipeOptions is immutable, it seems safe.

Does that mean SocketAsyncEventArgs is now top chonk at 320 bytes?

@davidfowl
Member Author

Yep, I am looking into the per-connection memory and SocketAsyncEventArgs is up there.

- Use the pipe itself as the synchronization object
- Store the options instance as a way to reference shared settings
- Added a field to PipeOptions for storing if the Pool is the ArrayPool implementation of the MemoryPool
- Shrink PipeAwaitable in the common case
  - Move the ExecutionContext and SynchronizationContext into a type called SchedulingContext. These types are mostly used with async/await and it's extremely rare to have to capture any of this state.
- Shrink the size of PipeCompletion
  - Since completion callbacks are deprecated they are rarely set. We remove the pool and the other fields and just store a list (which should be rarely used now).
- Reduce the default segment pool size to 4 items = 16K buffered
  - The original size was optimized to avoid pool resizes, but we need to balance idle memory against the potential cost of a resize.
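
The "shrink in the common case" idea from the list above can be sketched roughly as follows. This is not the runtime's code: `SchedulingContextSketch` and `AwaitableSketch` are hypothetical names made up for illustration, and the real `PipeAwaitable` has more state and different members. The point is only the pattern of replacing two rarely used inline fields with one reference that stays null unless something actually needs to be captured.

```csharp
using System;
using System.Threading;

var awaitable = new AwaitableSketch();
awaitable.OnCompleted(_ => { }, null, captureContext: true);

// A plain console app has no SynchronizationContext, so nothing extra is allocated.
Console.WriteLine($"Allocated scheduling state: {awaitable.HasSchedulingContext}");

// Hypothetical holder for the rarely captured state (illustrative only).
sealed class SchedulingContextSketch
{
    public SynchronizationContext SynchronizationContext;
    public ExecutionContext ExecutionContext;
}

// Hypothetical awaitable: one nullable reference instead of two inline fields.
struct AwaitableSketch
{
    private Action<object> _completion;
    private object _completionState;
    private SchedulingContextSketch _schedulingContext; // null in the common case

    public bool HasSchedulingContext => _schedulingContext != null;

    public void OnCompleted(Action<object> continuation, object state, bool captureContext)
    {
        _completion = continuation;
        _completionState = state;

        if (!captureContext)
        {
            return;
        }

        SynchronizationContext current = SynchronizationContext.Current;
        if (current != null)
        {
            // Pay for the extra object only when there is something to capture.
            _schedulingContext ??= new SchedulingContextSketch();
            _schedulingContext.SynchronizationContext = current;
        }
    }
}
```

The trade-off is one extra allocation on the rare path in exchange for a smaller struct embedded in every pipe.
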
@davidfowl
Member Author

OK I'm done tweaking this thing.

private readonly bool _useSynchronizationContext;
// The options instance
private readonly PipeOptions _options;
private readonly object _sync = new object();
Member

I'm in no way saying it's worth it, but if you really wanted to avoid this new object() and you wanted to use that buffer segment array you tried to use in a previous commit, you could instead still keep this field and just store that array into this field in the ctor. Then you're reusing the same array instance even if the buffer stack is resized. There are downsides, though, namely you'd be keeping alive what I assume is a much larger object, and if that array referenced other stuff, anything it referenced.
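
A minimal sketch of the suggestion in this comment, using a hypothetical `PipeSketch` class rather than the real `Pipe`: the sync field is kept, but it is pointed at an array the constructor allocates anyway, so no separate `new object()` is needed. The field and method names here are invented for illustration.

```csharp
using System;

var pipe = new PipeSketch();
Console.WriteLine(pipe.SyncIsCurrentStack); // True
pipe.Grow();
Console.WriteLine(pipe.SyncIsCurrentStack); // False

// Hypothetical stand-in for Pipe; only the locking pattern is of interest.
sealed class PipeSketch
{
    private readonly object _sync;
    private object[] _segmentStack;

    public PipeSketch()
    {
        _segmentStack = new object[4];
        // Reuse the array that is allocated anyway as the lock target.
        _sync = _segmentStack;
    }

    public bool SyncIsCurrentStack => ReferenceEquals(_sync, _segmentStack);

    public void Grow()
    {
        lock (_sync)
        {
            // Array.Resize swaps in a new array, but _sync still points at the
            // original one, so the lock identity is stable across resizes.
            Array.Resize(ref _segmentStack, _segmentStack.Length * 2);
        }
    }
}
```

After `Grow`, the original array is no longer the live stack but is still referenced by `_sync`, which is the downside the comment points out: the old array, and anything it references, stays alive for the lifetime of the pipe.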

Member

@stephentoub stephentoub left a comment

Some build failures, but otherwise LGTM

@davidfowl
Member Author

Is this crossgen issue a known thing?

@benaadams
Member

> Is this crossgen issue a known thing?

runtime-dev-innerloop (Build windows x86 release Runtime_Debug) is a timeout cull because it ran for more than 60 mins; I've seen it pop up a few times on PRs and go away. Looks like it's on the timeout edge?

e.g. runs for 58mins on successful builds

@davidfowl davidfowl added this to the 6.0.0 milestone Apr 2, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 3, 2021