Make getting a SourceText from an IDE parsed tree not hit the LOH. #73494

Merged: 22 commits into dotnet:main from the nodeToText branch, May 16, 2024


Member

@CyrusNajmabadi commented May 15, 2024

Takes us from:

[image: before]

To:

[image: after]

A drop of about 220MB in LOH allocations.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels May 15, 2024
private const int CharSegmentLength = 4096;

// 16k characters. Equivalent to 32KB in memory. comes from SourceText char buffer size and less than large object size
public const int SourceTextLengthThreshold = 32 * 1024 / sizeof(char);
Member Author

Note: we can double this safely. Anything under 32k chars (64KB) won't go in the LOH either, since the LOH threshold is 85,000 bytes.
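The arithmetic behind that note can be checked with a short sketch. It assumes the documented default .NET large object heap threshold of 85,000 bytes and ignores the small fixed array header; `LargeObjectThresholdBytes` and `payloadBytes` are illustrative names, not identifiers from the PR.

```csharp
using System;

// Assumption: 85,000 bytes is the default .NET large object heap threshold.
const int LargeObjectThresholdBytes = 85_000;

// The doubled value discussed in this thread, measured in chars.
const int SourceTextLengthThreshold = 32 * 1024;

// Each element of a char[] takes sizeof(char) == 2 bytes.
int payloadBytes = SourceTextLengthThreshold * sizeof(char);

Console.WriteLine($"{SourceTextLengthThreshold} chars = {payloadBytes} bytes");
Console.WriteLine(payloadBytes < LargeObjectThresholdBytes
    ? "below the LOH threshold"
    : "would land in the LOH");
```

Under these assumptions a 32k-char buffer is 65,536 bytes, which leaves roughly 19KB of headroom below the threshold.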

@CyrusNajmabadi CyrusNajmabadi changed the title WIP: LOH allocs Make getting a SourceText from an IDE parsed tree not hit the LOH. May 15, 2024
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review May 15, 2024 21:54
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner May 15, 2024 21:54
@CyrusNajmabadi CyrusNajmabadi requested a review from ToddGrun May 15, 2024 21:54
@CyrusNajmabadi
Member Author

@ToddGrun ptal.

(CSharpParseOptions)options,
FilePath,
Encoding,
_checksumAlgorithm);
Member Author

just wrapping. no change.

DirectCast(options, VisualBasicParseOptions),
FilePath,
Encoding,
_checksumAlgorithm))
Member Author

just wrapping. no change.

CSharpParseOptions options,
string filePath,
Encoding? encoding,
SourceHashAlgorithm checksumAlgorithm)
Member Author

just wrapping.

…rpSyntaxTreeFactoryService.ParsedSyntaxTree.cs
private const int CharArrayLength = 4 * 1024;

// 32k characters. Equivalent to 64KB in memory bytes. Will not be put into the LOH.
public const int SourceTextLengthThreshold = 32 * 1024;
Member Author

I doubled this value from what it was before.

Contract.ThrowIfTrue(chunks.Any(static (c, s) => c.Length != s, CharArrayLength));

using var chunkReader = new CharArrayChunkTextReader(chunks, totalLength);
var result = SourceText.From(chunkReader, totalLength, encoding, checksumAlgorithm);
Contributor

@ToddGrun May 15, 2024

SourceText.From

It feels like there is an inefficiency here. This SourceText.From call takes the reader you have built up over these chunks, detects it's large, and will call LargeText.Decode, which will create chunks over the data.

I feel like we're jumping through hoops to do this at this level, when maybe it should be at the compiler level. #Resolved

Member Author

Yes, but we have no efficient paths at all. We can separately explore a different way for the compiler to expose things (like wrapping an ImmutableArray<ImmutableArray<char>> chunk system), but that can happen outside of this PR.

Note: while any approach today involves multiple copies, this approach at least makes one of the copies non-LOH and also pooled.
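As a rough illustration of reading back out of fixed-size chunks, the sketch below shows the kind of copy loop a `TextReader.Read(char[], int, int)` override could use. It is not the PR's `CharArrayChunkTextReader`; `ReadFromChunks`, the tiny `ChunkLength`, and the sample data are all hypothetical.

```csharp
using System;
using System.Text;

const int ChunkLength = 4; // tiny for illustration; the PR uses 4 * 1024

// Copies up to `count` chars starting at `position` out of fixed-size chunks
// into `buffer`, advancing `position`. Returns the number of chars copied,
// or 0 once `totalLength` chars have been consumed.
int ReadFromChunks(char[][] chunks, int totalLength, ref int position, char[] buffer, int index, int count)
{
    int copied = 0;
    while (copied < count && position < totalLength)
    {
        char[] chunk = chunks[position / ChunkLength];
        int offset = position % ChunkLength;

        // Limit by what is left in this chunk, in the text, and in the request.
        int n = Math.Min(Math.Min(ChunkLength - offset, totalLength - position), count - copied);
        Array.Copy(chunk, offset, buffer, index + copied, n);
        copied += n;
        position += n;
    }
    return copied;
}

// Three chunks holding 10 meaningful chars; the tail of the last chunk is unused.
char[][] sampleChunks = { "abcd".ToCharArray(), "efgh".ToCharArray(), new[] { 'i', 'j', '\0', '\0' } };
int sampleLength = 10;
int readPosition = 0;
var readBuffer = new char[3];
var text = new StringBuilder();

int read;
while ((read = ReadFromChunks(sampleChunks, sampleLength, ref readPosition, readBuffer, 0, readBuffer.Length)) > 0)
    text.Append(readBuffer, 0, read);

Console.WriteLine(text.ToString()); // abcdefghij
```

Because every chunk has the same length, the chunk index and offset fall out of a division and a remainder, which is presumably why the PR asserts that all chunks match `CharArrayLength`.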

Contributor

Fair enough

Contract.ThrowIfNull(value);

var valueSpan = value.AsSpan();
while (valueSpan.Length > 0)
Contributor

@ToddGrun May 16, 2024

while (valueSpan.Length > 0)

feel free to ignore this, but I find something like the following a bit easier to read:

```
var index = 0;
foreach (var chunk in _chunks)
{
    var copyLength = Math.Min(value.Length - index, chunk.Length);
    value.CopyTo(index, chunk, 0, copyLength);

    index += chunk.Length;
}
```

#Resolved

Member Author

That doesn't seem correct, since you'd be restarting at the first chunk each time a string is written in.

Contributor

oh, duh, multiple write calls

Contributor

ok, my current rewrite of this doesn't really look simpler than what you've got
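To make the point of this thread concrete, here is a minimal sketch of a chunked write that keeps its position across calls, so a second `Write` continues where the first one stopped instead of restarting at the first chunk. The names (`Write`, `writeChunks`, `writePosition`) and the tiny chunk size are hypothetical and not taken from the PR.

```csharp
using System;

const int WriteChunkLength = 4; // tiny for illustration; the PR uses 4 * 1024
char[][] writeChunks = { new char[WriteChunkLength], new char[WriteChunkLength], new char[WriteChunkLength] };
int writePosition = 0; // persists across Write calls

// Copies `value` into the chunks starting at the current position,
// spilling into the next chunk whenever the current one fills up.
void Write(string value)
{
    ReadOnlySpan<char> valueSpan = value.AsSpan();
    while (valueSpan.Length > 0)
    {
        char[] chunk = writeChunks[writePosition / WriteChunkLength];
        int offset = writePosition % WriteChunkLength;
        int count = Math.Min(valueSpan.Length, WriteChunkLength - offset);

        valueSpan.Slice(0, count).CopyTo(chunk.AsSpan(offset, count));
        valueSpan = valueSpan.Slice(count);
        writePosition += count;
    }
}

Write("abc");   // fills most of the first chunk
Write("defgh"); // finishes the first chunk and fills the second
Write("ij");    // starts the third chunk

Console.WriteLine(new string(writeChunks[0])); // abcd
Console.WriteLine(new string(writeChunks[1])); // efgh
Console.WriteLine(writePosition);              // 10
```

This sketch assumes enough chunks were allocated up front and does no bounds checking beyond that.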

Contributor

@ToddGrun left a comment

:shipit:

@CyrusNajmabadi CyrusNajmabadi merged commit f3dccaa into dotnet:main May 16, 2024
25 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the nodeToText branch May 16, 2024 00:46
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone May 16, 2024
@Cosifne Cosifne modified the milestones: Next, 17.11 P2 May 28, 2024
3 participants