Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove double tree traversal in checksum syncing. #70302

Merged
merged 14 commits into from
Oct 12, 2023

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Oct 9, 2023

Note: more PRs to come. Just trying to break things into easy pieces.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Oct 9, 2023
@@ -10,7 +10,7 @@ namespace Microsoft.CodeAnalysis.Remote
/// <summary>
/// Represents a part of solution snapshot along with its checksum.
/// </summary>
internal readonly struct SolutionAsset
internal sealed class SolutionAsset
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this moved from a product concept to a test-only concept.


public async Task<T> GetValueAsync<T>(Checksum checksum)
{
var data = await GetRequiredAssetAsync(checksum).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is all test code, which expects to always find these values. so it's ok for it to put in these contract calls and requirements.

var data = await scope.GetTestAccessor().GetAssetAsync(checksum, CancellationToken.None).ConfigureAwait(false);
Contract.ThrowIfNull(data);
return new(checksum, data);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same. also tests.

Assert.Equal(1, solutionObject.Projects.Count);
await validator.VerifySnapshotInServiceAsync(validator.ToProjectObjects(solutionObject.Projects)[0], 0, 0, 0, 0, 0).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be explained below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this was a move. it was previously a product concept, but i made it into a test-only concept.

@@ -39,7 +39,6 @@ IEnumerator IEnumerable.GetEnumerator()

public void AddAllTo(HashSet<Checksum> checksums)
{
checksums.AddIfNotNullChecksum(this.Checksum);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will also be explained below.

if (searchingChecksumsLeft.Remove(Projects.Checksum))
{
result[Projects.Checksum] = Projects;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this it he meat of the change. The prior logic was simply crazy, and deeply complex confusing. This it the logic that says "hey... you give me a checksum, and i will find the corresponding real object associated with it in the actual solution/project/document state.

except, lines like thsi were basically saying "oh... your checksum itself refers to a checksum object (like a SolutionChecksums/ProjectChecksums/DocumentChecksums instance), so i'll add my own checksum child back into the result.

This was one of the source of multiple passes and expensiveness with checksums. When looking up project info, we'd both recurse into child projects as real 'Project' instances, and then again as checksum instances. Insanity.

All of htat is gone now. we just pass in checksums, and we get real workspace objects back out.

Copy link
Member Author

@CyrusNajmabadi CyrusNajmabadi Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is also what affected tests. which were testing a non-real scenario of asking for these checksums, then validating the checksum objects that were returned. all bogus things that don't reflect real scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an incoming PR that will make it so that this doesn't do the depth first search? (as you mentioned in the office the other day)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That comes in a follow-up

Debug.Assert(asset.Kind != WellKnownSynchronizationKind.Null, "We should not be sending null assets");
writer.WriteInt32((int)asset.Kind);
var kind = asset.GetWellKnownSynchronizationKind();
Contract.ThrowIfTrue(kind == WellKnownSynchronizationKind.Null);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: we have this contract virtually everywhere. But i'm still finding and stamping out code that is resilient to null being passed along.


await validator.VerifySynchronizationObjectInServiceAsync(solutionSyncObject).ConfigureAwait(false);

var solutionObject = await validator.GetValueAsync<SolutionStateChecksums>(checksum).ConfigureAwait(false);
await validator.VerifyChecksumInServiceAsync(solutionObject.Attributes, WellKnownSynchronizationKind.SolutionAttributes).ConfigureAwait(false);

var projectsSyncObject = await scope.GetTestAccessor().GetAssetAsync(solutionObject.Projects.Checksum, CancellationToken.None).ConfigureAwait(false);
await validator.VerifySynchronizationObjectInServiceAsync(projectsSyncObject).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be explained below.

internal enum WellKnownSynchronizationKind
{
Null,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this concept entirely. it was not allowed, and we had so many confusing checks everywhere around it.


foreach (var (checksum, value) in resultPool.Object)
{
result[checksum] = new SolutionAsset(checksum, value);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to lookup everything first as objects, then wrap as SolutinoAssets. all that wrapping/checking is now in the test layer only.

@CyrusNajmabadi CyrusNajmabadi changed the title WIP: Seeing what breaks. Remove double tree traversal in checksum syncing. Oct 12, 2023
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review October 12, 2023 00:44
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner October 12, 2023 00:44
@ToddGrun
Copy link
Contributor

        var value = values[i];

super nit: move inside condition


Refers to: src/Workspaces/Core/Portable/Workspace/Solution/ChecksumCollection.cs:84 in 7188f10. [](commit_id = 7188f10, deletion_comment = False)

@CyrusNajmabadi
Copy link
Member Author

super nit: move inside condition

Can do in followup

@CyrusNajmabadi
Copy link
Member Author

Seems like overkill to be checking in this loop, nothing slow going on in here

Can definitely tweak in follow-up:-)

Copy link
Contributor

@ToddGrun ToddGrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@CyrusNajmabadi CyrusNajmabadi merged commit 005c4c1 into dotnet:main Oct 12, 2023
24 checks passed
@ghost ghost added this to the Next milestone Oct 12, 2023
@CyrusNajmabadi CyrusNajmabadi deleted the checksumCleanup2 branch October 12, 2023 05:17
@CyrusNajmabadi
Copy link
Member Author

@ToddGrun i couldn't figure out what code a few of your comments were referring to. Could you clarify? I can incorporate the missing feedback into the next pr. thanks!

@CyrusNajmabadi
Copy link
Member Author

@ToddGrun still not clear what locations those refer to :-D

Dictionary<Checksum, object?> result,
CancellationToken cancellationToken)
Dictionary<Checksum, object> result,
CancellationToken cancellationToken) where T : class
{
Contract.ThrowIfFalse(values.Count == checksums.Children.Length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github doesn't seem to allow comments in sections of the file that weren't changed? For the two comments that were unclear, they were both in the loop below and were a bit nit-picky, so I don't care too much if they aren't taken:

  1. No need to check the cancellation token in the loop, nothing expensive going on in there.
  2. The .Count check could just be done inside the .Remove condition as it's status can't change unless the Remove condition succeeds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I can revise!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants