Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow server to pass along a specific project-scope it wants to sync data for. #70342

Merged
merged 13 commits into from
Oct 12, 2023

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Oct 11, 2023

This information allows the host side to search far less when looking for the matches for the information the server wants.

Note: more PRs are still coming past this point.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Oct 11, 2023
@CyrusNajmabadi CyrusNajmabadi changed the title WIP: More checksum work (cont1) Allow server to pass along a specific project-scope it wants to sync data for. Oct 12, 2023
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review October 12, 2023 17:58
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner October 12, 2023 17:58
@@ -27,7 +27,7 @@ private sealed class AssetProvider : AbstractAssetProvider
public AssetProvider(SerializationValidator validator)
=> _validator = validator;

public override async ValueTask<T> GetAssetAsync<T>(Checksum checksum, CancellationToken cancellationToken)
public override async ValueTask<T> GetAssetAsync<T>(ProjectId? hintProject, Checksum checksum, CancellationToken cancellationToken)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can pass this 'hint' along to tell the host where to search, isntead of searching everything.

Note: in a followup PR thsi becomes more generic, so we can also scope down to the document level as well (so it's trivial to get data for things like a document, without having to do a full solution sweep).

// It's possible not all all our projects have checksums. Specifically, we may have only been
// asked to compute the checksum tree for a subset of projects that were all that a feature needed.
if (projectState.TryGetStateChecksums(out var projectStateChecksums))
await projectStateChecksums.FindAsync(projectState, searchingChecksumsLeft, result, cancellationToken).ConfigureAwait(false);
}

ChecksumCollection.Find(state.AnalyzerReferences, AnalyzerReferences, searchingChecksumsLeft, result, cancellationToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by moving this above the projects, we don't bother searching into documents when someone is just trying to sync the solution-checksum-info in isolation.

var solutionChecksums = await GetAssetAsync<SolutionStateChecksums>(solutionChecksum, cancellationToken).ConfigureAwait(false);
var solutionAttributes = await GetAssetAsync<SolutionInfo.SolutionAttributes>(solutionChecksums.Attributes, cancellationToken).ConfigureAwait(false);
var solutionChecksums = await GetAssetAsync<SolutionStateChecksums>(hintProject: null, solutionChecksum, cancellationToken).ConfigureAwait(false);
var solutionAttributes = await GetAssetAsync<SolutionInfo.SolutionAttributes>(hintProject: null, solutionChecksums.Attributes, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no hints here (because this is solution info, not project info). However, with teh reordering we did in the solution-finding code, this will all be cheap. that's because we search the direct (non-project-child) info of the solution first, before recursing. So we'll always find this data in O(1) time, instead of O(solution) time.


return SolutionInfo.Create(
solutionAttributes.Id, solutionAttributes.Version, solutionAttributes.FilePath, projects.ToImmutableAndClear(), analyzerReferences).WithTelemetryId(solutionAttributes.TelemetryId);
}

public async Task<ProjectInfo?> CreateProjectInfoAsync(Checksum projectChecksum, CancellationToken cancellationToken)
{
var projectChecksums = await GetAssetAsync<ProjectStateChecksums>(projectChecksum, cancellationToken).ConfigureAwait(false);
var projectChecksums = await GetAssetAsync<ProjectStateChecksums>(hintProject: null, projectChecksum, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this is a place where passing a project-id would be better. a followup PR accomplishes this.


var attributes = await GetAssetAsync<ProjectInfo.ProjectAttributes>(projectChecksums.Info, cancellationToken).ConfigureAwait(false);
var projectId = projectChecksums.ProjectId;
var attributes = await GetAssetAsync<ProjectInfo.ProjectAttributes>(hintProject: projectId, projectChecksums.Info, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this PR, we don't know the projectID without this roundtrip. A future PR fixes that, so we have that info up front.

var serializableSourceText = await GetAssetAsync<SerializableSourceText>(documentSnapshot.Text, cancellationToken).ConfigureAwait(false);
var documentSnapshot = await GetAssetAsync<DocumentStateChecksums>(hintProject: projectId, documentChecksum, cancellationToken).ConfigureAwait(false);
var attributes = await GetAssetAsync<DocumentInfo.DocumentAttributes>(hintProject: projectId, documentSnapshot.Info, cancellationToken).ConfigureAwait(false);
var serializableSourceText = await GetAssetAsync<SerializableSourceText>(hintProject: projectId, documentSnapshot.Text, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this scopes us down to a particular project to find this data. but we still need to search the whole project to find the document info being asked for. a followup PR addresses this by allowing the doc-id to be passed along as well to narrow the search even further.

{
// The responsibility is on us (as per the requirements of RemoteCallback.InvokeAsync) to Complete the
// pipewriter. This will signal to streamjsonrpc that the writer passed into it is complete, which will
// allow the calling side know to stop reading results.
Exception? exception = null;
try
{
await WriteAssetsWorkerAsync(pipeWriter, solutionChecksum, checksums, cancellationToken).ConfigureAwait(false);
await WriteAssetsWorkerAsync(pipeWriter, solutionChecksum, hintProject, checksums, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a lot of hte change is just passing this data along.

@@ -477,7 +489,8 @@ private async Task<Project> UpdateDocumentAsync(TextDocument document, DocumentS
// changed text
if (oldDocumentChecksums.Text != newDocumentChecksums.Text)
{
var serializableSourceText = await _assetProvider.GetAssetAsync<SerializableSourceText>(newDocumentChecksums.Text, cancellationToken).ConfigureAwait(false);
var serializableSourceText = await _assetProvider.GetAssetAsync<SerializableSourceText>(
hintProject: document.Project.Id, newDocumentChecksums.Text, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a future change makes it so we can scope this down to a particular document, not jsut a project.

// This ensures that when we are trying to sync the projects referenced by a SolutionStateChecksums' instance
// that we don't unnecessarily walk all documents looking just for those.

foreach (var (projectId, projectState) in state.ProjectStates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state.ProjectStates

I wonder if the hintProject case would be better by just creating an ImmutableDictionary with the one entry in it and enumerating over it in the two loops here, rather than iterating over all the project states

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changes in a followup pr. i use the hint project to just index directly into state.ProjectStates.

@@ -19,28 +19,29 @@ internal abstract class AbstractAssetProvider
/// <summary>
/// return data of type T whose checksum is the given checksum
/// </summary>
public abstract ValueTask<T> GetAssetAsync<T>(Checksum checksum, CancellationToken cancellationToken);
public abstract ValueTask<T> GetAssetAsync<T>(ProjectId? hintProject, Checksum checksum, CancellationToken cancellationToken);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProjectId? hintProject,

If you see other hint data coming down the line, would it make sense for this to be a struct so everywhere piping it through wouldn't need to change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that happens in the followup :)

/// <summary>
/// This service provide a way to get roslyn objects from checksum
/// </summary>
internal sealed partial class AssetProvider(Checksum solutionChecksum, SolutionAssetCache assetCache, IAssetSource assetSource, ISerializerService serializerService)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial

Is there another part to this class that I'm not seeing in this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nm, I see that the ChecksumSynchronizer moved inside. Perhaps rename that file?

@@ -402,7 +412,8 @@ private async Task<Project> UpdateProjectInfoAsync(Project project, Checksum inf
lazyDocumentsToAdd ??= ImmutableArray.CreateBuilder<DocumentInfo>();

// we have new document added
var documentInfo = await _assetProvider.CreateDocumentInfoAsync(newDocumentChecksums.Checksum, cancellationToken).ConfigureAwait(false);
var documentInfo = await _assetProvider.CreateDocumentInfoAsync(
project.Id, newDocumentChecksums.Checksum, cancellationToken).ConfigureAwait(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

project.Id

nit: everywhere else seems to be name param'ing this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can fixup.

Copy link
Contributor

@ToddGrun ToddGrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@arunchndr arunchndr merged commit 210bd5a into dotnet:main Oct 12, 2023
@ghost ghost added this to the Next milestone Oct 12, 2023
@CyrusNajmabadi CyrusNajmabadi deleted the checksumCleanup6 branch October 12, 2023 21:34
@jjonescz jjonescz modified the milestones: Next, 17.9 P1 Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants