Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2514 - Use optimistic concurrency to prevent multiple UpdateIsLatest calls on same package #3548

Merged
merged 15 commits into from
Feb 28, 2017

Conversation

chenriksson
Copy link
Member

Approach is same as discussed, but implementation is more complex due to our need to restrict the optimistic concurrency check to CUD operations only on specific columns - which EF doesn't support.

@@ -253,7 +266,8 @@ where VersionRange.Parse(d.VersionSpec).Satisfies(packageVersion)

await PublishPackageAsync(package, commitChanges);
}


// caller is responsible for calling UpdateIsLatestAsync in separate transaction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be xmldoc so callers actually see it. Put it on the interface.

var altEntitiesContext = CreateNewEntitiesContext();

// suspend retry execution strategy which does not support user initiated transactions
EntitiesConfiguration.SuspendExecutionStrategy = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is altEntitiesContext.GetDatabase(); throws? Won't this static stay true? In these cases, I like making an IDisposable which toggles the static on and off.

Another concern -- won't this effect other threads?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't affect other threads. SuspendExecutionStrategy uses CallContext which is similar to thread local storage. Note that this is also used in PackageDeleteService, which also uses custom transactions.

I wasn't thinking about GetDatabase throwing, but perhaps it could if the connection was somehow closed. I'll just expand the scope of the try/finally, which I think will be less work than introducing a disposable.


var packageRegistration = sets.First().Package.PackageRegistration;

if (retryCount++ < UpdateIsLatestMaxRetries)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please increment on a separate line. This is less readable.

}
else
{
_trace.Error(String.Format("UpdateIsLatestAsync retry exceeded for package '{0}'", packageRegistration.Id));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is really bad, right? What will the playbook be if we encounter one of these exceptions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd hit this block if the request detected 3 concurrency conflicts in a row, in which case it would not update the flags.

This is unlikely to be a problem, since the real package CUD was already committed in a separate request. More than likely a concurrent request has already updated isLatest/isLatestStable, and the retry would detect that no changes are necessary.

This trace would at least give us monitoring if there are multiple concurrency conflicts in a row. Playbook would be to just check the package to see if the latest is set correctly... but more than likely it corrected itself.

@joelverhagen
Copy link
Member

I'm interested in what a functional test on this change would look like. Before the change, could we have written a test to consistently reproduce the issue (by mocking out as much as possible and throwing enough threads at it)?

@@ -393,6 +393,9 @@ public virtual Task<ActionResult> CreatePackagePost()
throw;
}

// handle in separate transaction because of concurrency check with retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: capital letter in the beginning of the comment to match the rest of the file


protected internal virtual IEntitiesContext CreateNewEntitiesContext()
{
var connectionString = _entitiesContext.GetDatabase().Connection.ConnectionString;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use the connection string from web.config: https://github.com/NuGet/NuGetGallery/blob/master/src/NuGetGallery.Core/Entities/EntitiesContext.cs#L25

return new EntitiesContext() should do the trick

pv.IsLatest = false;
pv.IsLatestStable = false;
pv.LastUpdated = DateTime.UtcNow;
packageClears.Add(UpdateIsLatestPackageEdit.Set(pv, false, false));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use named parameters, for readability

@skofman1
Copy link
Contributor

                // a release version and set it to IsLatest.

isLatestStable.


Refers to: src/NuGetGallery/Services/PackageService.cs:835 in 7f861cd. [](commit_id = 7f861cd, deletion_comment = False)

if ((retryCount == 0) && (clears != null))
{
MergeUpdateIsLatestClearsWithSets(clears, sets);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove conditional around MergeUpdateIsLatestClearsWithSets. Originally I had bug where retry was calling back into UpdateIsLatestInDatabaseAsync without refresh and recalculating the latest.

@@ -779,5 +898,40 @@ private void NotifyIndexingService()
}
}
}

private class UpdateIsLatestPackageEdit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to add few comments about this class.

return database.ExecuteSqlCommandAsync(sql, parameters);
}

private async Task<int> UpdateIsLatestInDatabaseWithConcurrencyCheckAsync(Database database, UpdateIsLatestPackageEdit packageEdit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: line too long.

packageEdit.Package.Key, packageEdit.Package.IsLatest, packageEdit.Package.IsLatestStable, packageEdit.OriginalIsLatest, packageEdit.OriginalIsLatestStable);
}

private static bool PackagesMatch(Package first, Package second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe PackagesVersionMatch instead of the PackagesMatch?

}
}

private static void MergeUpdateIsLatestClearsWithSets(List<UpdateIsLatestPackageEdit> clears, List<UpdateIsLatestPackageEdit> sets)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have few comments. Also the line length seems to be large.

return database.ExecuteSqlCommandAsync(sql, parameters);
}

private async Task<int> UpdateIsLatestInDatabaseWithConcurrencyCheckAsync(Database database, UpdateIsLatestPackageEdit packageEdit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int [](start = 27, length = 3)

returning true/false will make the code more readable.


if (retryCount++ < UpdateIsLatestMaxRetries)
{
await Task.Delay(retryCount * 500);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500 [](start = 62, length = 3)

const

if (retryCount++ < UpdateIsLatestMaxRetries)
{
await Task.Delay(retryCount * 500);
await UpdateIsLatestAsync(packageRegistration, retryCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateIsLatestAsync [](start = 38, length = 19)

this code could be simpler if the retries were done in UpdateIsLatestAsync. In this case UpdateIsLatestInDatabaseAsync calls the method that called it (UpdateIsLatestAsync), that calls it again, creating a recursion..
If UpdateIsLatestInDatabaseAsync returned a failure, UpdateIsLatestAsync can call it again until retries are exhausted.

@skofman1
Copy link
Contributor

        public async Task WillUpdateIsLatest1()

This naming is extremely confusing. Could you rename those?


Refers to: tests/NuGetGallery.Facts/Services/PackageServiceFacts.cs:1074 in 7f861cd. [](commit_id = 7f861cd, deletion_comment = False)

@skofman1
Copy link
Contributor

Regarding functional tests: Scott wrote a test that pushes a bunch of packages and verified timestamps:

public async Task PackagesAppearInFeedInOrderTest()

You could do something similar

}

private Task UpdateIsLatestAsync(PackageRegistration packageRegistration, int retryCount)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the retryCount used in this method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on retry of concurrency conflict in UpdateIsLatestInDatabaseAsync


private static bool PackagesMatch(Package first, Package second)
{
if (first == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be more concise and readable if you changed this to...

if (first == null || second == null)
{
    return first != second;
}

return first.Version.Equals(second.Version, StringComparison.OrdinalIgnoreCase);


public Package Package { get; private set; }

public bool OriginalIsLatest { get; set; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and OriginalIsLatestStable should have a private set because they should not be changed after being initialized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These may change in MergeUpdateIsLatestClearsWithSets

@@ -678,28 +685,149 @@ private void ValidatePackageTitle(PackageMetadata packageMetadata)
}
}

public async Task UpdateIsLatestAsync(PackageRegistration packageRegistration, bool commitChanges = true)
protected internal virtual Task<int> ExecuteSqlCommandAsync(Database database, string sql, params object[] parameters)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge this function with UpdateIsLatestInDatabaseWithConcurrencyCheckAsync? The code should fit inside one function just fine since we don't use it anywhere else.

@@ -470,6 +473,13 @@ private static ActionResult BadRequestForExceptionMessage(Exception ex)
}

await PackageService.MarkPackageUnlistedAsync(package);

// handle in separate transaction because of concurrency check with retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize this comment just like you did this one and also add a period to the end (if the rest of the file does that).

@@ -470,6 +473,13 @@ private static ActionResult BadRequestForExceptionMessage(Exception ex)
}

await PackageService.MarkPackageUnlistedAsync(package);

// handle in separate transaction because of concurrency check with retry
if (package.IsLatest || package.IsLatestStable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest putting UpdateIsLatestAsync back inside MarkPackageListedAsync but after the commit. You always call the two together.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment for MarkPackageUnlistedAsync. While this method also supports not committing changes, only test code does it. So, I could remove that option and move UpdateIsLatestAsync back inside, but I'm inclined to keep these methods consistent. Let me know if you disagree.

/// <summary>
/// Updates IsLatest/IsLatestStable flags after a package CUD operation.
///
/// Database updates are applied on a separate context to better control refresh of entities when
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if this comment were more specific. We want better control BECAUSE we were having issues with multiple requests operating on the isLatest fields of the package simultaneously and leading to multiple packages being marked as the latest or the latest prerelease.

{
return;
}

// TODO: improve setting the latest bit; this is horrible. Trigger maybe?
// performed updates on separate context in order to refresh entities between retries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize please.

{
EntitiesConfiguration.SuspendExecutionStrategy = false;

var disposableContext = altEntitiesContext as IDisposable;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move the declaration of altEntitiesContext inside the try-finally and then use a using statement instead of having this in the finally statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and you don't need to cast. DbContext implements IDisposable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to support the Mock object used by tests, which wasn't IDisposable. Will see if there's another way.

packageRepository.Verify(r => r.CommitChangesAsync(), Times.Never());
}

[Fact]
public async Task CommitIfCommitChangesIsTrue()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming this to AlwaysCommitsChanges because the commit changes option was removed from the function.

}

[Fact]
public async Task WillUpdateIsLatest3()
Copy link
Contributor

@scottbommarito scottbommarito Feb 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some unit tests that test the behavior when transactions fail, and enumerate different possible scenarios. You can do this by mocking the UpdateIsLatestInDatabaseWithConcurrencyCheckAsync method and performing different behaviors based on the UpdateIsLatestPackageEdit object supplied.

return new ExecutionStrategySuspension();
}

private class ExecutionStrategySuspension : IDisposable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RetriableExecutionStrategySuspension

@skofman1
Copy link
Contributor

Looks great! :shipit:

{
get
{
return (bool?)CallContext.LogicalGetData("SuspendExecutionStrategy") ?? false;
return (bool?)CallContext.LogicalGetData("UseRetriableExecutionStrategy") ?? false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use nameof(UseRetriableExecutionStrategy).

Also, as part of the logic inversion, the fallback value here should be true not false.

@@ -581,6 +581,8 @@ public class TheEditMethod
.Throws(new Exception("Shouldn't be called"));
packageService.Setup(svc => svc.FindPackageByIdAndVersion("Foo", "1.0", true))
.Returns(package).Verifiable();
packageService.Setup(svc => svc.UpdateIsLatestAsync(It.IsAny<PackageRegistration>()))
.Returns(Task.FromResult(0)).Verifiable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Task.CompletedTask also work? I think it's more clear than Task.FromResult(0).

@@ -1607,6 +1609,8 @@ public class TheVerifyPackageActionForPostRequests
.Returns(Task.CompletedTask);
fakePackageService.Setup(x => x.MarkPackageUnlistedAsync(fakePackage, false))
.Returns(Task.CompletedTask);
fakePackageService.Setup(x => x.UpdateIsLatestAsync(It.IsAny<PackageRegistration>()))
.Returns(Task.FromResult(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Task.CompletedTask?

}

[Fact]
public async Task UpdateIsLatestAsync_SameLatestAndLatestStableVersionsWithClear()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SameNewLatestAndLatestStableVersions sounds more clear than SameLatestAndLatestStableVersionsWithClear.

@chenriksson
Copy link
Member Author

@scottbommarito look ok to you?

WriteLine($"Package data does not contain '{propertyName}' timestamp!");
return null;
}
var propertiesStart = Math.Max(packageResponse.IndexOf("<m:properties>"), 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use XmlDocument or XDocument to read the XML?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that using XmlDocument or XDocument would also remove the potential for the regex to take an extremely long time as @blowdart noted.


var timestampStartIndex = timestampTagIndex + timestampStartTag.Length;
var timestampLength = packageResponse.Substring(timestampStartIndex).IndexOf(timestampEndTag);
var propertyRegEx = new Regex(GetPackagePropertyRegexPattern);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure there's a timeout on all regexs

Copy link
Contributor

@scottbommarito scottbommarito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple nitpicks

@@ -51,9 +50,9 @@ public class PackagesControllerFacts
if (uploadFileService == null)
{
uploadFileService = new Mock<IUploadFileService>();
uploadFileService.Setup(x => x.DeleteUploadFileAsync(It.IsAny<int>())).Returns(Task.FromResult(0));
uploadFileService.Setup(x => x.DeleteUploadFileAsync(It.IsAny<int>())).Returns(Task.CompletedTask);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for doing this...so much cleaner

indexingService.Object,
packageNamingConflictValidator,
auditingService);

packageService.Setup(s => s.TryUpdateIsLatestInDatabase(It.IsAny<IEntitiesContext>()))
.Returns(Task.FromResult(true));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Task.CompletedTask?

Copy link
Member Author

@chenriksson chenriksson Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only switched to Task.CompletedTask for async methods w/o return values. Since TryUpdateIsLatestInDatabase returns Task<bool>, I want to return true to look like the db update was successful.

WriteLine($"Package data does not contain '{propertyName}' timestamp!");
return null;
}
var propertiesStart = Math.Max(packageResponse.IndexOf("<m:properties>"), 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that using XmlDocument or XDocument would also remove the potential for the regex to take an extremely long time as @blowdart noted.

}
Task.WaitAll(concurrentTasks);

await CheckPackageLatestVersions(packageId, packageVersions, expectedLatest: "7.0.2-abc", expectedLatestStable: "7.0.1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that you had these values in a variable instead of copying them here.

var latestVersion = "7.0.2-abc"
var latestStable = "7.0.1"

var latestVersionHalf = "3.0.2-abc"
var latestStableHalf = "3.0.1"

var packageVersions = new List<string>()
{
    "1.0.0-a",  "1.0.0-b",  "1.0.0",  "1.0.1",  "1.0.2-abc",
    "2.0.0-a",  "2.0.0-b",  "2.0.0",  "2.0.1",  "2.0.2-abc",
    "3.0.0-a",  "3.0.0-b",  "3.0.0",  latestStableHalf,  latestVersionHalf,
    "4.0.0-a",  "4.0.0-b",  "4.0.0",  "4.0.1",  "4.0.2-abc",
    "6.0.0-a",  "6.0.0-b",  "6.0.0",  "6.0.1",  "6.0.2-abc",
    "7.0.0-a",  "7.0.0-b",  "7.0.0",  latestStable,  latestVersion
};
...
await CheckPackageLatestVersions(packageId, packageVersions, latestVersion, latestStable);
...
// unlist last half and verify; ~1-2 concurrency conflicts seen in testing
for (int i = concurrentTasks.Length - 1; i > concurrentTasks.IndexOf(latestVersionHalf); i--)
...
await CheckPackageLatestVersions(packageId, packageVersions, latestVersionHalf, latestStableHalf);

// unlist remaining and verify; ~1-2 concurrency conflicts seen in testing
for (int i = concurrentTasks.IndexOf(latestVersionHalf); i >= 0; i--)

private async Task CheckPackageTimestampsInOrder(List<string> packageIds, string timestampPropertyName,
DateTime operationStartTimestamp)
DateTime operationStartTimestamp, bool packagesListed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that you're adding this to my test as well as yours, but this parameter is unnecessary because you can determine whether or not the packages are listed based on the feed itself. If the package has a Published property that is greater than DateTime.MinValue (pretty sure it's January 1st 1900) then it is listed. You can parse this and then determine whether or not it should be latest or latest stable. I tried to do this initially but didn't realize that Published represented the listed status in the feed.

You could of course change this into a "expectListed" parameter that verifies that the listed status from the feed is identical to what we expect.

@@ -795,6 +795,7 @@ protected internal virtual IEntitiesContext CreateNewEntitiesContext()
int retryCount = 1;
do
{
await Task.Delay(retryCount * 250);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to reduce the chance of retry collisions, I would recommend introducing randomization into the sleep duration

@chenriksson chenriksson force-pushed the chenriks/islatest-optcon branch from 6d787b1 to 315179b Compare February 28, 2017 17:48
@chenriksson chenriksson merged commit 27ea9bc into dev Feb 28, 2017
@chenriksson chenriksson deleted the chenriks/islatest-optcon branch February 28, 2017 19:12
chenriksson added a commit that referenced this pull request Mar 25, 2017
…dateIsLatest calls on same package (#3548)"

This reverts commit 27ea9bc.
chenriksson added a commit that referenced this pull request Mar 27, 2017
* Revert "UpdateIsLatest concurrent unlist fix (#3695)"

This reverts commit 551fd86.

* Revert "Fix concurrent push test by disabling search hijacking on feed (#3641)"

This reverts commit 00fb3fb.

* Revert "IsLatest Fix: wrong connection string passed to retry context (#3632)"

This reverts commit 1b6b5b0.

* Revert "Fix #2514 - Use optimistic concurrency to prevent multiple UpdateIsLatest calls on same package (#3548)"

This reverts commit 27ea9bc.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants