-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #2514 - Use optimistic concurrency to prevent multiple UpdateIsLatest calls on same package #3548
Conversation
@@ -253,7 +266,8 @@ where VersionRange.Parse(d.VersionSpec).Satisfies(packageVersion) | |||
|
|||
await PublishPackageAsync(package, commitChanges); | |||
} | |||
|
|||
|
|||
// caller is responsible for calling UpdateIsLatestAsync in separate transaction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be xmldoc so callers actually see it. Put it on the interface.
var altEntitiesContext = CreateNewEntitiesContext(); | ||
|
||
// suspend retry execution strategy which does not support user initiated transactions | ||
EntitiesConfiguration.SuspendExecutionStrategy = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is altEntitiesContext.GetDatabase();
throws? Won't this static stay true
? In these cases, I like making an IDisposable
which toggles the static on and off.
Another concern -- won't this effect other threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't affect other threads. SuspendExecutionStrategy uses CallContext which is similar to thread local storage. Note that this is also used in PackageDeleteService, which also uses custom transactions.
I wasn't thinking about GetDatabase throwing, but perhaps it could if the connection was somehow closed. I'll just expand the scope of the try/finally, which I think will be less work than introducing a disposable.
|
||
var packageRegistration = sets.First().Package.PackageRegistration; | ||
|
||
if (retryCount++ < UpdateIsLatestMaxRetries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please increment on a separate line. This is less readable.
} | ||
else | ||
{ | ||
_trace.Error(String.Format("UpdateIsLatestAsync retry exceeded for package '{0}'", packageRegistration.Id)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is really bad, right? What will the playbook be if we encounter one of these exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd hit this block if the request detected 3 concurrency conflicts in a row, in which case it would not update the flags.
This is unlikely to be a problem, since the real package CUD was already committed in a separate request. More than likely a concurrent request has already updated isLatest/isLatestStable, and the retry would detect that no changes are necessary.
This trace would at least give us monitoring if there are multiple concurrency conflicts in a row. Playbook would be to just check the package to see if the latest is set correctly... but more than likely it corrected itself.
I'm interested in what a functional test on this change would look like. Before the change, could we have written a test to consistently reproduce the issue (by mocking out as much as possible and throwing enough threads at it)? |
@@ -393,6 +393,9 @@ public virtual Task<ActionResult> CreatePackagePost() | |||
throw; | |||
} | |||
|
|||
// handle in separate transaction because of concurrency check with retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: capital letter in the beginning of the comment to match the rest of the file
|
||
protected internal virtual IEntitiesContext CreateNewEntitiesContext() | ||
{ | ||
var connectionString = _entitiesContext.GetDatabase().Connection.ConnectionString; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use the connection string from web.config: https://github.com/NuGet/NuGetGallery/blob/master/src/NuGetGallery.Core/Entities/EntitiesContext.cs#L25
return new EntitiesContext() should do the trick
pv.IsLatest = false; | ||
pv.IsLatestStable = false; | ||
pv.LastUpdated = DateTime.UtcNow; | ||
packageClears.Add(UpdateIsLatestPackageEdit.Set(pv, false, false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use named parameters, for readability
if ((retryCount == 0) && (clears != null)) | ||
{ | ||
MergeUpdateIsLatestClearsWithSets(clears, sets); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to remove conditional around MergeUpdateIsLatestClearsWithSets. Originally I had bug where retry was calling back into UpdateIsLatestInDatabaseAsync without refresh and recalculating the latest.
@@ -779,5 +898,40 @@ private void NotifyIndexingService() | |||
} | |||
} | |||
} | |||
|
|||
private class UpdateIsLatestPackageEdit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to add few comments about this class.
return database.ExecuteSqlCommandAsync(sql, parameters); | ||
} | ||
|
||
private async Task<int> UpdateIsLatestInDatabaseWithConcurrencyCheckAsync(Database database, UpdateIsLatestPackageEdit packageEdit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: line too long.
packageEdit.Package.Key, packageEdit.Package.IsLatest, packageEdit.Package.IsLatestStable, packageEdit.OriginalIsLatest, packageEdit.OriginalIsLatestStable); | ||
} | ||
|
||
private static bool PackagesMatch(Package first, Package second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe PackagesVersionMatch instead of the PackagesMatch?
} | ||
} | ||
|
||
private static void MergeUpdateIsLatestClearsWithSets(List<UpdateIsLatestPackageEdit> clears, List<UpdateIsLatestPackageEdit> sets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to have few comments. Also the line length seems to be large.
return database.ExecuteSqlCommandAsync(sql, parameters); | ||
} | ||
|
||
private async Task<int> UpdateIsLatestInDatabaseWithConcurrencyCheckAsync(Database database, UpdateIsLatestPackageEdit packageEdit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int [](start = 27, length = 3)
returning true/false will make the code more readable.
|
||
if (retryCount++ < UpdateIsLatestMaxRetries) | ||
{ | ||
await Task.Delay(retryCount * 500); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
500 [](start = 62, length = 3)
const
if (retryCount++ < UpdateIsLatestMaxRetries) | ||
{ | ||
await Task.Delay(retryCount * 500); | ||
await UpdateIsLatestAsync(packageRegistration, retryCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UpdateIsLatestAsync [](start = 38, length = 19)
this code could be simpler if the retries were done in UpdateIsLatestAsync. In this case UpdateIsLatestInDatabaseAsync calls the method that called it (UpdateIsLatestAsync), that calls it again, creating a recursion..
If UpdateIsLatestInDatabaseAsync returned a failure, UpdateIsLatestAsync can call it again until retries are exhausted.
Regarding functional tests: Scott wrote a test that pushes a bunch of packages and verified timestamps: NuGetGallery/tests/NuGetGallery.FunctionalTests/ODataFeeds/V2FeedExtendedTests.cs Line 64 in ce6c5d6
You could do something similar |
} | ||
|
||
private Task UpdateIsLatestAsync(PackageRegistration packageRegistration, int retryCount) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the retryCount used in this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, on retry of concurrency conflict in UpdateIsLatestInDatabaseAsync
|
||
private static bool PackagesMatch(Package first, Package second) | ||
{ | ||
if (first == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more concise and readable if you changed this to...
if (first == null || second == null)
{
return first != second;
}
return first.Version.Equals(second.Version, StringComparison.OrdinalIgnoreCase);
|
||
public Package Package { get; private set; } | ||
|
||
public bool OriginalIsLatest { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and OriginalIsLatestStable
should have a private set because they should not be changed after being initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These may change in MergeUpdateIsLatestClearsWithSets
@@ -678,28 +685,149 @@ private void ValidatePackageTitle(PackageMetadata packageMetadata) | |||
} | |||
} | |||
|
|||
public async Task UpdateIsLatestAsync(PackageRegistration packageRegistration, bool commitChanges = true) | |||
protected internal virtual Task<int> ExecuteSqlCommandAsync(Database database, string sql, params object[] parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we merge this function with UpdateIsLatestInDatabaseWithConcurrencyCheckAsync
? The code should fit inside one function just fine since we don't use it anywhere else.
@@ -470,6 +473,13 @@ private static ActionResult BadRequestForExceptionMessage(Exception ex) | |||
} | |||
|
|||
await PackageService.MarkPackageUnlistedAsync(package); | |||
|
|||
// handle in separate transaction because of concurrency check with retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalize this comment just like you did this one and also add a period to the end (if the rest of the file does that).
@@ -470,6 +473,13 @@ private static ActionResult BadRequestForExceptionMessage(Exception ex) | |||
} | |||
|
|||
await PackageService.MarkPackageUnlistedAsync(package); | |||
|
|||
// handle in separate transaction because of concurrency check with retry | |||
if (package.IsLatest || package.IsLatestStable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest putting UpdateIsLatestAsync
back inside MarkPackageListedAsync
but after the commit. You always call the two together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment for MarkPackageUnlistedAsync. While this method also supports not committing changes, only test code does it. So, I could remove that option and move UpdateIsLatestAsync back inside, but I'm inclined to keep these methods consistent. Let me know if you disagree.
/// <summary> | ||
/// Updates IsLatest/IsLatestStable flags after a package CUD operation. | ||
/// | ||
/// Database updates are applied on a separate context to better control refresh of entities when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better if this comment were more specific. We want better control BECAUSE we were having issues with multiple requests operating on the isLatest
fields of the package simultaneously and leading to multiple packages being marked as the latest or the latest prerelease.
{ | ||
return; | ||
} | ||
|
||
// TODO: improve setting the latest bit; this is horrible. Trigger maybe? | ||
// performed updates on separate context in order to refresh entities between retries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalize please.
{ | ||
EntitiesConfiguration.SuspendExecutionStrategy = false; | ||
|
||
var disposableContext = altEntitiesContext as IDisposable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move the declaration of altEntitiesContext
inside the try-finally
and then use a using
statement instead of having this in the finally
statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and you don't need to cast. DbContext
implements IDisposable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to support the Mock object used by tests, which wasn't IDisposable. Will see if there's another way.
packageRepository.Verify(r => r.CommitChangesAsync(), Times.Never()); | ||
} | ||
|
||
[Fact] | ||
public async Task CommitIfCommitChangesIsTrue() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest renaming this to AlwaysCommitsChanges
because the commit changes option was removed from the function.
} | ||
|
||
[Fact] | ||
public async Task WillUpdateIsLatest3() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some unit tests that test the behavior when transactions fail, and enumerate different possible scenarios. You can do this by mocking the UpdateIsLatestInDatabaseWithConcurrencyCheckAsync
method and performing different behaviors based on the UpdateIsLatestPackageEdit
object supplied.
return new ExecutionStrategySuspension(); | ||
} | ||
|
||
private class ExecutionStrategySuspension : IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RetriableExecutionStrategySuspension
Looks great! |
{ | ||
get | ||
{ | ||
return (bool?)CallContext.LogicalGetData("SuspendExecutionStrategy") ?? false; | ||
return (bool?)CallContext.LogicalGetData("UseRetriableExecutionStrategy") ?? false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use nameof(UseRetriableExecutionStrategy)
.
Also, as part of the logic inversion, the fallback value here should be true
not false
.
@@ -581,6 +581,8 @@ public class TheEditMethod | |||
.Throws(new Exception("Shouldn't be called")); | |||
packageService.Setup(svc => svc.FindPackageByIdAndVersion("Foo", "1.0", true)) | |||
.Returns(package).Verifiable(); | |||
packageService.Setup(svc => svc.UpdateIsLatestAsync(It.IsAny<PackageRegistration>())) | |||
.Returns(Task.FromResult(0)).Verifiable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Task.CompletedTask
also work? I think it's more clear than Task.FromResult(0)
.
@@ -1607,6 +1609,8 @@ public class TheVerifyPackageActionForPostRequests | |||
.Returns(Task.CompletedTask); | |||
fakePackageService.Setup(x => x.MarkPackageUnlistedAsync(fakePackage, false)) | |||
.Returns(Task.CompletedTask); | |||
fakePackageService.Setup(x => x.UpdateIsLatestAsync(It.IsAny<PackageRegistration>())) | |||
.Returns(Task.FromResult(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Task.CompletedTask
?
} | ||
|
||
[Fact] | ||
public async Task UpdateIsLatestAsync_SameLatestAndLatestStableVersionsWithClear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SameNewLatestAndLatestStableVersions
sounds more clear than SameLatestAndLatestStableVersionsWithClear
.
@scottbommarito look ok to you? |
WriteLine($"Package data does not contain '{propertyName}' timestamp!"); | ||
return null; | ||
} | ||
var propertiesStart = Math.Max(packageResponse.IndexOf("<m:properties>"), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use XmlDocument
or XDocument
to read the XML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that using XmlDocument
or XDocument
would also remove the potential for the regex to take an extremely long time as @blowdart noted.
|
||
var timestampStartIndex = timestampTagIndex + timestampStartTag.Length; | ||
var timestampLength = packageResponse.Substring(timestampStartIndex).IndexOf(timestampEndTag); | ||
var propertyRegEx = new Regex(GetPackagePropertyRegexPattern); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure there's a timeout on all regexs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple nitpicks
@@ -51,9 +50,9 @@ public class PackagesControllerFacts | |||
if (uploadFileService == null) | |||
{ | |||
uploadFileService = new Mock<IUploadFileService>(); | |||
uploadFileService.Setup(x => x.DeleteUploadFileAsync(It.IsAny<int>())).Returns(Task.FromResult(0)); | |||
uploadFileService.Setup(x => x.DeleteUploadFileAsync(It.IsAny<int>())).Returns(Task.CompletedTask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for doing this...so much cleaner
indexingService.Object, | ||
packageNamingConflictValidator, | ||
auditingService); | ||
|
||
packageService.Setup(s => s.TryUpdateIsLatestInDatabase(It.IsAny<IEntitiesContext>())) | ||
.Returns(Task.FromResult(true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Task.CompletedTask
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only switched to Task.CompletedTask for async methods w/o return values. Since TryUpdateIsLatestInDatabase returns Task<bool>
, I want to return true to look like the db update was successful.
WriteLine($"Package data does not contain '{propertyName}' timestamp!"); | ||
return null; | ||
} | ||
var propertiesStart = Math.Max(packageResponse.IndexOf("<m:properties>"), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that using XmlDocument
or XDocument
would also remove the potential for the regex to take an extremely long time as @blowdart noted.
} | ||
Task.WaitAll(concurrentTasks); | ||
|
||
await CheckPackageLatestVersions(packageId, packageVersions, expectedLatest: "7.0.2-abc", expectedLatestStable: "7.0.1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer that you had these values in a variable instead of copying them here.
var latestVersion = "7.0.2-abc"
var latestStable = "7.0.1"
var latestVersionHalf = "3.0.2-abc"
var latestStableHalf = "3.0.1"
var packageVersions = new List<string>()
{
"1.0.0-a", "1.0.0-b", "1.0.0", "1.0.1", "1.0.2-abc",
"2.0.0-a", "2.0.0-b", "2.0.0", "2.0.1", "2.0.2-abc",
"3.0.0-a", "3.0.0-b", "3.0.0", latestStableHalf, latestVersionHalf,
"4.0.0-a", "4.0.0-b", "4.0.0", "4.0.1", "4.0.2-abc",
"6.0.0-a", "6.0.0-b", "6.0.0", "6.0.1", "6.0.2-abc",
"7.0.0-a", "7.0.0-b", "7.0.0", latestStable, latestVersion
};
...
await CheckPackageLatestVersions(packageId, packageVersions, latestVersion, latestStable);
...
// unlist last half and verify; ~1-2 concurrency conflicts seen in testing
for (int i = concurrentTasks.Length - 1; i > concurrentTasks.IndexOf(latestVersionHalf); i--)
...
await CheckPackageLatestVersions(packageId, packageVersions, latestVersionHalf, latestStableHalf);
// unlist remaining and verify; ~1-2 concurrency conflicts seen in testing
for (int i = concurrentTasks.IndexOf(latestVersionHalf); i >= 0; i--)
private async Task CheckPackageTimestampsInOrder(List<string> packageIds, string timestampPropertyName, | ||
DateTime operationStartTimestamp) | ||
DateTime operationStartTimestamp, bool packagesListed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that you're adding this to my test as well as yours, but this parameter is unnecessary because you can determine whether or not the packages are listed based on the feed itself. If the package has a Published
property that is greater than DateTime.MinValue
(pretty sure it's January 1st 1900) then it is listed. You can parse this and then determine whether or not it should be latest or latest stable. I tried to do this initially but didn't realize that Published
represented the listed status in the feed.
You could of course change this into a "expectListed" parameter that verifies that the listed status from the feed is identical to what we expect.
@@ -795,6 +795,7 @@ protected internal virtual IEntitiesContext CreateNewEntitiesContext() | |||
int retryCount = 1; | |||
do | |||
{ | |||
await Task.Delay(retryCount * 250); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to reduce the chance of retry collisions, I would recommend introducing randomization into the sleep duration
6d787b1
to
315179b
Compare
* Revert "UpdateIsLatest concurrent unlist fix (#3695)" This reverts commit 551fd86. * Revert "Fix concurrent push test by disabling search hijacking on feed (#3641)" This reverts commit 00fb3fb. * Revert "IsLatest Fix: wrong connection string passed to retry context (#3632)" This reverts commit 1b6b5b0. * Revert "Fix #2514 - Use optimistic concurrency to prevent multiple UpdateIsLatest calls on same package (#3548)" This reverts commit 27ea9bc.
Approach is same as discussed, but implementation is more complex due to our need to restrict the optimistic concurrency check to CUD operations only on specific columns - which EF doesn't support.