Vcrrequest init error fix #1275

Merged: 7 commits, Oct 8, 2019

ankagrawal
Collaborator

This PR fixes a bug encountered while testing recovery. The VCR was trying to get the local replicas map using the "getReplicaIds" API in HelixClusterManager, but this API has a check that restricts it to an AmbryDataNode.
As a fix, this PR overrides the get-local-replicas functionality in VcrRequests. VCR nodes should also be able to handle requests for all partitions, and the overridden functionality exhibits this behavior too.

@ankagrawal ankagrawal force-pushed the vcrrequest_init_error_fix branch from f68088a to c658742 Compare October 7, 2019 18:58

codecov-io commented Oct 7, 2019

Codecov Report

Merging #1275 into master will increase coverage by 0.07%.
The diff coverage is 94.11%.

@@             Coverage Diff              @@
##             master    #1275      +/-   ##
============================================
+ Coverage      72.1%   72.17%   +0.07%     
- Complexity     6234     6237       +3     
============================================
  Files           449      449              
  Lines         35737    35733       -4     
  Branches       4540     4537       -3     
============================================
+ Hits          25769    25792      +23     
+ Misses         8791     8764      -27     
  Partials       1177     1177
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...in/java/com.github.ambry.cloud/CloudBlobStore.java | 82.77% <ø> (ø) | 66 <0> (ø) ⬇️ |
| ...in/java/com.github.ambry.server/AmbryRequests.java | 88.92% <100%> (+0.67%) | 121 <1> (ø) ⬇️ |
| .../main/java/com.github.ambry.cloud/VcrRequests.java | 100% <100%> (ø) | 10 <2> (+1) ⬆️ |
| ...va/com.github.ambry.cloud/CloudStorageManager.java | 96.87% <90%> (+7.98%) | 16 <4> (-2) ⬇️ |
| .../src/main/java/com.github.ambry.store/Journal.java | 90.76% <0%> (-1.54%) | 24% <0%> (-1%) |
| .../java/com.github.ambry.router/DeleteOperation.java | 93.28% <0%> (-1.5%) | 44% <0%> (-1%) |
| ...m.github.ambry.replication/ReplicationMetrics.java | 94.92% <0%> (-1.21%) | 43% <0%> (-1%) |
| ...ain/java/com.github.ambry.router/PutOperation.java | 91.21% <0%> (-0.51%) | 113% <0%> (-1%) |
| ...in/java/com.github.ambry.network/SocketServer.java | 85.12% <0%> (-0.42%) | 15% <0%> (ø) |
| ...src/main/java/com.github.ambry.commons/BlobId.java | 93.52% <0%> (-0.36%) | 71% <0%> (-1%) |

... and 9 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9e9b37f...12199df.

Contributor

@lightningrob lightningrob left a comment

Logic looks good. A few style comments to address.

partitionToReplica.put(partitionId, new CloudReplica(partitionId, currentNode));
}
return partitionToReplica;
}
Contributor

Can simplify this to:
return partitionIds.stream().collect(Collectors.toMap(Function.identity(), partitionId -> new CloudReplica(partitionId, currentNode)));
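
For readers comparing the two forms, here is a minimal, self-contained sketch of the loop and the suggested stream version side by side. `SketchPartition` and `SketchReplica` are hypothetical stand-ins for Ambry's `PartitionId` and `CloudReplica`, and the node is reduced to a plain string; none of this is the PR's actual code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a partition id (assumption, not Ambry's PartitionId).
final class SketchPartition {
  final long id;

  SketchPartition(long id) {
    this.id = id;
  }
}

// Hypothetical stand-in for a cloud replica (assumption, not Ambry's CloudReplica).
final class SketchReplica {
  final SketchPartition partition;
  final String node;

  SketchReplica(SketchPartition partition, String node) {
    this.partition = partition;
    this.node = node;
  }
}

final class LocalReplicaSketch {
  // Loop form, mirroring the shape of the snippet under review.
  static Map<SketchPartition, SketchReplica> withLoop(List<SketchPartition> partitionIds, String currentNode) {
    Map<SketchPartition, SketchReplica> partitionToReplica = new HashMap<>();
    for (SketchPartition partitionId : partitionIds) {
      partitionToReplica.put(partitionId, new SketchReplica(partitionId, currentNode));
    }
    return partitionToReplica;
  }

  // Stream form, mirroring the reviewer's suggestion.
  static Map<SketchPartition, SketchReplica> withStream(List<SketchPartition> partitionIds, String currentNode) {
    return partitionIds.stream()
        .collect(Collectors.toMap(Function.identity(), partitionId -> new SketchReplica(partitionId, currentNode)));
  }
}
```

One behavioral difference worth keeping in mind: `Collectors.toMap` without a merge function throws `IllegalStateException` on duplicate keys, whereas `Map.put` silently overwrites. For a list of distinct partition ids the two forms produce the same map.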

Collaborator Author

fixed.

for (ReplicaId replicaId : localReplicaIds) {
partitionToReplica.put(replicaId.getPartitionId(), replicaId);
}
return partitionToReplica;
Contributor

Same here.

Collaborator Author

fixed.

CloudBlobStore store = null;
try {
store = new CloudBlobStore(properties, partitionId, cloudDestination, clusterMap, vcrMetrics);
store.start();
Contributor

We should avoid side effects like starting a store inside a computeIfAbsent lambda.
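
A rough sketch of what this could look like, under stated assumptions: `SketchStore` and `StoreRegistrySketch` are hypothetical names, not Ambry classes, and the store is keyed by a plain string. The mapping function passed to `computeIfAbsent` only constructs the object; the start call happens after the map operation returns. The `ConcurrentHashMap` documentation asks that this computation be short and simple, since other updates to the map may block while it runs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical store with a start lifecycle (assumption, not Ambry's CloudBlobStore).
final class SketchStore {
  final String partition;
  private boolean started;

  SketchStore(String partition) {
    this.partition = partition;
  }

  // Idempotent: starting an already started store is a no-op.
  synchronized void start() {
    started = true;
  }

  synchronized boolean isStarted() {
    return started;
  }
}

final class StoreRegistrySketch {
  private final ConcurrentMap<String, SketchStore> partitionToStore = new ConcurrentHashMap<>();

  SketchStore createAndStartIfAbsent(String partition) {
    // The lambda only constructs the store: no I/O, no lifecycle changes.
    SketchStore store = partitionToStore.computeIfAbsent(partition, SketchStore::new);
    // The side effect (starting) runs outside the map computation.
    store.start();
    return store;
  }
}
```

With this split, another thread can briefly observe a store that is in the map but not yet started, so readers of the map would still need an `isStarted()` check.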

Collaborator Author

fixed.

Contributor

Add a log after the `CloudBlobStore` is created successfully?

Contributor

zzmao commented Oct 8, 2019

One minor comment.

LGTM.

@@ -67,8 +65,8 @@ public boolean shutdownBlobStore(PartitionId id) {

@Override
public Store getStore(PartitionId id) {
Store store = partitionToStore.get(id);
return (store != null && store.isStarted()) ? store : null;
createAndStartBlobStoreIfAbsent(id);
Contributor

As discussed offline, let's consider the calls to shutdownBlobStore and removeStore in VCR replication.

Collaborator Author

As a fix, I have made the CloudBlobStore operations thread-safe. Both removeStore and shutdownBlobStore remain as-is in the add/remove replica flow, but getStore will return null if the store is not started.
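
To illustrate the behavior described in this reply, here is a small sketch of a store manager in which shutdown and removal stay as separate steps and `getStore` returns null for a store that is missing or not started. All names (`LifecycleStore`, `StorageManagerSketch`, `addStore`) are hypothetical and the locking is deliberately simplistic; the actual change in the PR may be structured quite differently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical store whose lifecycle methods are synchronized (assumption, not Ambry's CloudBlobStore).
final class LifecycleStore {
  private boolean started;

  synchronized void start() {
    started = true;
  }

  synchronized void shutdown() {
    started = false;
  }

  synchronized boolean isStarted() {
    return started;
  }
}

final class StorageManagerSketch {
  private final ConcurrentMap<Long, LifecycleStore> partitionToStore = new ConcurrentHashMap<>();

  // Adds and starts a store; returns false if one is already registered for the partition.
  boolean addStore(long partitionId) {
    LifecycleStore store = new LifecycleStore();
    if (partitionToStore.putIfAbsent(partitionId, store) != null) {
      return false;
    }
    store.start();
    return true;
  }

  // Stops the store but leaves it registered.
  boolean shutdownBlobStore(long partitionId) {
    LifecycleStore store = partitionToStore.get(partitionId);
    if (store == null) {
      return false;
    }
    store.shutdown();
    return true;
  }

  // Unregisters the store; returns false if nothing was registered.
  boolean removeStore(long partitionId) {
    return partitionToStore.remove(partitionId) != null;
  }

  // Returns null when the store is missing or not started.
  LifecycleStore getStore(long partitionId) {
    LifecycleStore store = partitionToStore.get(partitionId);
    return (store != null && store.isStarted()) ? store : null;
  }
}
```

A caller on the request path would treat a null result from `getStore` as "partition unavailable" rather than assuming the store exists.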

…lication path (via add/remove partition) as well as requests path (getStore will add store for VCR if not found).
@cgtz cgtz merged commit 19b980e into linkedin:master Oct 8, 2019