Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support TabletMergeability updates and queries #5284

Merged
merged 10 commits into from
Jan 31, 2025

Conversation

cshannon
Copy link
Contributor

This is part 2 of the API changes for #5014 and is a follow on to #5187

putSplits() now supports updating existing tablets with the new TabletMergeability setting, as well as creating new splits. TabletMergeabilityInfo has been added to TabletInformation so a user can retrieve the TabletMergeability setting on a tablet as well as the insertion time and if the tablet is mergeable or not based on the current manager time.

A couple new RPCs were added to the Manager service including an RPC to update existing tablets with new TabletMergeability and to retrieve the current Manager time.

This is part 2 of the API changes for apache#5014 and is a follow on to apache#5187

putSplits() now supports updating existing tablets with the new
TabletMergeability setting, as well as creating new splits.
TabletMergeabilityInfo has been added to TabletInformation so a user can
retrieve the TabletMergeability setting on a tablet as well as the
insertion time and if the tablet is mergeable or not based on the
current manager time.

A couple new RPCs were added to the Manager service including an RPC to
update existing tablets with new TabletMergeability and to retrieve the
current Manager time.
@cshannon cshannon added this to the 4.0.0 milestone Jan 25, 2025
@cshannon cshannon requested a review from keith-turner January 25, 2025 17:40
@cshannon cshannon self-assigned this Jan 25, 2025
@cshannon
Copy link
Contributor Author

@keith-turner - I had a couple follow up questions/thoughts on the changes here:

  1. I believe most of the code assumes the TabletMergeability column always exists, but I'm not sure if we should be handling it missing. I don't recall if we talked about adding it on upgrade or not if not set.
  2. Should we allow setting a negative delay? A negative delay is confusing but would have the side effect of allowing a tablet to merge for a period of time (say the next 2 days) and then wouldn't allow it after that. I don't know that there is a use case for that but it could be supported because we are using an explicit "never" flag for the metadata and in the RPC calls so "never" isn't tied to -1 or a negative delay.
  3. Do you have any ideas on how to better test the API for retrieving TabletMergeabilityInfo on errors? The logic will retry splits if there are errors but I couldn't think of a great way to test it, at least with integration tests. Maybe we need something with mocking.
  4. I used a supplier to delay retrieving the current system time (it's shared across all the tablets that are part of the getTabletInformation() query) in case the information isn't required and to delay the current time until as late as possible and then it will be cached. We could get rid of this and just compute it and set it on retrieval.

@cshannon
Copy link
Contributor Author

cshannon commented Jan 25, 2025

I'm seeing the exact same issue with Spotbugs as described here #5256 (comment)

I opened #5285 to fix it, for now I cherry-picked the commit into this branch

GitHub actions has been running out of memory with some PRs during
SpotBugs execution and this will bump member to prevent that.
Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe most of the code assumes the TabletMergeability column always exists, but I'm not sure if we should be handling it missing. I don't recall if we talked about adding it on upgrade or not if not set.

I like assuming it always exists and resolving things at upgrade. Could mark everything as never at upgrade since we do not know. In the past when things were not resolved at upgrade and conditional logic was left in the code for old versions of Accumulo it caused really hard to test tech debt to accumulate.

Should we allow setting a negative delay? A negative delay is confusing but would have the side effect of allowing a tablet to merge for a period of time (say the next 2 days) and then wouldn't allow it after that. I don't know that there is a use case for that but it could be supported because we are using an explicit "never" flag for the metadata and in the RPC calls so "never" isn't tied to -1 or a negative delay.

I like the explicit never flag. Not sure about the negative delay. For something like that its easier to disallow it now and allow it later. Its harder to allow something and then disallow it later.

Do you have any ideas on how to better test the API for retrieving TabletMergeabilityInfo on errors? The logic will retry splits if there are errors but I couldn't think of a great way to test it, at least with integration tests. Maybe we need something with mocking.

Its probabilistic, but could create 10 threads that each add 100 splits to a table. Each thread adds some splits that are unique to it and some that are common to all threads. This should cause colllisions and retries in the code. After all the threads are done should be able to see an expected set of splits and mergabilites.

I used a supplier to delay retrieving the current system time (it's shared across all the tablets that are part of the getTabletInformation() query) in case the information isn't required and to delay the current time until as late as possible and then it will be cached. We could get rid of this and just compute it and set it on retrieval.

I like that model because it avoids an RPC per tablet. I made a suggestion kinda related to this for the API. For the API the javadoc can state its an estimate.

Comment on lines 57 to 74
/**
* Returns the time of the Manager when the TabletMergeability was set on the tablet. This will be
* empty if TabletMergeability is set to never
*
* @return the insertion time is set or else empty
*/
public Optional<Duration> getInsertionTime() {
return insertionTime;
}

/**
* Returns the current time of the Manager
*
* @return the current time
*/
public Duration getCurrentTime() {
return currentTime.get();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understanding what getInsertionTime() and getCurrentTime() mean would require a lot of knowledge of Accumulo internals. Tried to come up with something that requires less understanding and came up with the following, but not sure about the name of the method.

Suggested change
/**
* Returns the time of the Manager when the TabletMergeability was set on the tablet. This will be
* empty if TabletMergeability is set to never
*
* @return the insertion time is set or else empty
*/
public Optional<Duration> getInsertionTime() {
return insertionTime;
}
/**
* Returns the current time of the Manager
*
* @return the current time
*/
public Duration getCurrentTime() {
return currentTime.get();
}
/**
* If the tabletTabletMergeability is configured with a delay returns an estimate of the elapsed
* time since the delay was initially set. Returns optional.empty() if the tablet was configured
* with a TabletMergability of never.
*/
public Optional<Duration> elapsed() {
// its possible a the "current time" is read from the manager and then a tablets insertion time
// is read much later and this could cause a negative read, in this case set the elapsed time to zero.
// Avoiding this would require an RPC per a call to this method to get the latest manager time, so
// since this is an estimate return zero for this case.
return insertionTime.map(it -> currentTime.get().minus(it))
.map(elapsed -> elapsed.isNegative() ? Duration.ZERO : elapsed);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this method as I think having the elapsed time makes a lot of sense (I called getElapsed()) and I think the negative check is fine as a just in case to prevent issues, but I'm not sure it would actually happen in practice. The current time should be after the insertion time if the tablet existed and was in metadata when the user called getTabletInformation() so I'm not sure in what scenario it would actually be possible for the insertion time to be after the current time.

Copy link
Contributor

@keith-turner keith-turner Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario is based on changes happening during the scan plus the memoized supplier. Something like the following could happen.

  1. Thread_1 starts reading tablet info objects
  2. Thread_1 reads TabletMergeabilityInfo for tablet_X. This causes the memoized current steady time supplier to be set to 5.
  3. Thread_2 updates TabletMergeability for tablet_Y. It sets the insertion steady time for the update to 7.
  4. Thread_1 reads TabletMergeabilityInfo for tablet_Y. To compute elapsed it does 5-7=-2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably be good to work the example into the comment, maybe replacing most of the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that makes sense now, the confusion for me was I was thinking getTabletInformation() was going to load all the tablets first before returning, so the memoized current time would not be loaded until after that. But that method returns a stream and is backed by a scanner so your scenario is quite possible where there are updates during the scan that could end up causing that. I think it makes sense to work the example into the comment.

core/src/main/thrift/manager.thrift Outdated Show resolved Hide resolved
@cshannon
Copy link
Contributor Author

I believe most of the code assumes the TabletMergeability column always exists, but I'm not sure if we should be handling it missing. I don't recall if we talked about adding it on upgrade or not if not set.

I like assuming it always exists and resolving things at upgrade. Could mark everything as never at upgrade since we do not know. In the past when things were not resolved at upgrade and conditional logic was left in the code for old versions of Accumulo it caused really hard to test tech debt to accumulate.

Ok that makes sense, I went back and looked at my last PR and one thing I did do was set the default in the TabletMetadata builder to NEVER which is the same pattern that TabletAvailability uses (sets to ONDEMAND). We can still handle it on upgrade but at least if it was missing it would set a default and not fail.

Do you have any ideas on how to better test the API for retrieving TabletMergeabilityInfo on errors? The logic will retry splits if there are errors but I couldn't think of a great way to test it, at least with integration tests. Maybe we need something with mocking.

Its probabilistic, but could create 10 threads that each add 100 splits to a table. Each thread adds some splits that are unique to it and some that are common to all threads. This should cause colllisions and retries in the code. After all the threads are done should be able to see an expected set of splits and mergabilites.

That is a good idea I'll see what I can come up with for a test.

@@ -616,6 +620,54 @@ public void requestTabletHosting(TInfo tinfo, TCredentials credentials, String t
manager.hostOndemand(Lists.transform(extents, KeyExtent::fromThrift));
}

@Override
public List<TKeyExtent> updateTabletMergeability(TInfo tinfo, TCredentials credentials,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be good to return early if splits is empty within this method. I see that the calling code (TableOperationsImpl) also does this check and avoids calling updateTabletMergeability() if splits is empty but it might be good to add it here too so we don't have to rely on the calling code to do that in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way it is now is fine because it already handles the empty case pretty well. As you mentioned the calling code shouldn't ever hit the method because the client checks it, but if for some reason that changed and an empty map was submitted the server side I think it's fine because If splits are empty then the for loop won't do anything and no conditional mutations will be submitted or added so it just returns without doing anything.

@cshannon
Copy link
Contributor Author

@keith-turner - Can you take one more quick peak before I merge? I pushed up a new getRemaining() method

@Test
public void testAlways() {
var tmi = new TabletMergeabilityInfo(TabletMergeability.always(),
Optional.of(Duration.ofDays(1)), () -> Duration.ofDays(10));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The supplier for steady time actually makes testing easier. No need to mock a client context or something like that.

Optional.of(Duration.ofDays(1)), () -> Duration.ofDays(10));
assertTrue(tmi.isMergeable());
assertTrue(tmi.getTabletMergeability().isAlways());
assertTrue(tmi.getElapsed().orElseThrow().toNanos() > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this could be more specific or if that does not make sense because its always.

Suggested change
assertTrue(tmi.getElapsed().orElseThrow().toNanos() > 0);
assertEquals(Duration.ofDays(9), tmi.getElapsed().orElseThrow());

assertEquals(Duration.ofDays(1), tmi.getElapsed().orElseThrow());
assertEquals(tmi.getDelay().map(delay -> delay.minus(tmi.getElapsed().orElseThrow())),
tmi.getRemaining());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test inseration time being after current time.

Suggested change
// test insertion time after current steady time
var tmi3 = new TabletMergeabilityInfo(TabletMergeability.after(Duration.ofDays(2)),
Optional.of(Duration.ofDays(11)), () -> Duration.ofDays(10));
assertFalse(tmi3.isMergeable());
assertEquals(Duration.ofDays(2), tmi3.getDelay().orElseThrow());
assertEquals(Duration.ZERO, tmi3.getElapsed().orElseThrow());
assertEquals(Duration.ofDays(2), tmi3.getRemaining().orElseThrow());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea to add a test for this

@cshannon cshannon merged commit daed0fd into apache:main Jan 31, 2025
8 checks passed
@cshannon cshannon deleted the accumulo-5014-api2 branch January 31, 2025 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants