Adds some missing FATE functionality #5263
Conversation
- Previously, if a user configured fewer FATE threads to work on transactions (via the MANAGER_FATE_THREADPOOL_SIZE property), the pool size would not actually be decreased. These changes safely stop excess workers when the property value is decreased.
- Added test FatePoolResizeIT (tests both META and USER transactions: MetaFatePoolResizeIT and UserFatePoolResizeIT) to ensure the pool size is correctly increased and decreased with configuration changes.
- Fate.shutdown() was not waiting on termination of the fate pool watcher when needed. Added the missing wait.
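As a rough illustration of the "safely stop excess workers" idea (hypothetical names like TxRunnerSketch; this is not the PR's actual code), a worker can poll a stop flag between transactions so a watcher can ask just enough of them to exit when the configured size drops:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the PR's code: a FATE worker that can be asked to
// stop between transactions and deregisters itself when it exits.
class TxRunnerSketch implements Runnable {
  private final AtomicBoolean stopRequested = new AtomicBoolean(false);
  private final Set<TxRunnerSketch> registry;

  TxRunnerSketch(Set<TxRunnerSketch> registry) {
    this.registry = registry;
    registry.add(this);
  }

  @Override
  public void run() {
    try {
      while (!stopRequested.get()) {
        // claim and work on one FATE transaction here; a sleep stands in
        // for that work in this sketch
        try {
          Thread.sleep(100);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    } finally {
      registry.remove(this); // keeps the live-runner count accurate
    }
  }

  void flagStop() {
    stopRequested.set(true); // takes effect after the current transaction
  }

  static Set<TxRunnerSketch> newRegistry() {
    return ConcurrentHashMap.newKeySet();
  }
}
```

A periodic watcher task can then compare the registry size against the configured MANAGER_FATE_THREADPOOL_SIZE and either submit new runners or flag excess ones to stop.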
```diff
 final int configured = conf.getCount(Property.MANAGER_FATE_THREADPOOL_SIZE);
-final int needed = configured - pool.getActiveCount();
+final int needed = configured - transactionExecutor.getActiveCount();
```
Maybe instead of using the active count, we could use the size of the list and do cleanup on it beforehand, if the exiting check was added.
```diff
-final int needed = configured - transactionExecutor.getActiveCount();
+// remove any threads that have exited
+runningTxRunners.removeIf(tr -> tr.isExited());
+final int needed = configured - runningTxRunners.size();
```
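A minimal self-contained sketch of that suggestion, assuming a runner type with an exited flag (the Runner and neededRunners names here are illustrative, not the PR's code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the suggested bookkeeping: runners mark themselves
// exited, and the watcher prunes the list before computing how many more
// runners are needed.
class PoolSizeCheckSketch {
  static class Runner {
    private volatile boolean exited = false;

    void markExited() {
      exited = true; // set by the runner thread just before it returns
    }

    boolean isExited() {
      return exited;
    }
  }

  static int neededRunners(int configured, List<Runner> runningTxRunners) {
    // remove any runners that have exited, so size() reflects live threads;
    // ThreadPoolExecutor.getActiveCount() is only an approximation
    runningTxRunners.removeIf(Runner::isExited);
    return configured - runningTxRunners.size();
  }

  public static void main(String[] args) {
    List<Runner> running = new ArrayList<>();
    Runner a = new Runner();
    Runner b = new Runner();
    running.add(a);
    running.add(b);
    b.markExited();
    System.out.println(neededRunners(3, running)); // prints 2
  }
}
```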
Also wondering about creating a cached thread pool (see Executors.newCachedThreadPool()) and not bothering with resizing the pool; just add and stop stuff in the pool and it will create threads as needed. That may be too much of a change for this PR though, but it would help simplify the code by reducing the number of things that need to be kept in sync.
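To make the cached-pool idea concrete, here is a hedged sketch (Worker, reconcile, and the surrounding class are assumptions, not Accumulo code); the pool grows on demand and idle threads are reclaimed by the cached pool's keep-alive, so there is no explicit resize step:

```java
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: with a cached pool, "resizing" reduces to submitting new
// workers or flagging existing ones to stop.
class CachedPoolSketch {
  static class Worker implements Runnable {
    final AtomicBoolean stop = new AtomicBoolean(false);

    @Override
    public void run() {
      while (!stop.get()) {
        // work on transactions until asked to stop (sleep stands in for work)
        try {
          Thread.sleep(100);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }

  private final ExecutorService pool = Executors.newCachedThreadPool();
  private final Set<Worker> workers = ConcurrentHashMap.newKeySet();

  void reconcile(int configured) {
    // grow: the cached pool creates a thread per worker if none is idle
    while (workers.size() < configured) {
      Worker w = new Worker();
      workers.add(w);
      pool.execute(w);
    }
    // shrink: flag excess workers to finish; their pool threads then sit idle
    // and are reclaimed by the cached pool's 60-second keep-alive
    Iterator<Worker> it = workers.iterator();
    while (workers.size() > configured && it.hasNext()) {
      it.next().stop.set(true);
      it.remove();
    }
  }
}
```

Note that the pool itself no longer needs resizing in this shape, but the bookkeeping shifts to keeping the workers set accurate.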
Yes, good idea.

Re the cached thread pool: this may be a good solution (I'm not sure about the implications of it yet); however, it would require quite a bit of refactoring and wasn't the original intention of this PR. If that is a route we want to take, we could close this PR. This would also mean the #5130 changes would be different. I assume each set of fate ops may have its own cached thread pool. Or this may make 5130 completely obsolete: all fate ops would just share one cached thread pool.
Definitely want to keep this PR; using a cached thread pool would not solve the problems this PR is solving. Using the cached thread pool would be complementary to these changes and may make the overall code simpler, but I'm not sure. I was thinking we could rely only on the new runningTxRunners list added in this PR and not have to worry about resizing the thread pool if using a cached thread pool. Don't worry about using it in this PR.

The cached thread pool could simplify #5130 also, as you mentioned. I think the changes in this PR would still be built on in #5130 though.
Right, I was misinterpreting the impact this change would have. This probably wouldn't be much refactoring. It may be as simple as creating the cached pool instead of a fixed pool and deleting the ThreadPools.resizePool(pool, conf, Property.MANAGER_FATE_THREADPOOL_SIZE); call in the pool watcher task. In that case, it could be changed in this PR. I may be overlooking something, though.
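In rough shape (an assumption about how the code is organized, not the actual diff), that change might look like:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative shape of the change being discussed, not the real PR diff.
class FatePoolCreationSketch {
  ExecutorService createTransactionExecutor(int configured) {
    // before (roughly): a fixed pool sized by MANAGER_FATE_THREADPOOL_SIZE,
    // kept in sync by the watcher calling
    // ThreadPools.resizePool(pool, conf, Property.MANAGER_FATE_THREADPOOL_SIZE);
    // return Executors.newFixedThreadPool(configured);

    // after: a cached pool, so the resize call in the watcher task goes away
    return Executors.newCachedThreadPool();
  }
}
```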
The thing I was uncertain about w/ using a cached thread pool was how that would work w/ our utility code for creating thread pools. If large changes are required for the utility code, then I would not want to make that change in this PR. If not, then it would probably be good to make the change in this PR.
Will see about changing the thread pools util code to support this change.
I agree that is not a good way to go. This is a good find; we definitely want to be able to adjust the number of threads up and down.
- Implemented a better way to stop excess runners
- This allowed the code to be simplified: was able to decouple exited and stop, and no longer wait for a runner to stop

Co-authored-by: Keith Turner <kturner@apache.org>
New tests were calling a method which has since been moved in main.
- Removed 'exited', which is no longer used in TransactionRunner
- Changed runningTxRunners from ArrayList -> HashSet
Since this is approved, I'll merge this in. Regarding the cached thread pool, I'll make a note about changing this for #5130, or maybe as another small PR.
Adds some missing FATE functionality
Another potential solution to this problem is disallowing decreasing the pool size, but I'm not sure if that's the way we want to go about this.
Confirmation of, or changes to, the thread safety of the new code would be appreciated. I would also be curious if someone has a better idea of how we can go about stopping the transaction runners. Currently we just try to stop a random one each second (a rough sketch of this approach follows below).
This is another prereq for #5130.
This also addresses the test mentioned in #5171.
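For reference, a hedged sketch of the "stop a random one each second" approach described above (all names here are hypothetical; the real PR may schedule this differently):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: once per second, if more runners are live than
// configured, pick one at random and flag it to stop. Races with concurrent
// list modification are ignored here for brevity.
class RandomRunnerStopperSketch {
  interface Stoppable {
    void flagStop();
  }

  private final List<Stoppable> runningTxRunners = new CopyOnWriteArrayList<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void start(int configured) {
    scheduler.scheduleAtFixedRate(() -> {
      if (runningTxRunners.size() > configured) {
        int idx = ThreadLocalRandom.current().nextInt(runningTxRunners.size());
        runningTxRunners.remove(idx).flagStop();
      }
    }, 1, 1, TimeUnit.SECONDS);
  }
}
```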