Change defaults on bulk import from unlimited. #5138

Closed
keith-turner opened this issue Dec 4, 2024 · 1 comment · Fixed by #5149
Labels
enhancement This issue describes a new feature, improvement, or optimization.
Comments

@keith-turner
Contributor

keith-turner commented Dec 4, 2024

Is your feature request related to a problem? Please describe.

Bulk imports that continually add files to a tablet that already has a lot of files will eventually cause multiple problems.

Describe the solution you'd like

#5104 adds limits that can prevent this situation, but they default to unlimited. The existing property table.bulk.max.tablets, introduced in 2.1.0, also defaults to unlimited. After #5104 is merged, we could change the defaults of the new properties to a finite limit and give table.bulk.max.tablets a default limit as well. All of these properties could start with a limit of 100; a user who hits a limit can adjust it as needed. This would make the system's default behavior more sustainable.
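
To illustrate the proposed behavior, here is a minimal sketch of a limit check whose default moves from unlimited to 100. It is a toy model, not Accumulo code: the function, exception, and parameter names are hypothetical, and it assumes that a value of 0 means "unlimited" and that the limit caps how many tablets a single bulk import file may map to.

```python
# Illustrative model only; not Accumulo code. All names are hypothetical.
UNLIMITED = 0  # assumption: 0 stands for "no limit"


class BulkLimitExceeded(Exception):
    """Raised when a bulk import file would span more tablets than allowed."""


def check_bulk_import(tablets_per_file, max_tablets=100):
    """Reject the import if any file maps to more than max_tablets tablets.

    tablets_per_file maps a file name to the number of tablets it overlaps.
    """
    if max_tablets == UNLIMITED:
        return  # the old default: nothing is ever rejected
    for name, count in tablets_per_file.items():
        if count > max_tablets:
            raise BulkLimitExceeded(
                f"{name} overlaps {count} tablets, limit is {max_tablets}"
            )
```

With the default of 100, an import whose file overlaps 101 tablets is rejected; a user who needs that import can call the check with a larger `max_tablets`, which mirrors raising the property on the table.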


keith-turner added the enhancement label Dec 4, 2024
keith-turner added this to the 4.0.0 milestone Dec 4, 2024
@keith-turner
Contributor Author

keith-turner commented Dec 4, 2024

#5117 (comment) is an example of one problem that can happen with too many files in a tablet. Most Accumulo code assumes that all of a tablet's metadata can be read into memory. When a tablet has 25K files and many threads read that tablet's metadata into memory, it can cause a lot of memory pressure. Too many files can also cause problems for merge, split, scan, and compaction.
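
A rough back-of-the-envelope sketch of why this adds up. Only the 25K file count comes from the comment above; the per-file entry size and the thread count are made-up assumptions for illustration, not measurements from Accumulo.

```python
def metadata_bytes_in_memory(files_per_tablet, bytes_per_file_entry, reader_threads):
    """Estimate total bytes held when every thread keeps its own copy
    of one tablet's file metadata in memory."""
    return files_per_tablet * bytes_per_file_entry * reader_threads


# 25K files is from the issue; 200 bytes per entry and 64 threads are assumed.
estimate = metadata_bytes_in_memory(25_000, 200, 64)
print(f"{estimate / 1024**2:.0f} MiB held for a single tablet's metadata")
```

The estimate scales linearly in each factor, so capping the number of files per tablet bounds the total regardless of how many threads read the metadata concurrently.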

keith-turner added a commit to keith-turner/accumulo that referenced this issue Dec 7, 2024
ddanielr pushed a commit that referenced this issue Dec 9, 2024
* sets default limits on bulk imports files

fixes #5138

* fix property description