Enable compaction by having the Zed service run "manage" every 5 minutes #3006
Now that brimdata/super#5017 has merged, it's easy to have Zui start leveraging the approach to periodically run compaction to improve performance.
There are a few fancier ideas we could pursue that I'm not currently advocating.
As noted in Run "zed manage" behind Zui #2847, we could add a user-configurable Setting to enable/disable whether "manage" is run at all, e.g., if we wanted to start with it disabled and have this be an "opt in" feature at first, or if we wanted to start with it enabled but give users an easy way to turn it off. However, since compaction allows for best-case lake performance, we'll generally want all users to be running it continuously. In the event a bug makes users want to opt out while we work on a fix, they could always reinstall the prior Zui release and set updates to Manual to hold off until the fix is ready.
We could add a Setting to allow the user to change how often compaction runs. Since Zui users currently get no benefit from compaction at all, it seems defensible to start with a static value like the once-every-5-minutes I'm proposing here and add such a knob in reaction to user demand.
We could add a button somewhere to allow users to perform compaction immediately. This would help, for example, if they add a bunch of small commits close together and don't want to wait. However, 5 minutes doesn't seem like that long to wait. Once again, we could wait to see if any users notice and speak up.
In terms of testing, I started Zui with this branch, loaded the 26 files in the Zed sample data each as individual commits, and ran this meta-query:
zed query -Z "from zeek@main:objects | count()"
It confirmed the 26 objects. I then hung around and let compaction do its thing at the configured interval. Re-running the meta-query showed the expected 1 object. I waited a while and then repeated this. The zlake.log:
Closes #2847
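For anyone who would rather automate the manual check from the testing notes than hang around, here is a rough TypeScript sketch of the same idea: poll the object count until it drops to the expected value or a timeout passes. Everything in it is hypothetical and not part of this branch or of Zui/Zed: the names `waitForObjectCount`, `CountFn`, and `WaitOptions` are made up for illustration, and `getCount` is left as a callback that a caller would have to implement, for example by wrapping the meta-query shown above.

```typescript
// Hypothetical helper for illustration only; none of these names exist in Zui or Zed.

// Returns the current number of data objects in the pool.
// How it gets that number is up to the caller (e.g. by running the meta-query).
type CountFn = () => Promise<number>;

interface WaitOptions {
  expected: number; // object count that indicates compaction has run (1 in the test above)
  pollMs: number; // delay between checks
  timeoutMs: number; // give up once this much delay has accumulated
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Polls getCount until the count is at or below opts.expected.
// Resolves with every count observed, in order; rejects on timeout.
// Only time spent sleeping is counted toward the timeout, so it is approximate.
async function waitForObjectCount(
  getCount: CountFn,
  opts: WaitOptions
): Promise<number[]> {
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const seen: number[] = [];
  let waited = 0;
  while (true) {
    const n = await getCount();
    seen.push(n);
    if (n <= opts.expected) return seen;
    if (waited >= opts.timeoutMs) {
      throw new Error(
        `object count still ${n} after ${waited} ms (expected ${opts.expected})`
      );
    }
    await sleep(opts.pollMs);
    waited += opts.pollMs;
  }
}
```

With a 5 minute compaction interval, a caller might pick a `timeoutMs` of a bit more than one interval, so that a single missed run is reported as a failure rather than waited on indefinitely.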