Enable compaction by having the Zed service run "manage" every 5 minutes #3006

Merged (1 commit) on Feb 15, 2024

@philrz philrz commented Feb 13, 2024

Now that brimdata/super#5017 has merged, it's easy to have Zui use that approach to run compaction periodically and improve performance.

There are a few fancier ideas we could pursue that I'm not currently advocating:

  1. As noted in Run "zed manage" behind Zui #2847, we could add a user-configurable Setting that controls whether manage runs at all. That would let us ship it disabled as an opt-in feature at first, or ship it enabled while giving users an easy way to turn it off. However, since compaction gives the best lake performance, we'll generally want all users running it continuously. If a bug makes users want to opt out while we work on a fix, they could always reinstall the prior Zui release and set updates to Manual until the fix is ready.

  2. We could add a Setting to let the user change how often compaction runs. Since Zui users currently get no benefit from compaction at all, it seems defensible to start with a static value, like the once-every-5-minutes I'm proposing here, and add such a knob only in reaction to user demand.

  3. We could add a button somewhere to allow users to perform compaction immediately. This would help, for example, if they add a bunch of small commits close together and don't want to wait. However, 5 minutes doesn't seem like that long to wait. Once again, we could wait to see if any users notice and speak up.
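For anyone picturing the fixed-interval behavior discussed above, here is a minimal sketch in Python. It is not the code in this PR or in the Zed service, which does the real scheduling. The names `run_periodically` and `task` are hypothetical, and whether the real service waits before its first pass or counts the interval from the start or end of a pass is an assumption here.

```python
import time

# The static once-every-5-minutes value proposed in this PR.
MANAGE_INTERVAL_SECONDS = 5 * 60


def run_periodically(task, interval=MANAGE_INTERVAL_SECONDS,
                     sleep=time.sleep, max_runs=None):
    """Call task() once per interval (hypothetical helper, illustration only).

    Assumption: the loop waits one full interval before each pass, so the
    interval is counted from the end of the previous pass.
    `sleep` is injectable so the loop can be exercised without waiting.
    `max_runs=None` means run forever, as a background service would.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        sleep(interval)
        task()  # stand-in for one compaction pass over the pools
        runs += 1
    return runs
```

A user-tunable interval (idea 2) would amount to passing a different `interval`, and a "compact now" button (idea 3) would amount to calling `task()` directly outside the loop.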

In terms of testing, I started Zui with this branch, loaded the 26 files in the Zed sample data each as an individual commit, ran the zed query -Z "from zeek@main:objects | count()" meta-query, and confirmed the 26 objects. I then hung around and let compaction do its thing at the configured interval. Re-running the meta-query showed the expected 1 object. I waited a while and then repeated this. The relevant zlake.log entries:

$ grep -i compact zlake.log 
{"level":"info","ts":1707859133.008975,"logger":"http.access","msg":"Request completed","request_id":"2cKTB5C56jVii5CbVafsYtQq8Hr","host":"[::]:9867","method":"POST","proto":"HTTP/1.1","remote_addr":"[::1]:59603","request_content_length":-1,"url":"/pool/2cKSbE3vEZWA5SMQCkBBXtbFaMe/branch/main/compact","elapsed":1.630072071,"response_content_length":86,"status_code":200}
{"level":"info","ts":1707859133.0094552,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":1,"objects_compacted":26,"vectors_created":0}
{"level":"info","ts":1707859433.026421,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707859733.042807,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707860033.0566049,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707860333.072202,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707860633.08938,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707860933.12558,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
{"level":"info","ts":1707861236.380692,"logger":"http.access","msg":"Request completed","request_id":"2cKXRJfnBvzi0bvAOMmddoKe63r","host":"[::]:9867","method":"POST","proto":"HTTP/1.1","remote_addr":"[::1]:59996","request_content_length":-1,"url":"/pool/2cKSbE3vEZWA5SMQCkBBXtbFaMe/branch/main/compact","elapsed":3.248688225,"response_content_length":86,"status_code":200}
{"level":"info","ts":1707861236.3817961,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":1,"objects_compacted":27,"vectors_created":0}
{"level":"info","ts":1707861536.4011168,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}
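To check the interval without eyeballing timestamps, the excerpt above can be parsed with a few lines of Python. This is a rough sketch, not part of the PR. The helper names `compaction_runs` and `gaps_seconds` are made up for illustration, and the sample lines are copied verbatim from the first three entries of the grep output.

```python
import json


def compaction_runs(lines):
    """Return (ts, objects_compacted) for each "compaction completed" entry."""
    runs = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts, blank lines, other non-JSON text
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if (entry.get("logger") == "manage.pool"
                and entry.get("msg") == "compaction completed"):
            runs.append((entry["ts"], entry["objects_compacted"]))
    return runs


def gaps_seconds(runs):
    """Seconds between consecutive compaction entries, rounded to 0.1s."""
    return [round(later[0] - earlier[0], 1)
            for earlier, later in zip(runs, runs[1:])]


# First three lines of the grep output above, copied verbatim.
SAMPLE = [
    '{"level":"info","ts":1707859133.008975,"logger":"http.access","msg":"Request completed","request_id":"2cKTB5C56jVii5CbVafsYtQq8Hr","host":"[::]:9867","method":"POST","proto":"HTTP/1.1","remote_addr":"[::1]:59603","request_content_length":-1,"url":"/pool/2cKSbE3vEZWA5SMQCkBBXtbFaMe/branch/main/compact","elapsed":1.630072071,"response_content_length":86,"status_code":200}',
    '{"level":"info","ts":1707859133.0094552,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":1,"objects_compacted":26,"vectors_created":0}',
    '{"level":"info","ts":1707859433.026421,"logger":"manage.pool","msg":"compaction completed","name":"zeek","id":"2cKSbE3vEZWA5SMQCkBBXtbFaMe","branch":"main","vectors":false,"runs_found":0,"objects_compacted":0,"vectors_created":0}',
]

if __name__ == "__main__":
    runs = compaction_runs(SAMPLE)
    print(runs)
    print(gaps_seconds(runs))
```

On the sample, the http.access line is filtered out and the two manage.pool entries (26 objects compacted, then 0) are about 300 seconds apart, which matches the 5-minute setting. Feeding the whole file through `compaction_runs(open("zlake.log"))` should work the same way, assuming the log is one JSON object per line as shown.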

Closes #2847

@philrz philrz self-assigned this Feb 13, 2024
@philrz philrz requested review from jameskerr, mattnibs and nwt February 13, 2024 22:06
@jameskerr jameskerr left a comment

Right on

@philrz philrz merged commit d386884 into main Feb 15, 2024
3 checks passed
@philrz philrz deleted the auto-zed-manage branch February 15, 2024 19:50