
Support table compaction #930

Closed
13 of 15 tasks
v0y4g3r opened this issue Feb 2, 2023 · 5 comments
Labels
C-enhancement Category Enhancements · tracking-issue A tracking issue for a feature.
v0y4g3r (Contributor) commented Feb 2, 2023

What type of enhancement is this?

Performance

What does the enhancement do?

So far GreptimeDB supports flushing rows in the memtable to SST files in level 0. But level-0 SST files are not sorted in time bucket order, so retrieving rows in a given time range requires scanning all SST files in level 0. Like other LSM-tree-based storage engines, we need to compact SST files across levels to:

  • merge insert/delete records with the same primary key;
  • sort all rows in timestamp order and distribute them evenly across SST files in level 1, so that no two level-1 SST files contain intersecting time ranges;
  • delete expired SST files according to the TTL;
  • (in the future) introduce compression and indexing tasks while compacting SST files.
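The merge/dedup/sort goals above can be sketched as a single pass over level-0 rows. This is a hypothetical, simplified model: the `Row`/`Op` types and `merge_l0` are illustrative, not GreptimeDB's actual structures, and real compaction streams rows through readers rather than materializing a map:

```rust
#[derive(Clone, Debug, PartialEq)]
enum Op {
    Put(i64), // value payload
    Delete,
}

#[derive(Clone, Debug)]
struct Row {
    key: String, // primary key
    ts: i64,     // timestamp
    seq: u64,    // sequence number; higher means newer
    op: Op,
}

/// Merge rows from several level-0 files, resolve duplicates with the same
/// (primary key, timestamp) by sequence number, drop deletes, and sort the
/// survivors by timestamp so level-1 files can hold disjoint time ranges.
fn merge_l0(files: Vec<Vec<Row>>) -> Vec<Row> {
    use std::collections::HashMap;
    let mut latest: HashMap<(String, i64), Row> = HashMap::new();
    for row in files.into_iter().flatten() {
        let k = (row.key.clone(), row.ts);
        match latest.get(&k) {
            Some(cur) if cur.seq >= row.seq => {} // keep the newer record
            _ => {
                latest.insert(k, row);
            }
        }
    }
    let mut out: Vec<Row> = latest
        .into_values()
        .filter(|r| !matches!(r.op, Op::Delete)) // a newer delete cancels older puts
        .collect();
    out.sort_by(|a, b| (a.ts, a.key.clone()).cmp(&(b.ts, b.key.clone())));
    out
}
```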


The RFC can be found at #939.

Implementation challenges

We need to implement the following components:

  • compaction scheduler (along with task throttling to limit the performance impact on foreground queries): feat: compaction scheduler and rate limiter #947;
  • compaction strategy: we can begin with a file-count-based strategy that triggers compaction as soon as the number of SST files in level 0 exceeds some threshold: feat: L0 to L1 compaction strategy #964;
    • time bucket calculation: read the time ranges of level-0 SST files and calculate a proper time bucket for level-1 SST files so that data is evenly distributed across them;
    • track references to SST files and only delete an SST once it has been marked "deleted" and its reference count drops to 0;
  • table compaction task: use a merge reader along with a dedup reader to read rows from all SST files in level 0: feat: compaction reader and writer #972;
  • integrate the compaction components into the datanode instance: feat: compaction integration #997.
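As a rough illustration of the file-count trigger and the time bucket calculation, a minimal sketch (the threshold constants and function names are assumptions for illustration, not GreptimeDB's actual API):

```rust
const L0_FILE_THRESHOLD: usize = 4; // trigger when level 0 holds this many files (assumed value)
const TARGET_L1_FILES: i64 = 4;     // aim for this many level-1 buckets (assumed value)

/// (min_ts, max_ts) of one level-0 SST file, inclusive.
type TimeRange = (i64, i64);

/// File-count-based strategy: compact as soon as level 0 is "full".
fn should_compact(l0_files: &[TimeRange]) -> bool {
    l0_files.len() >= L0_FILE_THRESHOLD
}

/// Choose a bucket width that covers the union of the level-0 time ranges,
/// split into TARGET_L1_FILES non-overlapping buckets so data spreads evenly.
fn bucket_width(l0_files: &[TimeRange]) -> Option<i64> {
    let min = l0_files.iter().map(|r| r.0).min()?;
    let max = l0_files.iter().map(|r| r.1).max()?;
    let span = max - min + 1;
    // Round up so TARGET_L1_FILES buckets always cover the whole span.
    Some((span + TARGET_L1_FILES - 1) / TARGET_L1_FILES)
}
```

A real strategy would also consider file sizes and row counts, but the shape of the decision (inspect level-0 metadata, derive a bucket width) is the same.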

Future work

@v0y4g3r v0y4g3r added the C-enhancement Category Enhancements label Feb 2, 2023
@v0y4g3r v0y4g3r added this to the Release v0.1 milestone Feb 2, 2023
@v0y4g3r v0y4g3r self-assigned this Feb 2, 2023
killme2008 (Contributor) commented:

We may break the snapshot read semantics when implementing compaction: with the same sequence number for a snapshot read, a second read may find that some rows have been deleted by compaction.

Do we have any solutions to fix this issue? Looks like it's impossible.

waynexia (Member) commented Feb 2, 2023

> Do we have any solutions to fix this issue? Looks like it's impossible.

Maybe we need to introduce some mechanism like MVCC for it, but I don't think that's necessary.

v0y4g3r (Contributor, author) commented Feb 2, 2023

> we may break the snapshot read semantic when implementing compaction, since with the same sequence for snapshot read, the second read may find some rows are deleted because of compaction.
>
> Do we have any solutions to fix this issue? Looks like it's impossible.

We need to keep track of all SnapshotImpls that currently refer to level-0 SST files, and postpone deleting compacted level-0 SST files until all created snapshot reads have finished.

evenyag (Contributor) commented Feb 2, 2023

> we may break the snapshot read semantic when implementing compaction, since with the same sequence for snapshot read, the second read may find some rows are deleted because of compaction.

I think this might not be a problem. Currently we only support reading the latest data, in which case we acquire a reference to a stable Version struct that references the old SSTs. But we do need to make the file metadata reference counted and delete a file only after no "read" references it.

> Do we have any solutions to fix this issue?

Introduce snapshots like other databases do, if we need a stable snapshot but also want to release unused SSTs.
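One way to express the reference-counting idea from the comments above is to tie physical file deletion to the drop of the last handle. A minimal sketch, assuming an Arc-based Version/snapshot model; `FileMeta`, `open_read`, and the purge behavior are illustrative names, not GreptimeDB's actual types:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

struct FileMeta {
    name: String,
    deleted: AtomicBool, // set by compaction when the file becomes obsolete
}

impl FileMeta {
    fn mark_deleted(&self) {
        self.deleted.store(true, Ordering::Release);
    }
}

impl Drop for FileMeta {
    // Runs when the last Arc<FileMeta> (held by a Version or a reader) is
    // dropped; only then is it safe to remove the underlying file.
    fn drop(&mut self) {
        if self.deleted.load(Ordering::Acquire) {
            // A real engine would call std::fs::remove_file here.
            println!("purging {}", self.name);
        }
    }
}

/// Each snapshot read clones the handle, bumping the reference count.
fn open_read(meta: &Arc<FileMeta>) -> Arc<FileMeta> {
    Arc::clone(meta)
}
```

With this shape, compaction can mark a level-0 file "deleted" at any time; in-flight snapshot reads keep it alive until they finish, matching the postponed-deletion behavior discussed above.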

@evenyag evenyag added the tracking-issue A tracking issue for a feature. label Feb 14, 2023
@v0y4g3r v0y4g3r modified the milestones: v0.3, v0.1 Feb 15, 2023
@v0y4g3r v0y4g3r mentioned this issue Feb 17, 2023
killme2008 (Contributor) commented:

I think we can close this issue and open a new one when we want to support other compaction features.
