Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(memtable): refactor code for memtable flush #1866

Merged
merged 1 commit into from
Mar 3, 2023

Conversation

mangalaman93
Copy link
Member

@mangalaman93 mangalaman93 commented Feb 14, 2023

Removes the flushTask struct that was used in two different context before. Once place when we were sending the memtable for flushing it onto the disk after it becomes full, and other context is when we were sending the memtable to be written onto disk during flush.

@coveralls
Copy link

coveralls commented Feb 14, 2023

Coverage Status

Coverage: 41.75% (+0.5%) from 41.229% when pulling 72a81ff on aman/memtable-flush into 4a3b224 on release/v4.0.

@mangalaman93 mangalaman93 changed the base branch from release/v4.0 to aman/linter February 14, 2023 14:28
@mangalaman93 mangalaman93 force-pushed the aman/linter branch 2 times, most recently from 4b5225f to 839781e Compare February 15, 2023 13:54
Base automatically changed from aman/linter to release/v4.0 February 17, 2023 13:03
@mangalaman93
Copy link
Member Author

These are the places where we are sending the memtable to the flushchan channel:

  1. when we open the db and find existing memtables https://github.com/dgraph-io/badger/blob/release/v4.0/db.go#L358
  2. when we close the db https://github.com/dgraph-io/badger/blob/origin/v4.0/db.go#L576
  3. when it is full https://github.com/dgraph-io/badger/blob/release/v4.0/db.go#L986

The only place this code will be useful when we open and close the DB. If the DB is closed and opened a few times in short duration, it may end up collecting a few small sized memtables and will require compaction in L0 -> L1 (or higher) unnecessarily. This will increase write amplification unnecessarily. I think this code only helps us, doesn't harm us in any way.

@mangalaman93 mangalaman93 force-pushed the aman/memtable-flush branch 5 times, most recently from 7e82b6c to b0dbb49 Compare February 21, 2023 06:14
@mangalaman93 mangalaman93 changed the title combine memtables before flushing to L0 chore(memtable): refactor logic for memtable flush Feb 21, 2023
@mangalaman93 mangalaman93 force-pushed the aman/memtable-flush branch 3 times, most recently from 464b124 to ff60a72 Compare February 21, 2023 06:29
harshil-goel
harshil-goel previously approved these changes Feb 22, 2023
@mangalaman93 mangalaman93 changed the base branch from main-deprecated-v4 to main March 1, 2023 07:13
@mangalaman93 mangalaman93 force-pushed the aman/memtable-flush branch from 72a81ff to 248292f Compare March 1, 2023 07:15
Copy link
Contributor

@joshua-goldstein joshua-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • CI is failing on TestPenultimateMemCorruption (link)
  • Please add a description to the PR explaining why this change is useful / necessary (I see your comment makes some of this clear, but it should also be in the description)
  • This change was based on changes that were in another PR in Badger, this should be referenced
  • We should benchmark the performance benefits of this change. Currently our assumption is this won't do any harm and may have some benefits when opening/closing the DB, but we should be able to quantify this benefit. Adding a benchmark would also be useful for future changes.

@joshua-goldstein
Copy link
Contributor

Looks like it is based off #1696 - is this change necessary to bring in the handover skiplist feature?

@mangalaman93
Copy link
Member Author

mangalaman93 commented Mar 1, 2023

  • CI is failing on TestPenultimateMemCorruption (link)

=> looking at the failure

* Please add a description to the PR explaining why this change is useful / necessary (I see your comment makes some of this clear, but it should also be in the description)

=> updated the description

* This change was based on changes that were in another PR in Badger, this should be referenced

=> It is not anymore. Earlier it was using some changes from another commit but I realized that we don't want those changes either because it was doubling the size of L0 sstables to twice of the expected size.

* We should benchmark the performance benefits of this change. Currently our assumption is this won't do any harm and may have some benefits when opening/closing the DB, but we should be able to quantify this benefit.  Adding a benchmark would also be useful for future changes.

=> We are only making structural changes to the code, the logic is the same as it was before. I don't expect the performance to change at all.

@mangalaman93 mangalaman93 changed the title chore(memtable): refactor logic for memtable flush chore(memtable): remove flushtask struct Mar 1, 2023
@mangalaman93
Copy link
Member Author

This change is not based off of #1696 any more. That is how the change started but I have removed the code to club together memtables.

@mangalaman93 mangalaman93 force-pushed the aman/memtable-flush branch from 248292f to 9ea992a Compare March 2, 2023 07:27
@mangalaman93 mangalaman93 changed the title chore(memtable): remove flushtask struct chore(memtable): refactor code for memtable flush Mar 2, 2023
@mangalaman93 mangalaman93 merged commit 94d6168 into main Mar 3, 2023
@mangalaman93 mangalaman93 deleted the aman/memtable-flush branch March 3, 2023 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants