Compress/Encrypt Blocks in background #1227
Reviewable status: 0 of 8 files reviewed, 2 unresolved discussions (waiting on @ashish-goswami and @manishrjain)
db2_test.go, line 678 at r1 (raw file):
db1.valueDirGuard = nil } require.NoError(t, db1.Close())
This is a small cleanup which is not related to this PR.
table/builder.go, line 359 at r1 (raw file):
start := uint32(0) for i, bl := range b.blockList { // Length of the block is start minues the end.
minues => minus.
Is this ready for review? @jarifibrahim
b905f10 to 10e0c2b
This commit adds support for compression/encryption in the background.

All incoming writes are saved to a common buffer, and worker goroutines then pick up each block (a block is just a start and end point in the main buffer). The worker goroutines also fix the table start and end points. Compressing a block can leave unused space between it and the next block; we fix this interleaving space in the builder.Finish() call, where all the blocks are copied over to their correct positions.

Master branch vs This commit (new) Benchmarks.

```
benchstat master.txt new.txt
name                          old time/op    new time/op    delta     (-ve is better. Reduction in time)
no_compression-16             177ms ± 1%     177ms ± 2%     ~         (p=0.690 n=5+5)
encryption-16                 313ms ± 4%     245ms ± 3%     -21.69%   (p=0.008 n=5+5)
zstd_compression/level_1-16   330ms ± 1%     227ms ± 2%     -31.22%   (p=0.008 n=5+5)
zstd_compression/level_3-16   345ms ± 2%     227ms ± 3%     -34.39%   (p=0.008 n=5+5)
zstd_compression/level_15-16  10.7s ± 1%     1.2s ± 0%      -88.96%   (p=0.008 n=5+5)

name                          old speed      new speed      delta     (+ve is better. Speed improvement)
no_compression-16             471MB/s ± 1%   471MB/s ± 2%   ~         (p=0.690 n=5+5)
encryption-16                 266MB/s ± 4%   340MB/s ± 2%   +27.67%   (p=0.008 n=5+5)
zstd_compression/level_1-16   252MB/s ± 1%   367MB/s ± 2%   +45.40%   (p=0.008 n=5+5)
zstd_compression/level_3-16   241MB/s ± 2%   367MB/s ± 3%   +52.44%   (p=0.008 n=5+5)
zstd_compression/level_15-16  7.80MB/s ± 1%  70.62MB/s ± 0% +805.33%  (p=0.008 n=5+5)
```

Please look at #1227 (comment) for the code used for benchmarking.

(cherry picked from commit b13b927)
This reverts commit b13b927.
This reverts commit b13b927. Conflicts: options.go table/builder.go
This reverts commit b13b927. This commit is being reverted because we have seen some crashes which could be caused by it. We haven't been able to reproduce the crashes yet. Related to #1389, #1388, #1387. Also, see https://discuss.dgraph.io/t/current-state-of-badger-crashes/7602. This commit had some conflicts. See PR description for details.
Also, disable conflict detection in badger to save memory.

```
0dfb8b4 Changelog for v20.07.0 (dgraph-io/badger#1411)
03ba278 Add missing changelog for v2.0.3 (dgraph-io/badger#1410)
6001230 Revert "Compress/Encrypt Blocks in the background (dgraph-io/badger#1227)" (dgraph-io/badger#1409)
800305e Revert "Buffer pool for decompression (dgraph-io/badger#1308)" (dgraph-io/badger#1408)
63d9309 Revert "fix: Fix race condition in block.incRef (dgraph-io/badger#1337)" (dgraph-io/badger#1407)
e0d058c Revert "add assert to check integer overflow for table size (dgraph-io/badger#1402)" (dgraph-io/badger#1406)
d981f47 return error if the vlog writes exceeds more that 4GB. (dgraph-io/badger#1400)
7f4e4b5 add assert to check integer overflow for table size (dgraph-io/badger#1402)
8e896a7 Add a contribution guide (dgraph-io/badger#1379)
b79aeef Avoid panic on multiple closer.Signal calls (dgraph-io/badger#1401)
717b89c Enable cross-compiled 32bit tests on TravisCI (dgraph-io/badger#1392)
09dfa66 Update ristretto to commit f66de99 (dgraph-io/badger#1391)
509de73 Update head while replaying value log (dgraph-io/badger#1372)
e013bfd Rework DB.DropPrefix (dgraph-io/badger#1381)
3042e37 pre allocate cache key for the block cache and the bloom filter cache (dgraph-io/badger#1371)
675efcd Increase default valueThreshold from 32B to 1KB (dgraph-io/badger#1346)
158d927 Remove second initialization of writech in Open (dgraph-io/badger#1382)
d37ce36 Tests: Use t.Parallel in TestIteratePrefix tests (dgraph-io/badger#1377)
3f4761d Force KeepL0InMemory to be true when InMemory is true (dgraph-io/badger#1375)
dd332b0 Avoid panic in filltables() (dgraph-io/badger#1365)
c45d966 Fix assert in background compression and encryption. (dgraph-io/badger#1366)
```
This commit brings the following new changes from badger. This commit also disables conflict detection in badger to save memory.

```
Fix assert in background compression and encryption. (dgraph-io/badger#1366)
Avoid panic in filltables() (dgraph-io/badger#1365)
Force KeepL0InMemory to be true when InMemory is true (dgraph-io/badger#1375)
Tests: Use t.Parallel in TestIteratePrefix tests (dgraph-io/badger#1377)
Remove second initialization of writech in Open (dgraph-io/badger#1382)
Increase default valueThreshold from 32B to 1KB (dgraph-io/badger#1346)
pre allocate cache key for the block cache and the bloom filter cache (dgraph-io/badger#1371)
Rework DB.DropPrefix (dgraph-io/badger#1381)
Update head while replaying value log (dgraph-io/badger#1372)
Update ristretto to commit f66de99 (dgraph-io/badger#1391)
Enable cross-compiled 32bit tests on TravisCI (dgraph-io/badger#1392)
Avoid panic on multiple closer.Signal calls (dgraph-io/badger#1401)
Add a contribution guide (dgraph-io/badger#1379)
add assert to check integer overflow for table size (dgraph-io/badger#1402)
return error if the vlog writes exceeds more that 4GB. (dgraph-io/badger#1400)
Revert "add assert to check integer overflow for table size (dgraph-io/badger#1402)" (dgraph-io/badger#1406)
Revert "fix: Fix race condition in block.incRef (dgraph-io/badger#1337)" (dgraph-io/badger#1407)
Revert "Buffer pool for decompression (dgraph-io/badger#1308)" (dgraph-io/badger#1408)
Revert "Compress/Encrypt Blocks in the background (dgraph-io/badger#1227)" (dgraph-io/badger#1409)
Add missing changelog for v2.0.3 (dgraph-io/badger#1410)
Changelog for v20.07.0 (dgraph-io/badger#1411)
```
* Update badger to v20.07.0-rc1

This commit brings the following new changes from badger. This commit also disables conflict detection in badger to save memory.

```
Fix assert in background compression and encryption. (dgraph-io/badger#1366)
Avoid panic in filltables() (dgraph-io/badger#1365)
Force KeepL0InMemory to be true when InMemory is true (dgraph-io/badger#1375)
Tests: Use t.Parallel in TestIteratePrefix tests (dgraph-io/badger#1377)
Remove second initialization of writech in Open (dgraph-io/badger#1382)
Increase default valueThreshold from 32B to 1KB (dgraph-io/badger#1346)
pre allocate cache key for the block cache and the bloom filter cache (dgraph-io/badger#1371)
Rework DB.DropPrefix (dgraph-io/badger#1381)
Update head while replaying value log (dgraph-io/badger#1372)
Update ristretto to commit f66de99 (dgraph-io/badger#1391)
Enable cross-compiled 32bit tests on TravisCI (dgraph-io/badger#1392)
Avoid panic on multiple closer.Signal calls (dgraph-io/badger#1401)
Add a contribution guide (dgraph-io/badger#1379)
add assert to check integer overflow for table size (dgraph-io/badger#1402)
return error if the vlog writes exceeds more that 4GB. (dgraph-io/badger#1400)
Revert "add assert to check integer overflow for table size (dgraph-io/badger#1402)" (dgraph-io/badger#1406)
Revert "fix: Fix race condition in block.incRef (dgraph-io/badger#1337)" (dgraph-io/badger#1407)
Revert "Buffer pool for decompression (dgraph-io/badger#1308)" (dgraph-io/badger#1408)
Revert "Compress/Encrypt Blocks in the background (dgraph-io/badger#1227)" (dgraph-io/badger#1409)
Add missing changelog for v2.0.3 (dgraph-io/badger#1410)
Changelog for v20.07.0 (dgraph-io/badger#1411)
```

* Remove unnecessary detect conflicts
* Update badger to v20.07.0-rc1

This commit brings the following new changes from badger. This commit also disables conflict detection in badger to save memory.

```
Fix assert in background compression and encryption. (dgraph-io/badger#1366)
Avoid panic in filltables() (dgraph-io/badger#1365)
Force KeepL0InMemory to be true when InMemory is true (dgraph-io/badger#1375)
Tests: Use t.Parallel in TestIteratePrefix tests (dgraph-io/badger#1377)
Remove second initialization of writech in Open (dgraph-io/badger#1382)
Increase default valueThreshold from 32B to 1KB (dgraph-io/badger#1346)
pre allocate cache key for the block cache and the bloom filter cache (dgraph-io/badger#1371)
Rework DB.DropPrefix (dgraph-io/badger#1381)
Update head while replaying value log (dgraph-io/badger#1372)
Update ristretto to commit f66de99 (dgraph-io/badger#1391)
Enable cross-compiled 32bit tests on TravisCI (dgraph-io/badger#1392)
Avoid panic on multiple closer.Signal calls (dgraph-io/badger#1401)
Add a contribution guide (dgraph-io/badger#1379)
add assert to check integer overflow for table size (dgraph-io/badger#1402)
return error if the vlog writes exceeds more that 4GB. (dgraph-io/badger#1400)
Revert "add assert to check integer overflow for table size (dgraph-io/badger#1402)" (dgraph-io/badger#1406)
Revert "fix: Fix race condition in block.incRef (dgraph-io/badger#1337)" (dgraph-io/badger#1407)
Revert "Buffer pool for decompression (dgraph-io/badger#1308)" (dgraph-io/badger#1408)
Revert "Compress/Encrypt Blocks in the background (dgraph-io/badger#1227)" (dgraph-io/badger#1409)
Add missing changelog for v2.0.3 (dgraph-io/badger#1410)
Changelog for v20.07.0 (dgraph-io/badger#1411)
```

* Remove unnecessary detect conflicts

Co-authored-by: parasssh <paras@dgraph.io>
* Update badger to v20.0.7-rc1

This commit brings the following new changes from badger. This commit also disables conflict detection in badger to save memory.

```
Fix assert in background compression and encryption. (dgraph-io/badger#1366)
Avoid panic in filltables() (dgraph-io/badger#1365)
Force KeepL0InMemory to be true when InMemory is true (dgraph-io/badger#1375)
Tests: Use t.Parallel in TestIteratePrefix tests (dgraph-io/badger#1377)
Remove second initialization of writech in Open (dgraph-io/badger#1382)
Increase default valueThreshold from 32B to 1KB (dgraph-io/badger#1346)
pre allocate cache key for the block cache and the bloom filter cache (dgraph-io/badger#1371)
Rework DB.DropPrefix (dgraph-io/badger#1381)
Update head while replaying value log (dgraph-io/badger#1372)
Update ristretto to commit f66de99 (dgraph-io/badger#1391)
Enable cross-compiled 32bit tests on TravisCI (dgraph-io/badger#1392)
Avoid panic on multiple closer.Signal calls (dgraph-io/badger#1401)
Add a contribution guide (dgraph-io/badger#1379)
add assert to check integer overflow for table size (dgraph-io/badger#1402)
return error if the vlog writes exceeds more that 4GB. (dgraph-io/badger#1400)
Revert "add assert to check integer overflow for table size (dgraph-io/badger#1402)" (dgraph-io/badger#1406)
Revert "fix: Fix race condition in block.incRef (dgraph-io/badger#1337)" (dgraph-io/badger#1407)
Revert "Buffer pool for decompression (dgraph-io/badger#1308)" (dgraph-io/badger#1408)
Revert "Compress/Encrypt Blocks in the background (dgraph-io/badger#1227)" (dgraph-io/badger#1409)
Add missing changelog for v2.0.3 (dgraph-io/badger#1410)
Changelog for v20.07.0 (dgraph-io/badger#1411)
```

* Remove unnecessary detect conflicts

Co-authored-by: parasssh <paras@dgraph.io>
This PR adds support for block compression/encryption in the background.
The following code was used for benchmarking (builder.go BenchmarkBuilder)
Master branch vs This Branch (new) Benchmarks.
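As a reading aid for the delta column in the benchstat output quoted in the commit message, the relative change can be computed as below. This is only a sketch: `percentDelta` is a hypothetical helper, and because it starts from the rounded values printed in the table rather than the raw samples benchstat used, its results differ slightly from the reported deltas.

```go
package main

import "fmt"

// percentDelta returns the relative change from old to updated, in percent.
// A negative value for time/op means the new code is faster; a positive
// value for speed (MB/s) means higher throughput.
func percentDelta(old, updated float64) float64 {
	return (updated - old) / old * 100
}

func main() {
	// Rounded values taken from the benchstat table in the commit message.
	fmt.Printf("encryption time/op delta:     %+.2f%%\n", percentDelta(313, 245))
	fmt.Printf("zstd level 15 speed delta:    %+.2f%%\n", percentDelta(7.80, 70.62))
}
```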