Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: badger: add a has check before writing to reduce duplicates #10680

Merged
merged 1 commit into from
Apr 26, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions blockstore/badger/blockstore.go
Original file line number Diff line number Diff line change
Expand Up @@ -732,6 +732,20 @@ func (b *Blockstore) Put(ctx context.Context, block blocks.Block) error {
}

put := func(db *badger.DB) error {
// Check if we have it before writing it.
switch err := db.View(func(txn *badger.Txn) error {
_, err := txn.Get(k)
return err
}); err {
case badger.ErrKeyNotFound:
case nil:
// Already exists, skip the put.
return nil
default:
return err
}

// Then write it.
Comment on lines +735 to +748
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be relatively more efficient to do this check inside db.Update below and discarding the transaction if the key exists, instead of starting a new transaction. Although in practice it probably doesn't matter much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that but that forces a read/write dependency (a transaction). That does guarantee no overwrites, but I was concerned about performance (IIRC, transactions in badger had some scaling issues).

err := db.Update(func(txn *badger.Txn) error {
return txn.Set(k, block.RawData())
})
Expand Down Expand Up @@ -787,12 +801,33 @@ func (b *Blockstore) PutMany(ctx context.Context, blocks []blocks.Block) error {
keys = append(keys, k)
}

err := b.db.View(func(txn *badger.Txn) error {
for i, k := range keys {
switch _, err := txn.Get(k); err {
case badger.ErrKeyNotFound:
case nil:
keys[i] = nil
default:
// Something is actually wrong
return err
}
}
return nil
})
if err != nil {
return err
}

put := func(db *badger.DB) error {
batch := db.NewWriteBatch()
defer batch.Cancel()

for i, block := range blocks {
k := keys[i]
if k == nil {
// skipped because we already have it.
continue
}
if err := batch.Set(k, block.RawData()); err != nil {
return err
}
Expand Down