Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock when updating keys while iterating over them #315

Closed
jogramming opened this issue Nov 10, 2017 · 7 comments
Closed

Deadlock when updating keys while iterating over them #315

jogramming opened this issue Nov 10, 2017 · 7 comments
Assignees
Labels
area/documentation Documentation related issues.

Comments

@jogramming
Copy link

jogramming commented Nov 10, 2017

I don't see it stated anywhere that updating values while iterating isn't supported, so i'm assuming i'm not doing anything wrong?

Using the latest git version as of typing.

Running the following for a while (usually 10 seconds for me) will cause a deadlock (when the counter stays at 0 every second):

package main

import (
	"fmt"
	"github.com/dgraph-io/badger"
	"math/rand"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

var (
	DB          *badger.DB
	updateCount = new(int64)
)

func panicErr(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	opts := badger.DefaultOptions
	opts.Dir = "db"
	opts.ValueDir = "db"
	opts.SyncWrites = false

	var err error
	DB, err = badger.Open(opts)
	panicErr(err)

	Load(0, 0xffff)
	go ReaderWriter()

	sc := make(chan os.Signal, 1)
	signal.Notify(sc, syscall.SIGINT, syscall.SIGTERM, os.Interrupt, os.Kill)

	ticker := time.NewTicker(time.Second)
	for {
		select {
		case <-sc:
			DB.Close()
			return
		case <-ticker.C:
			numRead := atomic.SwapInt64(updateCount, 0)
			fmt.Println("Read: ", numRead)
		}
	}
}

const vauleSize = 0xff

func Load(startOffset int64, amount int64) {

	d := make([]byte, vauleSize)

	for i := int64(0); i < amount; i++ {
		panicErr(DB.Update(func(txn *badger.Txn) error {
			rand.Read(d)
			return txn.Set([]byte(fmt.Sprint(i+startOffset)), d)
		}))
	}
}

func ReaderWriter() {
	for {
		// Iterate over the db reading and upadting values
		err := DB.Update(func(txn *badger.Txn) error {

			opts := badger.DefaultIteratorOptions
			opts.PrefetchValues = false
			it := txn.NewIterator(opts)
			defer it.Close()
			i := 0
			for it.Rewind(); it.Valid(); it.Next() {

				i++
				if i >= 100000 {
					break
				}

				atomic.AddInt64(updateCount, 1)

				item := it.Item()
				key := item.Key()

				newValue := make([]byte, vauleSize)
				rand.Read(newValue)
				item.Value()
				err := txn.Set(key, newValue)
				if err != nil {
					return err
				}
			}

			return nil
		})
		panicErr(err)
	}
}
@jogramming
Copy link
Author

I believe the issue seems to be the callback, that releases the rlock on the logFile, put into the transaction callbacks here https://github.com/dgraph-io/badger/blob/master/iterator.go#L84 is not called before calling batchSet here https://github.com/dgraph-io/badger/blob/master/transaction.go#L411 which sometimes asks for the lock again here https://github.com/dgraph-io/badger/blob/master/value.go#L130

@manishrjain
Copy link
Contributor

You have an item.Value() in there. This returns a value, acquiring read locks on the value log to avoid value log GC causing the file to be removed. What's happening is that the writes in this long-running iteration are still being written out to the value log, which gets filled up and needs to be replaced, but that isn't possible given the read locks.

You could avoid this by two things:

  1. Set PrefetchValues to true. This would read the value, copy it over and release the read lock.
  2. Do a pure key-only iteration, without calling Value().
  3. Store the keys that you need to write in a list, and then write them in a separate transaction after the iteration is over.

I'm seeing users getting stuck with this issue repeatedly. So, it would make sense to have another function on Item, where a user can pass in a byte slice which can be written to; thus the read lock can be released immediately.

We also need to have a warning about this behavior in the README.

@manishrjain manishrjain added the area/documentation Documentation related issues. label Nov 10, 2017
@manishrjain manishrjain self-assigned this Nov 10, 2017
@jogramming
Copy link
Author

I'm also seeing this issue occuring outside of iterators, rarely (but not so rarely in my use case) calling Value() on Item and then updating it in the same transaction, I'm a bit too tired to make a minimal reproducable case right now but i can get to it in the morning probably.

@manishrjain
Copy link
Contributor

manishrjain commented Nov 11, 2017

This pending PR would add a ValueCopy function, which can be used to avoid deadlocks in long-running updating iterations.

Update: @jonas747 Can you run with my PR and check if you still see any deadlocks using the new ValueCopy function?

@jogramming
Copy link
Author

It dosen't seem to happen anymore as far as i can see after using ValueCopy, but if i can make a request, it would be nice if we could pass our own say io.Writer so we could reuse that buffer.

@manishrjain
Copy link
Contributor

Just pushed another commit, which modifies the function to accept a byte slice, where the data would be written.

@manishrjain
Copy link
Contributor

The commit is in b3568eb. @deepakjois would cut a release branch sometime soon. Documentation is updated. Marking this issue as closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/documentation Documentation related issues.
Development

No branches or pull requests

2 participants