Seek iterator with Reverse doesn't work #436
Comments
@brk0v That's the expected behaviour. Seek searches for a key greater than or equal to the specified key. Say you have the following keys: "00", "01", "02". When you seek for "01" in reverse mode, the iteration would start at "00". |
Ok. Got it. |
You still can. You just need to choose the seek key wisely. Following from @janardhan1993's example, if you have [00, 01, 02], and you want to start from 01, then you can use |
He wanted to get the last item in the bucket, not the first, @manishrjain - I think, you misunderstood his statement. Getting the reverse latest data from badgerdb is still Pain IA. |
It's all about the choice of seek key. If you want the first data from reverse, you can just call Rewind after setting Reverse option. |
There is no way to |
You can set reverse and then Seek to prefix + 0xFF byte. That'd bring you to prefix. |
@manishrjain I realize this comment is going on 3 years old, but it's subtly incorrect, and reverse iterating over a prefix with The method suggested in this issue: to find the last key with prefix
I believe the solution that accounts for all cases is much more involved.
DB without exact match on incremented key:
DB with exact match on incremented key:
|
@AlexMackowiak I don't think what you have will work correctly for case 1c. I've tried ... it doesn't play ball. If your prefix is |
@msackman Yeah you're actually right, I stumbled across this myself in some tests about a week later and I guess I forgot to follow up here. I believe the actual case 1c that I ended up writing in my code was just to append as many |
@AlexMackowiak Yep. I think in conclusion it's actually impossible to seek to the last item in a database without prior knowledge of the key length. Personally, I'm switching back to BoltDB as that seems to have a much better designed API IMO. |
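Just wanted to post correct code for reverse iteration.

import (
	"bytes"
	"log"
	"reflect"
	"testing"

	badger "github.com/dgraph-io/badger/v3" // assumed major version; use whichever one you depend on
)

// incrementPrefix returns the smallest key that sorts after every key starting
// with prefix: trailing 0xFF bytes are dropped and the last remaining byte is
// incremented. An empty or all-0xFF prefix yields an empty slice.
func incrementPrefix(prefix []byte) []byte {
	result := make([]byte, len(prefix))
	copy(result, prefix)
	n := len(prefix)
	for n > 0 {
		if result[n-1] == 0xFF {
			n--
		} else {
			result[n-1]++
			break
		}
	}
	return result[:n]
}

// SeekLast seeks to the last key that is valid for the given prefix. For
// correct results the iterator must be set up to iterate in the reverse
// direction and must not be limited to a certain prefix.
func SeekLast(it *badger.Iterator, prefix []byte) {
	i := incrementPrefix(prefix)
	it.Seek(i)
	// An exact match on the incremented key lies just outside the prefix range, so step past it.
	if it.Valid() && bytes.Equal(i, it.Item().Key()) {
		it.Next()
	}
}

Example Usage

func example(db *badger.DB, prefix []byte) error {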
	return db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.Reverse = true
		it := txn.NewIterator(opts)
		defer it.Close()
		for SeekLast(it, prefix); it.ValidForPrefix(prefix); it.Next() {
			// Do stuff with item...
		}
		return nil
	})
}

Test

func Test_seekForReverseIterate(t *testing.T) {
	type args struct {
		prefix []byte
	}
	tests := []struct {
		name string
		args args
		want []byte
	}{
		{"Standard Case", args{[]byte{0x05, 0x45, 0x77}}, []byte{0x05, 0x45, 0x78}},
		{"Some 0xFF", args{[]byte{0x05, 0xFF, 0xFF}}, []byte{0x06}},
		{"All 0xFF", args{[]byte{0xFF, 0xFF, 0xFF}}, []byte{}},
		{"Empty", args{[]byte{}}, []byte{}},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := incrementPrefix(tt.args.prefix); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("incrementPrefix() = %v, want %v", got, tt.want)
			}
		})
	}
}

// getKeys returns copies of all keys with the given prefix, in reverse order.
func getKeys(db *badger.DB, prefix []byte) [][]byte {
	result := [][]byte{}
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.Reverse = true
		it := txn.NewIterator(opts)
		defer it.Close()
		for SeekLast(it, prefix); it.ValidForPrefix(prefix); it.Next() {
			result = append(result, it.Item().KeyCopy(nil))
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	return result
}

func Test_SeekLast(t *testing.T) {
	// Construct the test database in memory.
	db, err := badger.Open(badger.DefaultOptions("").WithInMemory(true))
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	// Keys are listed in descending order, the order reverse iteration returns them in.
	keys := [][]byte{
		{0xFF, 0xFF, 0xFF, 0xFF, 0xFF},
		{0xFF, 0x1, 0xFF},
		{0xFF, 0x1},
		{0xFF},
		{0x2},
		{0x1, 0xFF, 0xFF, 0xFF, 0xFF},
		{0x1, 0x1, 0xFF},
		{0x1, 0x1},
		{0x1},
	}
	err = db.Update(func(txn *badger.Txn) error {
		for _, key := range keys {
			if err := txn.Set(key, []byte{}); err != nil {
				return err
			}
		}
		return nil
	})
	if err != nil {
		t.Fatal(err)
	}

	// Expected result when no key matches the prefix.
	empty := [][]byte{}

	// Do tests
	type args struct {
		prefix []byte
	}
	tests := []struct {
		name string
		args args
		want [][]byte
	}{
		{"Prefix [0x01]", args{[]byte{0x01}}, keys[5:]},
		{"Prefix []", args{[]byte{}}, keys},
		{"Prefix nil", args{nil}, keys},
		{"Prefix [0x00]", args{[]byte{0x00}}, empty},
		{"Prefix [0x01 0x01]", args{[]byte{0x01, 0x01}}, keys[6:8]},
		{"Prefix [0x02]", args{[]byte{0x02}}, keys[4:5]},
		{"Prefix [0xFF]", args{[]byte{0xFF}}, keys[:4]},
		{"Prefix [0xFF 0xFF]", args{[]byte{0xFF, 0xFF}}, keys[:1]},
		{"Prefix [0xFF 0xFF 0xFF 0xFF 0xFF 0xFF]", args{[]byte{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}}, empty},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := getKeys(db, tt.args.prefix); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("getKeys(%v) = %v, want %v", tt.args, got, tt.want)
			}
		})
	}
}

FWIW, I believe
Edit - 3/20/2022: Updated implementation and test coverage of |
@PeterMcKinnis I believe this solution is currently missing one case and loses some efficiency compared with optimal reverse prefix scanning. Great idea though; we should definitely post some code for a correct implementation!
The code I ended up writing to account for everything in the reverse prefix case is the following:
EDIT: Even this is sometimes invalid for the case where the prefix is all 0xFF. Our goal here is to point the iterator to the last key in the database, workarounds to |
I updated
|
I replaced your
I also have extensive test cases on this particular case, although they were written as part of a much larger, proprietary system that I wouldn't be able to share here. I did not however have test coverage of the following case!
My code was incorrect in this case, and to be truly correct you would need to know the length of the longest key in the database, as a workaround to |
Code:
This code returns 0 elements in reverse mode and 1000 elements with reverse = false