CreateBucketIfNotExists try opening bucket before creating #504

Closed
wants to merge 1 commit into from

Member

@cenkalti cenkalti commented May 16, 2023

Fixes #118

Benchmark with 1 million keys (137 MB database):

BenchmarkCreateBucketIfNotExists-8      	     145	   8594488 ns/op	   18018 B/op	      34 allocs/op
BenchmarkCreateBucketIfNotExistsOld-8   	     144	   8510219 ns/op	   18210 B/op	      40 allocs/op

Benchmark with 10 million keys (1.2 GB database):

BenchmarkCreateBucketIfNotExists-8      	     151	   7533890 ns/op	   18262 B/op	      40 allocs/op
BenchmarkCreateBucketIfNotExistsOld-8   	     154	   7600338 ns/op	   18282 B/op	      52 allocs/op

Benchmark code: 97e5461

return b.Bucket(key), nil
} else if err != nil {
return nil, err
child := b.Bucket(key)
Member

Side effect: when the db is opened in read-only mode, the caller can now get a valid bucket if it is present.

The existing implementation returns an ErrTxNotWritable error.

Member

I suggest not changing the behavior.

Member Author

Calling CreateBucketIfNotExists in read-only mode doesn't make sense. I don't think anybody relied on this behavior.

Member

This PR creates a hole: a read-only transaction can appear to "create" a bucket (one that actually already exists), which might cause unnecessary confusion for users. It could even be regarded as a minor, harmless CVE.

Please add the following code at the beginning of the method:

	if b.tx.db == nil {
		return nil, ErrTxClosed
	} else if !b.tx.writable {
		return nil, ErrTxNotWritable
	} else if len(key) == 0 {
		return nil, ErrBucketNameRequired
	}
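
For readers following along, the effect of putting these checks first can be sketched with toy types. Everything below (toyTx, createBucketIfNotExists, and the lowercase error variables) is a hypothetical stand-in written for illustration, not bbolt's real code; it only shows that a read-only transaction is rejected before any lookup happens, even when the bucket already exists.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for bbolt's exported error values.
var (
	errTxClosed           = errors.New("tx closed")
	errTxNotWritable      = errors.New("tx not writable")
	errBucketNameRequired = errors.New("bucket name required")
)

// toyTx is a toy transaction. closed and writable model the two
// transaction conditions checked by the suggested guard.
type toyTx struct {
	closed   bool
	writable bool
	buckets  map[string]bool
}

// createBucketIfNotExists validates the transaction and the key before it
// looks anything up, so a read-only transaction fails even when the
// bucket is already present.
func (tx *toyTx) createBucketIfNotExists(key []byte) error {
	if tx.closed {
		return errTxClosed
	} else if !tx.writable {
		return errTxNotWritable
	} else if len(key) == 0 {
		return errBucketNameRequired
	}
	if !tx.buckets[string(key)] {
		tx.buckets[string(key)] = true
	}
	return nil
}

func main() {
	// Read-only transaction, bucket already present: still an error.
	ro := &toyTx{writable: false, buckets: map[string]bool{"users": true}}
	fmt.Println(ro.createBucketIfNotExists([]byte("users")))

	// Writable transaction, bucket missing: it gets created.
	rw := &toyTx{writable: true, buckets: map[string]bool{}}
	fmt.Println(rw.createBucketIfNotExists([]byte("users")), rw.buckets["users"])
}
```

Without the guard, the first call in this toy model would succeed silently, which is the behavior change discussed above.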

Member Author

Added.

Member Author

I also added benchmark results; see the PR description. The change doesn't help with operation speed, but it reduces allocations when there are many buckets, which is a valid use case where buckets are dynamically generated.
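
As a rough check of that claim, the relative differences can be computed from the four benchmark lines in the PR description. The sketch below only does arithmetic on those published numbers; the result type and pctChange helper are names made up for this example.

```go
package main

import "fmt"

// result holds the two figures of interest from one benchmark line in
// the PR description.
type result struct {
	nsPerOp, allocsPerOp float64
}

// pctChange returns how much newV differs from oldV, as a percentage of oldV.
func pctChange(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

func main() {
	runs := []struct {
		name          string
		before, after result
	}{
		{"1M keys", result{8510219, 40}, result{8594488, 34}},
		{"10M keys", result{7600338, 52}, result{7533890, 40}},
	}
	for _, r := range runs {
		fmt.Printf("%s: time %+.1f%%, allocs %+.1f%%\n", r.name,
			pctChange(r.before.nsPerOp, r.after.nsPerOp),
			pctChange(r.before.allocsPerOp, r.after.allocsPerOp))
	}
}
```

On these figures the time per operation moves by about 1% in either direction, while allocations per operation drop by 15% (1M keys) and about 23% (10M keys).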

Signed-off-by: Cenk Alti <cenkalti@gmail.com>
cenkalti added a commit to cenkalti/bbolt that referenced this pull request May 18, 2023
cenkalti added a commit to cenkalti/bbolt that referenced this pull request May 18, 2023
Member

@ahrtr ahrtr left a comment

LGTM

Thanks

@ahrtr ahrtr added this to the v1.4.0 milestone Jun 11, 2023
Member

ahrtr commented Jun 11, 2023

Just as #118 (comment) mentioned, it's a tradeoff. When it's highly likely that the bucket doesn't exist, calling CreateBucket first makes sense. But if it's highly likely that the bucket exists, your PR makes sense.
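
The tradeoff can be made concrete with a toy model that counts lookups. This is a sketch under the assumption that each existence check or insert costs one cursor seek; toyStore, createFirst, and openFirst are hypothetical names, and a real B+tree with a bucket cache would behave differently in detail.

```go
package main

import "fmt"

// toyStore is a toy stand-in for a bucket's key space. seeks counts how
// many lookups were performed, modeling the cost of a cursor seek.
type toyStore struct {
	keys  map[string]bool
	seeks int
}

// lookup reports whether key exists, at the cost of one seek.
func (s *toyStore) lookup(key string) bool {
	s.seeks++
	return s.keys[key]
}

// create inserts key unless it already exists. It seeks first, because an
// insert must locate the key's position; it reports false on a duplicate.
func (s *toyStore) create(key string) bool {
	if s.lookup(key) {
		return false
	}
	s.keys[key] = true
	return true
}

// createFirst tries to create and falls back to a lookup on a duplicate.
func createFirst(s *toyStore, key string) {
	if !s.create(key) {
		s.lookup(key)
	}
}

// openFirst tries a lookup and falls back to create when the key is absent.
func openFirst(s *toyStore, key string) {
	if !s.lookup(key) {
		s.create(key)
	}
}

// seeksFor runs one strategy against a fresh store and returns the seek count.
func seeksFor(strategy func(*toyStore, string), exists bool) int {
	s := &toyStore{keys: map[string]bool{}}
	if exists {
		s.keys["b"] = true
	}
	strategy(s, "b")
	return s.seeks
}

func main() {
	fmt.Println("create-first: exists =", seeksFor(createFirst, true), "missing =", seeksFor(createFirst, false))
	fmt.Println("open-first:   exists =", seeksFor(openFirst, true), "missing =", seeksFor(openFirst, false))
}
```

In this model each strategy needs one seek in its favorable case and two in the other, so neither ordering wins for both workloads.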

If you really want the PR to make sense for both cases, CreateBucketIfNotExists shouldn't call CreateBucket and Bucket directly; instead, it should reuse their logic at the code level, something like below:

func (b *Bucket) CreateBucketIfNotExists(key []byte) (*Bucket, error) {
	if b.tx.db == nil {
		return nil, errors.ErrTxClosed
	} else if !b.tx.writable {
		return nil, errors.ErrTxNotWritable
	} else if len(key) == 0 {
		return nil, errors.ErrBucketNameRequired
	}

	if b.buckets != nil {
		if child := b.buckets[string(key)]; child != nil {
			return child, nil
		}
	}

	// Move cursor to correct position.
	c := b.Cursor()
	k, v, flags := c.seek(key)

	// If the key already exists, return the bucket it refers to, or an
	// error if it holds a non-bucket value.
	if bytes.Equal(key, k) {
		if (flags & common.BucketLeafFlag) != 0 {
			var child = b.openBucket(v)
			if b.buckets != nil {
				b.buckets[string(key)] = child
			}

			return child, nil
		}
		return nil, errors.ErrIncompatibleValue
	}

	// Create empty, inline bucket.
	var bucket = Bucket{
		InBucket:    &common.InBucket{},
		rootNode:    &node{isLeaf: true},
		FillPercent: DefaultFillPercent,
	}
	var value = bucket.write()

	// Insert into node.
	key = cloneBytes(key)
	c.node().put(key, key, value, 0, common.BucketLeafFlag)

	// Since subbuckets are not allowed on inline buckets, we need to
	// dereference the inline page, if it exists. This will cause the bucket
	// to be treated as a regular, non-inline bucket for the rest of the tx.
	b.page = nil

	return b.Bucket(key), nil
}

I've attached the patch; feel free to apply it if it makes sense to you.
CreateBucketIfNotExists.txt
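
The core idea of the patch, one seek that serves both the "exists" and the "missing" case, can be sketched on a sorted slice. This is only an analogy under simplifying assumptions: getOrInsert is a hypothetical helper, and a binary search over a slice stands in for the cursor seek over a B+tree node.

```go
package main

import (
	"fmt"
	"sort"
)

// getOrInsert returns the index of key in the sorted slice keys, inserting
// it if absent. One binary search serves both outcomes: the position found
// is either the existing entry or the insertion point.
func getOrInsert(keys []string, key string) ([]string, int, bool) {
	i := sort.SearchStrings(keys, key) // the single "seek"
	if i < len(keys) && keys[i] == key {
		return keys, i, false // already present, nothing created
	}
	// Grow by one, shift the tail right, and place key at the seek position.
	keys = append(keys, "")
	copy(keys[i+1:], keys[i:])
	keys[i] = key
	return keys, i, true
}

func main() {
	keys := []string{"alpha", "gamma"}

	keys, i, created := getOrInsert(keys, "beta")
	fmt.Println(keys, i, created) // [alpha beta gamma] 1 true

	keys, i, created = getOrInsert(keys, "beta")
	fmt.Println(keys, i, created) // [alpha beta gamma] 1 false
}
```

Reusing the seek position for the insert is what lets this approach cost the same whether or not the key was already there.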

Member

@ahrtr ahrtr left a comment

Please update this PR per #504 (comment).

@cenkalti
Member Author

@ahrtr I think the patch adds too much code (+complexity) to gain little benefit. My first intention was to optimize the code for cases where a lot of buckets are automatically generated and kept around. CreateBucketIfNotExists is what I would use instead of writing:

b = findBucket()
if not b:
    b = createBucket()

So that's why I sent this PR. We should either accept the PR as is or reject and close the issue as wontfix.

Member

ahrtr commented Jun 27, 2023

My first intention was to optimize the code for cases where a lot of buckets are automatically generated and kept around.

Just as I mentioned, your PR only applies to one case. You resolved one "issue" but introduced another "issue". That isn't acceptable.

@ahrtr I think the patch adds too much code (+complexity) to gain little benefit.

The point is we should do the right thing. Although it's a little complicated, it's still under control and should be accepted.

We should either accept the PR as is or reject and close the issue as wontfix.

Let's close this PR, and raise a new one with my patch.

Member

ahrtr commented Jun 27, 2023

FYI. #532

Development

Successfully merging this pull request may close these issues.

Shouldn't CreateBucketIfNotExists() try to open the bucket first?
2 participants