Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/metamorphic: re-use single delete keys #1248

Closed
wants to merge 1 commit into from

Conversation

nicktrav
Copy link
Contributor

Currently, the metamorphic tests randomly generate a set of operations
to perform on the Pebble instance. Support for single deletes exists,
though the keys that have been deleted are not considered for reuse.

Track keys that have been singly deleted, and randomly reuse these keys
when generating subsequent operations to perform.

Prior to this patch, the generation operation log would resemble the
following:

db.Set("foo", "bar")
...
batch54.SingleDelete("foo")
...
// No more operations on key "foo".

With this patch, the following seqeunce of operations is permissible:

db.Set("foo", "bar")
...
db.SingleDelete("foo")
...
db.Set("foo", "baz")
...
db.Merge("foo", "bam")
...
db.Set("foo", "boom")
...
// Subsequent operations on key "foo" are permissible.

Related to cockroachdb/cockroach#69414.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor Author

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was born of noticing that we don't seem to reuse the keys that are singly deleted. My current understanding, related to the discussion on the ticket is that it is valid to SET -> SINGLEDEL -> SET -> .... This change generates such sequences.

I'll work on the SET -> RANGEDEL -> SET -> SINGLEDEL -> ... sequencing separately. I don't think we currently generate such sequences.

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)


internal/metamorphic/generator.go, line 161 at r1 (raw file):

func (g *generator) randKeyForWrite(newKey float64, singleSetKey float64, writerID objID) []byte {
	if n := len(g.keys); n > 0 && g.rng.Float64() > newKey {
		// 25% chance of a key that has been single deleted.

Open to suggestions on the probability here. 25% is possibly too high?

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @nicktrav)


internal/metamorphic/generator.go, line 860 at r2 (raw file):

		writerID: writerID,
		key:      key,
	})

key represents a SET that has made its way to the DB and was the only SET for that key (which is why it is in singleSetKeysInDB.
However this SINGLEDEL is being generated on a writer that is not necessarily the DB itself (it could be a Batch or the DB). Until the batch somehow makes its way to the DB, it isn't fair to add it to singleDeleteKeys (and the batch may still be open and we could then apply a SINGLEDEL after we've SET the key again -- I wouldn't be surprised if this code eventually fails the metamorphic test).
The pattern that one can follow is similar to what we do with writerToSingleSetKeys. One possible issue is that all of this is happening at op generation time and does not take into account op failure when we actually run the ops. e.g. see the comment in writerIngest. Op failure is harmless for the single set key tracking since worst case we think a key was set in the DB but it actually was not. I think it is fine for this singleDeleteKeys tracking too -- the bad case of the key being written and then the SINGLEDEL not being written is harmless.

Currently, the metamorphic tests randomly generate a set of operations
to perform on the Pebble instance. Support for single deletes exists,
though the keys that have been deleted are not considered for reuse.

Track keys that have been singly deleted, and randomly reuse these keys
when generating subsequent operations to perform.

Prior to this patch, the generation operation log would resemble the
following:

```
db.Set("foo", "bar")
...
batch54.SingleDelete("foo")
...
// No more operations on key "foo".
```

With this patch, the following seqeunce of operations is permissible:

```
db.Set("foo", "bar")
...
db.SingleDelete("foo")
...
db.Set("foo", "baz")
...
db.Merge("foo", "bam")
...
db.Set("foo", "boom")
...
// Subsequent operations on key "foo" are permissible.
```

Related to cockroachdb/cockroach#69414.
Copy link
Contributor Author

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @sumeerbhola from a discussion.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @itsbilal, @jbowens, and @sumeerbhola)


internal/metamorphic/generator.go, line 860 at r2 (raw file):

Previously, sumeerbhola wrote…

key represents a SET that has made its way to the DB and was the only SET for that key (which is why it is in singleSetKeysInDB.
However this SINGLEDEL is being generated on a writer that is not necessarily the DB itself (it could be a Batch or the DB). Until the batch somehow makes its way to the DB, it isn't fair to add it to singleDeleteKeys (and the batch may still be open and we could then apply a SINGLEDEL after we've SET the key again -- I wouldn't be surprised if this code eventually fails the metamorphic test).
The pattern that one can follow is similar to what we do with writerToSingleSetKeys. One possible issue is that all of this is happening at op generation time and does not take into account op failure when we actually run the ops. e.g. see the comment in writerIngest. Op failure is harmless for the single set key tracking since worst case we think a key was set in the DB but it actually was not. I think it is fine for this singleDeleteKeys tracking too -- the bad case of the key being written and then the SINGLEDEL not being written is harmless.

Reworked it. PTAL.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r3, all commit messages.
Reviewable status: 1 of 2 files reviewed, 6 unresolved discussions (waiting on @itsbilal, @jbowens, @nicktrav, and @sumeerbhola)


internal/metamorphic/generator.go, line 174 at r3 (raw file):

		// 25% chance of reusing a key that has been SINGLEDELd.
		if m := g.singleDeleteKeysInDB.len(); m > 0 && g.rng.Float64() < 0.25 {
			return g.writerToSingleDeleteKeys[writerID].removeRand(g.rng)

why is this not g.singleDeleteKeysInDB?
Hmm, I see why what you have here is also correct in that something that has been single deleted in this writer is now eligible to be written again. And so is what has been single deleted in the DB. So the union of the two is the candidate pool. See my other comment about adding a key manager abstraction.


internal/metamorphic/generator.go, line 190 at r3 (raw file):

		}
	} else {
		g.generatedWriteKeys[string(key)] = struct{}{}

I think this existing code in this else block is buggy. There is nothing preventing key from being something that is already in writerToSingleSetKeys. The low probability has probably saved us.

And I think this tracking approach for various categories of keys is becoming a strain on the code and our comprehension of whether it is correct.
We currently have keys, singleSetKeysInDB, writerToSingleSetKeys, generatedWriteKeys and we are adding two more. I think we can take a step back and abstract all of this state into a separate struct with methods that specify the purpose for which the key is desired (read or write, and probably differentiate different kinds of write). It won't necessarily reduce the number of tracking data-structures in this separate struct but at least we will have an abstraction.

I think we need

  • All keys generated for writes, to any writer. These are useful for generating reads and sometimes for writing again (only if not in the subset in the next bullet)
  • Subset of the preceding keys that currently are reserved for future SingleDelete. This subset needs per writer tracking (as we have now). The sets across writers will be non-overlapping (since these are single set keys)
  • Keys that have been SingleDeleted, per writer. These will be non-intersecting with each other and all of them will be a subset of the single set keys that have made it to the DB (as you have now). When the SingleDelete makes it to the DB, we can remove it from the preceding bullet's subset of keys reserved for SingleDelete. This removal will automatically make it eligible for rewriting (so slightly different from your change here to randKeyForWrite).

I think it is ok if this PR does not add the abstraction but we should at least narrow the use of randKey. We use it for DELs and various read methods, and we don't want to remove the SingleDel keys for the latter.

One other thing about this key manager abstraction, and sequences like SET, SET, DEL, SET, SINGLEDEL (for the future PR). We can't really do the SINGLEDEL if the DEL is going to have an error when it runs, since then we introduce non-determinism that can cause false positive test failures. One option would be track these sets of keys at runtime too, but this seems to be getting too complicated. I can't remember any reason why a write should fail in the current metamorphic test -- I don't think there is anything invalid about any write operation that we generate (anything permitted by the interface should be valid). If you agree, it would be worth experimenting by tweaking the code and running it for a while to see if any write op is erroring. If there are no possible errors then it simplifies are life in that we can accurately track the key disposition at generation time.


internal/metamorphic/generator.go, line 749 at r3 (raw file):

	}

	g.writerToSingleSetKeys[batchID].transferTo(g.writerToSingleSetKeys[writerID])

this needs updating to.
Worth sanity checking all the places where writerToSingleSetKeys is referenced and seeing if any of those also need writerToSingleDeleteKeys


internal/metamorphic/generator.go, line 806 at r3 (raw file):

		g.writerToSingleSetKeys[batchID].transferTo(g.singleSetKeysInDB)
		// The same applies for keys that have been removed by a SINGLEDEL. It
		// is valid to perform a read or write operation on a key that has been

nit: has been potentially SINGLEDEL'd


internal/metamorphic/utils.go, line 97 at r3 (raw file):

}

// indexedSet is a collection of keys.

Can you add
// This is a set, i.e., there are no duplicate keys.

nicktrav added a commit to nicktrav/pebble that referenced this pull request Sep 17, 2021
Currently, the metamorphic test generator tracks the state of keys that
it generates to determine whether or not certain keys can be used for
specific operations. Specifically, keys that have been set once in the
database are eligible to be SingleDeleted. Future enhancements to the
test suite will likely introduce other desirable sequences of operations
that rely on tracking the state of certain keys through the metamorphic
test.

State tracking for keys is currently embedded directory onto the
`generator` struct. Additional fields that are to be used for state
tracking would continue to complicate the already sizable struct. There
is also significant cognitive overhead with the existing state tracking
for keys.

Introduce the concept of a "key manager", dedicated to tracking the
state of keys during metamorphic test generation time. Internally, the
`keyManager` struct can be passed operations via `op`s` (e.g. `setOp`,
etc.) that transform the internal state. The `keyManager` then exposes
functions that can be used to deduce which keys are eligible for certain
operations at a point in time. Initially, one such function is
`eligibleSingleDeleteKeys`, which is used to replace the existing logic
in the `generator` for tracking keys that can be SingleDeleted.

Future enhancements to the metamorphic test suite that rely on state
tracking can now make use of the dedicated state management
responsibilities of the `keyManager`.

Informed by cockroachdb#1248, cockroachdb#1254.

Related to cockroachdb#1255.
sumeerbhola pushed a commit to nicktrav/pebble that referenced this pull request Sep 23, 2021
Currently, the metamorphic test generator tracks the state of keys that
it generates to determine whether or not certain keys can be used for
specific operations. Specifically, keys that have been set once in the
database are eligible to be SingleDeleted. Future enhancements to the
test suite will likely introduce other desirable sequences of operations
that rely on tracking the state of certain keys through the metamorphic
test.

State tracking for keys is currently embedded directory onto the
`generator` struct. Additional fields that are to be used for state
tracking would continue to complicate the already sizable struct. There
is also significant cognitive overhead with the existing state tracking
for keys.

Introduce the concept of a "key manager", dedicated to tracking the
state of keys during metamorphic test generation time. Internally, the
`keyManager` struct can be passed operations via `op`s` (e.g. `setOp`,
etc.) that transform the internal state. The `keyManager` then exposes
functions that can be used to deduce which keys are eligible for certain
operations at a point in time. Initially, one such function is
`eligibleSingleDeleteKeys`, which is used to replace the existing logic
in the `generator` for tracking keys that can be SingleDeleted.

Future enhancements to the metamorphic test suite that rely on state
tracking can now make use of the dedicated state management
responsibilities of the `keyManager`.

Informed by cockroachdb#1248, cockroachdb#1254.

Related to cockroachdb#1255.
@nicktrav
Copy link
Contributor Author

This should be possible now with the keyManger. Closing.

@nicktrav nicktrav closed this Sep 28, 2021
@nicktrav nicktrav deleted the nickt.meta-single-del branch September 28, 2021 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants