-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/metamorphic: re-use single delete keys #1248
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was born of noticing that we don't seem to reuse the keys that are singly deleted. My current understanding, related to the discussion on the ticket is that it is valid to SET -> SINGLEDEL -> SET -> ...
. This change generates such sequences.
I'll work on the SET -> RANGEDEL -> SET -> SINGLEDEL -> ...
sequencing separately. I don't think we currently generate such sequences.
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
internal/metamorphic/generator.go, line 161 at r1 (raw file):
func (g *generator) randKeyForWrite(newKey float64, singleSetKey float64, writerID objID) []byte { if n := len(g.keys); n > 0 && g.rng.Float64() > newKey { // 25% chance of a key that has been single deleted.
Open to suggestions on the probability here. 25% is possibly too high?
e96a15d
to
ed4574a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @nicktrav)
internal/metamorphic/generator.go, line 860 at r2 (raw file):
writerID: writerID, key: key, })
key
represents a SET that has made its way to the DB and was the only SET for that key (which is why it is in singleSetKeysInDB
.
However this SINGLEDEL is being generated on a writer that is not necessarily the DB itself (it could be a Batch or the DB). Until the batch somehow makes its way to the DB, it isn't fair to add it to singleDeleteKeys
(and the batch may still be open and we could then apply a SINGLEDEL after we've SET the key again -- I wouldn't be surprised if this code eventually fails the metamorphic test).
The pattern that one can follow is similar to what we do with writerToSingleSetKeys
. One possible issue is that all of this is happening at op generation time and does not take into account op failure when we actually run the ops. e.g. see the comment in writerIngest
. Op failure is harmless for the single set key tracking since worst case we think a key was set in the DB but it actually was not. I think it is fine for this singleDeleteKeys
tracking too -- the bad case of the key being written and then the SINGLEDEL not being written is harmless.
Currently, the metamorphic tests randomly generate a set of operations to perform on the Pebble instance. Support for single deletes exists, though the keys that have been deleted are not considered for reuse. Track keys that have been singly deleted, and randomly reuse these keys when generating subsequent operations to perform. Prior to this patch, the generation operation log would resemble the following: ``` db.Set("foo", "bar") ... batch54.SingleDelete("foo") ... // No more operations on key "foo". ``` With this patch, the following seqeunce of operations is permissible: ``` db.Set("foo", "bar") ... db.SingleDelete("foo") ... db.Set("foo", "baz") ... db.Merge("foo", "bam") ... db.Set("foo", "boom") ... // Subsequent operations on key "foo" are permissible. ``` Related to cockroachdb/cockroach#69414.
ed4574a
to
9ffc082
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @sumeerbhola from a discussion.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @itsbilal, @jbowens, and @sumeerbhola)
internal/metamorphic/generator.go, line 860 at r2 (raw file):
Previously, sumeerbhola wrote…
key
represents a SET that has made its way to the DB and was the only SET for that key (which is why it is insingleSetKeysInDB
.
However this SINGLEDEL is being generated on a writer that is not necessarily the DB itself (it could be a Batch or the DB). Until the batch somehow makes its way to the DB, it isn't fair to add it tosingleDeleteKeys
(and the batch may still be open and we could then apply a SINGLEDEL after we've SET the key again -- I wouldn't be surprised if this code eventually fails the metamorphic test).
The pattern that one can follow is similar to what we do withwriterToSingleSetKeys
. One possible issue is that all of this is happening at op generation time and does not take into account op failure when we actually run the ops. e.g. see the comment inwriterIngest
. Op failure is harmless for the single set key tracking since worst case we think a key was set in the DB but it actually was not. I think it is fine for thissingleDeleteKeys
tracking too -- the bad case of the key being written and then the SINGLEDEL not being written is harmless.
Reworked it. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3, all commit messages.
Reviewable status: 1 of 2 files reviewed, 6 unresolved discussions (waiting on @itsbilal, @jbowens, @nicktrav, and @sumeerbhola)
internal/metamorphic/generator.go, line 174 at r3 (raw file):
// 25% chance of reusing a key that has been SINGLEDELd. if m := g.singleDeleteKeysInDB.len(); m > 0 && g.rng.Float64() < 0.25 { return g.writerToSingleDeleteKeys[writerID].removeRand(g.rng)
why is this not g.singleDeleteKeysInDB
?
Hmm, I see why what you have here is also correct in that something that has been single deleted in this writer is now eligible to be written again. And so is what has been single deleted in the DB. So the union of the two is the candidate pool. See my other comment about adding a key manager abstraction.
internal/metamorphic/generator.go, line 190 at r3 (raw file):
} } else { g.generatedWriteKeys[string(key)] = struct{}{}
I think this existing code in this else block is buggy. There is nothing preventing key
from being something that is already in writerToSingleSetKeys
. The low probability has probably saved us.
And I think this tracking approach for various categories of keys is becoming a strain on the code and our comprehension of whether it is correct.
We currently have keys
, singleSetKeysInDB
, writerToSingleSetKeys
, generatedWriteKeys
and we are adding two more. I think we can take a step back and abstract all of this state into a separate struct with methods that specify the purpose for which the key is desired (read or write, and probably differentiate different kinds of write). It won't necessarily reduce the number of tracking data-structures in this separate struct but at least we will have an abstraction.
I think we need
- All keys generated for writes, to any writer. These are useful for generating reads and sometimes for writing again (only if not in the subset in the next bullet)
- Subset of the preceding keys that currently are reserved for future SingleDelete. This subset needs per writer tracking (as we have now). The sets across writers will be non-overlapping (since these are single set keys)
- Keys that have been SingleDeleted, per writer. These will be non-intersecting with each other and all of them will be a subset of the single set keys that have made it to the DB (as you have now). When the SingleDelete makes it to the DB, we can remove it from the preceding bullet's subset of keys reserved for SingleDelete. This removal will automatically make it eligible for rewriting (so slightly different from your change here to
randKeyForWrite
).
I think it is ok if this PR does not add the abstraction but we should at least narrow the use of randKey
. We use it for DELs and various read methods, and we don't want to remove the SingleDel keys for the latter.
One other thing about this key manager abstraction, and sequences like SET, SET, DEL, SET, SINGLEDEL (for the future PR). We can't really do the SINGLEDEL if the DEL is going to have an error when it runs, since then we introduce non-determinism that can cause false positive test failures. One option would be track these sets of keys at runtime too, but this seems to be getting too complicated. I can't remember any reason why a write should fail in the current metamorphic test -- I don't think there is anything invalid about any write operation that we generate (anything permitted by the interface should be valid). If you agree, it would be worth experimenting by tweaking the code and running it for a while to see if any write op is erroring. If there are no possible errors then it simplifies are life in that we can accurately track the key disposition at generation time.
internal/metamorphic/generator.go, line 749 at r3 (raw file):
} g.writerToSingleSetKeys[batchID].transferTo(g.writerToSingleSetKeys[writerID])
this needs updating to.
Worth sanity checking all the places where writerToSingleSetKeys
is referenced and seeing if any of those also need writerToSingleDeleteKeys
internal/metamorphic/generator.go, line 806 at r3 (raw file):
g.writerToSingleSetKeys[batchID].transferTo(g.singleSetKeysInDB) // The same applies for keys that have been removed by a SINGLEDEL. It // is valid to perform a read or write operation on a key that has been
nit: has been potentially SINGLEDEL'd
internal/metamorphic/utils.go, line 97 at r3 (raw file):
} // indexedSet is a collection of keys.
Can you add
// This is a set, i.e., there are no duplicate keys.
Currently, the metamorphic test generator tracks the state of keys that it generates to determine whether or not certain keys can be used for specific operations. Specifically, keys that have been set once in the database are eligible to be SingleDeleted. Future enhancements to the test suite will likely introduce other desirable sequences of operations that rely on tracking the state of certain keys through the metamorphic test. State tracking for keys is currently embedded directory onto the `generator` struct. Additional fields that are to be used for state tracking would continue to complicate the already sizable struct. There is also significant cognitive overhead with the existing state tracking for keys. Introduce the concept of a "key manager", dedicated to tracking the state of keys during metamorphic test generation time. Internally, the `keyManager` struct can be passed operations via `op`s` (e.g. `setOp`, etc.) that transform the internal state. The `keyManager` then exposes functions that can be used to deduce which keys are eligible for certain operations at a point in time. Initially, one such function is `eligibleSingleDeleteKeys`, which is used to replace the existing logic in the `generator` for tracking keys that can be SingleDeleted. Future enhancements to the metamorphic test suite that rely on state tracking can now make use of the dedicated state management responsibilities of the `keyManager`. Informed by cockroachdb#1248, cockroachdb#1254. Related to cockroachdb#1255.
Currently, the metamorphic test generator tracks the state of keys that it generates to determine whether or not certain keys can be used for specific operations. Specifically, keys that have been set once in the database are eligible to be SingleDeleted. Future enhancements to the test suite will likely introduce other desirable sequences of operations that rely on tracking the state of certain keys through the metamorphic test. State tracking for keys is currently embedded directory onto the `generator` struct. Additional fields that are to be used for state tracking would continue to complicate the already sizable struct. There is also significant cognitive overhead with the existing state tracking for keys. Introduce the concept of a "key manager", dedicated to tracking the state of keys during metamorphic test generation time. Internally, the `keyManager` struct can be passed operations via `op`s` (e.g. `setOp`, etc.) that transform the internal state. The `keyManager` then exposes functions that can be used to deduce which keys are eligible for certain operations at a point in time. Initially, one such function is `eligibleSingleDeleteKeys`, which is used to replace the existing logic in the `generator` for tracking keys that can be SingleDeleted. Future enhancements to the metamorphic test suite that rely on state tracking can now make use of the dedicated state management responsibilities of the `keyManager`. Informed by cockroachdb#1248, cockroachdb#1254. Related to cockroachdb#1255.
This should be possible now with the |
Currently, the metamorphic tests randomly generate a set of operations
to perform on the Pebble instance. Support for single deletes exists,
though the keys that have been deleted are not considered for reuse.
Track keys that have been singly deleted, and randomly reuse these keys
when generating subsequent operations to perform.
Prior to this patch, the generation operation log would resemble the
following:
With this patch, the following seqeunce of operations is permissible:
Related to cockroachdb/cockroach#69414.