
[executor] multi-key conflict bug #837

Merged: 7 commits into scheduler-v3 from scheduler-key-bug on Apr 17, 2024

Conversation

wlawt (Contributor) commented Apr 16, 2024

Consider two transactions, T1 and T2, both accessing keys A and B. Suppose that, in the slow path, neither T1 nor T2 has executed.

T1 requests: A read, B write

T2 requests: A read, B write

In T1, these keys don't exist in e.nodes yet, so we just add them.

In T2, when we encounter key A, lt is T1.

We add ourselves (T2) to lt.readers for key A and note that T1 hasn't executed, so T1 is blocking T2.

For key B in T2, since we need a Write, we now go through lt.readers, and one of the rt entries is us (T2). So we end up saying that T2 is blocked on itself, as the sketch below shows.
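
A minimal runnable Go sketch of that self-blocking hazard (the task struct and its fields are stand-ins mirroring the names in this thread, lt/rt/readers, not the actual hypersdk executor types):

```go
package main

import "fmt"

// task is a stripped-down stand-in for the executor's internal task type;
// only the fields needed to show the self-blocking hazard are included.
type task struct {
	id       int
	executed bool
	readers  []*task // tasks that read a key this task last touched
}

func main() {
	t1 := &task{id: 1}
	t2 := &task{id: 2}

	// Key A: T2 records itself as a reader of T1.
	t1.readers = append(t1.readers, t2)

	// Key B: T2 needs exclusive (Write) access, so it walks lt.readers,
	// where lt is T1. Without a self-check, T2 counts itself as a blocker.
	for _, rt := range t1.readers {
		if rt == t2 {
			continue // the fix: don't block on ourself
		}
		if !rt.executed {
			fmt.Printf("T%d is blocked on T%d\n", t2.id, rt.id)
		}
	}
	fmt.Println("T2 no longer waits on itself")
}
```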

wlawt self-assigned this Apr 16, 2024
@@ -180,6 +180,10 @@ func (e *Executor) Run(keys state.Keys, f func() error) {
// and can't mark itself as executed until all [shared] are
// cleared (which can't be done while we hold the lock for [lt]).
for _, rt := range lt.readers {
// Don't block on ourself if we just recorded us reading
patrick-ogrady (Contributor) commented Apr 16, 2024

As mentioned in DMs, I think there is still a weird scenario this doesn't address (assuming neither tx1 nor tx2 has executed):

tx1: A:read, B:write, C:read
tx2: A:read, B:write, C:read

It seems like in this case we could add ourselves as a reader of the tx1 task multiple times, once for key A and again for key C; see the sketch below.
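
A short sketch of that duplication under the slice-based approach (same stand-in types as above; keys A and C are both reads whose latest toucher is tx1):

```go
package main

import "fmt"

// task is a stand-in for the executor's task type; readers collects the
// tasks reading keys that this task last touched.
type task struct {
	id      int
	readers []*task
}

func main() {
	tx1 := &task{id: 1}
	tx2 := &task{id: 2}

	// tx2 walks its keys in order; A and C are both reads whose latest
	// toucher (lt) is tx1, so tx2 appends itself once per key.
	for _, key := range []string{"A", "C"} {
		tx1.readers = append(tx1.readers, tx2)
		fmt.Printf("after key %s: tx1.readers has %d entries\n", key, len(tx1.readers))
	}
	// Prints 1, then 2: tx2 now appears twice in tx1.readers.
}
```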

patrick-ogrady (Contributor) commented:

I think we should align shared/readers...they no longer sound like opposite ends of a single relationship.

// shared are the tasks that this task is using non-exclusively.
shared []*task
// reading are the tasks that this task is using non-exclusively.
reading map[int]*task
wlawt (Contributor, Author) commented Apr 17, 2024

Maybe we can have a bool flag like accessed to track whether a task has already used a key non-exclusively.

This would let us keep the array approach without adding ourselves multiple times, and it's better than the map approach since we wouldn't be doing no-op inserts on that action.

@@ -170,7 +170,9 @@ func (e *Executor) Run(keys state.Keys, f func() error) {
if v == state.Read {
// If we don't need exclusive access to a key, just mark
// that we are reading it and that we are a reader of it.
t.reading = append(t.reading, lt)
// TODO: currently a no-op when we add ourself multiple times
patrick-ogrady (Contributor) commented:

I'm not sure this is that bad and/or deserves a TODO.

The alternative might be trying to track this across keys but that'll get complex with locking imo.

// cleared (which can't be done while we hold the lock for [lt]).
for _, rt := range lt.readers {
// Don't block on ourself if we just recorded us reading
patrick-ogrady (Contributor) commented:

nit: Don't block on ourself if we already marked ourself as a reader

wlawt marked this pull request as ready for review April 17, 2024 02:57
@@ -170,8 +170,6 @@ func (e *Executor) Run(keys state.Keys, f func() error) {
if v == state.Read {
// If we don't need exclusive access to a key, just mark
// that we are reading it and that we are a reader of it.
// TODO: currently a no-op when we add ourself multiple times
// but find a better way of handling [reading]
t.reading[lt.id] = lt
patrick-ogrady (Contributor) commented:

I think we should note clearly the scenario for why we use a map here (i.e. that a single task can have different keys that are all just readers of another task)
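
A sketch of that scenario and of why the map makes repeated registration harmless (stand-in types again): t.reading[lt.id] = lt is idempotent, so a task whose keys A and C both read from the same earlier task records it only once:

```go
package main

import "fmt"

// task is a stand-in for the executor's task type; reading maps a
// dependency's id to that dependency, which makes re-insertion a no-op.
type task struct {
	id      int
	reading map[int]*task
}

func main() {
	lt := &task{id: 1}
	t := &task{id: 2, reading: map[int]*task{}}

	// A single task can hold several keys (A and C here) that are all
	// just readers of the same earlier task; the map dedupes them.
	for _, key := range []string{"A", "C"} {
		t.reading[lt.id] = lt // second insert overwrites the first: a no-op
		fmt.Printf("after key %s: reading %d distinct task(s)\n", key, len(t.reading))
	}
	// Prints 1 both times: lt is recorded exactly once.
}
```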

// add the random keys to tx
for k := range randomConflictingKeys {
// randomly pick if conflict key is Read/Write
conflictMode := rand.Intn(2) //nolint:gosec
switch conflictMode {
patrick-ogrady (Contributor) commented:

You can just do switch rand.Intn(2) {
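
Applied to the snippet above, the suggestion folds the temporary into the switch (a runnable sketch; the printed Read/Write labels stand in for whatever the test actually assigns):

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	for k := 0; k < 5; k++ {
		// Fold the conflictMode temporary into the switch directly.
		switch rand.Intn(2) { //nolint:gosec
		case 0:
			fmt.Printf("conflict key %d: Read\n", k)
		case 1:
			fmt.Printf("conflict key %d: Write\n", k)
		}
	}
}
```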

wlawt merged commit fd2bdde into scheduler-v3 Apr 17, 2024
19 checks passed
wlawt deleted the scheduler-key-bug branch April 17, 2024 03:25
patrick-ogrady added a commit that referenced this pull request Apr 17, 2024
…ys (#814)

* [executor] refactor worker func
* [executor] renaming
* [executor] use blocking/dependency type from programmatic-deploy fork
* [executor] add key type support
* wip
* wip
* switch statements
* programmatic deploy fork + concurrent reads work
* sequential writes work
* w->r->r works
* r->r->w works
* w->r->r...w->r->r works
* r->r->w...r->r->w works
* all unit tests pass
* cleanup
* coverage 97%
* cleanup switch
* switch comments
* self review
* go mod tidy
* add done prints to tests
* rename isAllocateWrite
* remove id from task
* use dummy dep to keep track of all keys
* simplify for loop in adding bt dep
* fix integration bug and add unit test for it
* fix race condition with write-after-read(s) potentially
* run unit tests over multiple iterations
* cut down on unit testing time and add comments
* simplify num times we call add blocking if not exec
* new way to keep track of concurrent Reads
* self review
* have unit tests run under max time
* [executor] Simplify Logic + Speed Up Tests (#831)
* first pass
* need to set higher base dependencies
* remove extra logging from tests
* use right var
* tests passing
* cleanup task loop
* make tests faster
* ensure arr order matches
* add more comments
* tests are passing
* ensure we actually stop building early
* add comment about failure case
* add comment about deadlock
* better var names
* add larger unit tests
* ignore lint for rand
* add unique and conflicting keys randomly to txs
* fix for loops
* use max function
* make conflict keys 100 and pick 1-5
* make num slow chan consistent
* use set.Contains to speed up tests
* random perm for each unique key
* group var names
* use numTxs in generating blocking txs
* increase num conflict keys for concurrent Reads and Writes test
* [executor] multi-key conflict bug (#837)
* random perm per conflict key
* fix dont block on ourself and edit TestTwoConflictKeys to track this
* keep reading/readers relationship
* make reading a map
* comments
* clarify use of map for reading
* simplify rng for conflict and unique perm
* random perm per conflict key
* make maxDep a param
* add maxDep as const in chain
* placement of maxDep comment

---------

Co-authored-by: Patrick O'Grady <prohb125@gmail.com>