Atomic Compare-and-Swap with PrevNoExist: the operation failed, but the key was stored #5832

yorkart · 2016-07-01T03:28:05Z

etcd: v2.3.6

There is a demo at https://github.com/yorkart/etcd-demo
When atomic set and delete are invoked in a loop,
these errors appear after some time:

Set error (but the value of the key was actually stored):
set key error: 105: Key already exists (/demo/a) [1533028]
Delete error (but the key was in fact deleted):
delete key error: 100: Key not found (/demo/a) [1530634]

xiang90 · 2016-07-01T03:34:35Z

Can you format your code? Also it would be helpful if you can provide the full code block.

heyitsanthony · 2016-07-01T23:56:36Z

@yorkart I tried to reproduce this but no luck. What is opts? Are you sure that Delete isn't failing so that Set sometimes operates on a key that wasn't deleted?

yorkart · 2016-07-04T05:26:58Z

@xiang90 @heyitsanthony I have edited the issue and submitted the code. After running it a few times, you can see both kinds of errors above.

heyitsanthony · 2016-07-05T16:27:20Z

@yorkart I still can't reproduce this bug after looping a few hundred times with that code. What do you mean by "run a few times"? Do the errors show up immediately when the program starts or does it start giving errors in the middle of a run?

yorkart · 2016-07-06T02:24:05Z

It starts giving errors in the middle of a run.

In addition, the cluster log frequently shows:

failed to send out heartbeat on time (deadline exceeded for 646.014151ms)
server is likely overloaded

Heartbeat config:

ETCD_HEARTBEAT_INTERVAL=1000
ETCD_ELECTION_TIMEOUT=5000

My guess is that the timeout makes the result of the operation inconsistent: the cluster has processed the request, but a failure is returned to the client.

heyitsanthony · 2016-07-06T02:31:27Z

@yorkart that wouldn't lead to loss of consistency. Are the Set/Delete requests returning time-out errors? etcd shouldn't return success unless the request is committed. Can you please provide the full server log?

yorkart · 2016-07-06T06:38:25Z

@heyitsanthony No time-out errors. As in the demo code, set and delete are executed sequentially.
The only errors are "Key already exists" for set and "Key not found" for delete.

I have pushed the server logs:
etcd-144
etcd-147
etcd-148

heyitsanthony · 2016-07-06T13:39:54Z

@yorkart that is a very unhealthy cluster; it's doing a leader election several times a minute. I'll see if I can reproduce under similar conditions. Do you see the same behavior with 3.0?

Old behavior would retry set and delete even if there's an error. This can lead to the client returning an error for deleting twice, instead of returning an error for an indeterminate state. Fixes etcd-io#5832

yorkart · 2016-07-07T07:54:12Z

v3 is OK. With the same test logic (v3 demo), the only error is a timeout:
rpc error: code = 13 desc = etcdserver: request timed out, possibly due to previous leader failure

At the same time, the cluster logs:

raft.node: 7ab00e9f791aa00a lost leader 17047805852fad33 at term 12678
raft.node: 7ab00e9f791aa00a elected leader 85ed6313fba477e3 at term 12678

In addition, while the demo is running, the cluster constantly logs:

apply entries took too long [164.283544ms for 1 entries]
avoid queries with large range/delete range!

#5871 - I have the same problem when use v3 client

heyitsanthony · 2016-07-07T13:32:22Z

@yorkart your etcd cluster has very high latencies which is why it's triggering leader elections. You'll need to increase the ETCD_ELECTION_TIMEOUT and ETCD_HEARTBEAT_INTERVAL to stop the frequent leader elections. The apply entry warning is hardcoded to 10ms; we'll probably change that to a less aggressive value soon.

xiang90 assigned heyitsanthony Jul 5, 2016

heyitsanthony mentioned this issue Jul 7, 2016

client: make set/delete one shot operations #5888

Merged

heyitsanthony closed this as completed in #5888 Jul 7, 2016

philips unassigned heyitsanthony Aug 28, 2018