Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: Why STM did not support range get yet #7383

Closed
arthurkiller opened this issue Feb 27, 2017 · 22 comments
Closed

Question: Why STM did not support range get yet #7383

arthurkiller opened this issue Feb 27, 2017 · 22 comments

Comments

@arthurkiller
Copy link

I have read the STM code in etcdv3 , and gonna to use in my project.

But here is a big problem there: I need the range get in STM .

Why there is no range get in STM ? Can I add this feature with a pull request?

@arthurkiller
Copy link
Author

@xiang90

@arthurkiller arthurkiller changed the title Question: Why STM did not support range get Question: Why STM did not support range get yet Feb 27, 2017
@heyitsanthony
Copy link
Contributor

heyitsanthony commented Feb 27, 2017

@arthurkiller it's written that way because there's a potential for a Range to fetch a very large number of keys which can lead to a large number of false conflicts due to a huge readset and poor performance from very large transactional compare blocks. How do you intend to use STM?

@arthurkiller
Copy link
Author

arthurkiller commented Feb 28, 2017

@heyitsanthony thanks a lot ,dude. I need the STM to handle the potential conflicts while doing the update in a parallel way. We think this should let the user to decide if a large number false is acceptable.

I have also make up a STM-like facilities , it provide the isolation ability between the read committed & repeatable read Cause we really need the range get and doesn't care the failure much . In this situation etcd still get a good performance

Etcd can be used in many situations and the failure acceptable is one of them I think.

@heyitsanthony
Copy link
Contributor

@arthurkiller Would an extension to the STM interface that takes a range [begin, end) and returns a slice of key/value pairs like Range(begin, end string) [][2]string be enough?

@arthurkiller
Copy link
Author

@heyitsanthony Actually I only need the prefix option and don't care the implementation

@arthurkiller
Copy link
Author

@heyitsanthony would like a PR ? I can do this

@heyitsanthony
Copy link
Contributor

@arthurkiller sure

@arthurkiller
Copy link
Author

arthurkiller commented Feb 28, 2017

@heyitsanthony I want to use the AVL tree / dictionary tree instead of the map(hash map). Cause Tree can handle the prefix matching problem with less cost. How do you think about this? This may add another dependency in the package

@heyitsanthony
Copy link
Contributor

@arthurkiller there's already an interval tree in pkg/adt/interval_tree.go. I think that kind of approach will be very complicated though. An easier way: when calling Range(begin, end string) [][2]string, the STM runtime would cache the result for the begin/end pair and that's it.

One problem with Range and STM (and why I'm hesitant to diverge from the traditional single-key model): suppose a transaction T1 makes a range request [a, d), retrieving {a, b}. Concurrently, another transaction T2 inserts "c" and commits, giving {a, b, c}; there's now a read conflict on T1. How do you intend to resolve it?

@heyitsanthony heyitsanthony self-assigned this Feb 28, 2017
@arthurkiller
Copy link
Author

arthurkiller commented Mar 2, 2017

@heyitsanthony This makes me confused. Question is that Serilizable read should not allow phantom read, right?

@arthurkiller
Copy link
Author

@heyitsanthony You mean T1 do the read {a,b,c} but only get the {a,b} revison and T2 write the {c} then T1 should get a conflict ???

@arthurkiller
Copy link
Author

@heyitsanthony I see ... Seems snapshot can solve this?

@arthurkiller
Copy link
Author

arthurkiller commented Mar 2, 2017

@heyitsanthony Should serializable read add the revision(global) check when commit? That's not a good way , cause we get a global RWLock over etcd version.But it can avoid phantom read ,right ?

@heyitsanthony
Copy link
Contributor

Should serializable read add the revision(global) check when commit? That's not a good way , cause we get a global RWLock over etcd version.But it can avoid phantom read ,right ?

That would catch the read conflict, but would have so many false conflits it would be effectively unusable. Plus, that kind of test isn't supported in the transaction API and probably shouldn't be; if the txn model is going to be extended it should include range tests (e.g., Range(a, b).modrev < r).

Can you give a concrete example of how you're planning on using etcd / what your data model is going to be? Maybe STM isn't the best fit.

@arthurkiller
Copy link
Author

@heyitsanthony Sure. We make up a config manager service over etcd. And here is a problem. We need to identify&manage a kind of keys(this key is the config key ,they may have a common prefix) get same prefix.We also defined some status while operating these keys.And those operation should be concurrent safety

Now I do this with a additional key to hold the revision while do the transections to isolate the transections.And it really take effect.But till I read the STM ,I realized my isolation facilities is just between the read-committed & serializable-read.

Maybe here is a another way to handle this

@heyitsanthony
Copy link
Contributor

@arthurkiller was this the revision of the entire etcd store? that would have a lot of conflicts...

Maybe use a distributed lock? The one in clientv3/concurrency/mutex.go can guard concurrent transactions with IsOwner().

@arthurkiller
Copy link
Author

@heyitsanthony Yep, seems STM is not suit for this kind of problem.
So, what kind of problem is suit for STM??

@heyitsanthony
Copy link
Contributor

@arthurkiller it's suitable for problems which can be solved with ordinary software transactional memory. If the algorithm accesses keys like addresses in memory, it's OK; range queries are not part of that model.

@arthurkiller
Copy link
Author

@heyitsanthony Thx dude. I will think about this.

@heyitsanthony
Copy link
Contributor

Reopening since the txn API can do this as of #8025 and people still ask for this STM extension.

@halseth
Copy link

halseth commented Oct 9, 2019

Also interested in this :)

@stale
Copy link

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this as completed Apr 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants