This repository has been archived by the owner on Nov 12, 2019. It is now read-only.

What if Sets were Docs and Docs were Sets ? #23

Open
Floby opened this issue Jul 2, 2013 · 5 comments

Comments

@Floby

Floby commented Jul 2, 2013

I finally found time to give a go at the scuttlebutt family.

I'm still thinking through exactly what I'm going to build, but basically I'm going to use CRDT docs as sets of models (each song in my playlist is a row in a set).

I don't understand much about the science behind all this, but would it be doable to call createStream() on a set instead of the whole document? That would make Sets and Docs kind of the same thing. It would allow some cool use cases like

  • synchronize only a subset of a document with some peers
  • operations on set: intersect, join, difference, etc.
  • probably some extension of the first two...
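To make the first bullet concrete, here is a hypothetical sketch of what filtering a document down to one set might look like. scuttlebutt's createStream() today emits every update in the doc, so a per-set stream could be imitated by filtering updates on the way out. The `[rowId, changes]` update shape and the `type` field are assumptions for illustration, not the real wire format:

```javascript
// Hypothetical: keep only the updates belonging to one "set" of rows,
// identified here by a `type` field on the row's changes (as in the
// playlist example, where each song row would have type 'song').
function subsetOf(updates, type) {
  return updates.filter(function (update) {
    var changes = update[1]; // assumed shape: [rowId, changes]
    return changes && changes.type === type;
  });
}
```

A real implementation would need this filter inside the replication protocol itself, which is what the rest of the thread discusses.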

I see the whole scuttlebutt stuff has hardly been touched in months. Are you busy with some other mad science project @dominictarr ? :)

@dominictarr
Owner

Yes. I've been working on level-* stuff recently.
The scuttlebutt stuff hasn't changed because they are mostly finished!

If you want just a set, try r-array.

The idea with a scuttlebutt object is that it corresponds to the amount of data that you might store in one document in a document database. Storing many users' data in one document doesn't really make sense.

Also, because of the way the replication protocol works, it's either all or nothing - you can't replicate part of a dataset.
This is something I am thinking about, but I think you'd have to sort your data into "feeds", which is essentially the same as a separate scuttlebutt per feed/document.
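A rough sketch of that "feeds" idea: since replication is all-or-nothing per scuttlebutt, the data would have to be partitioned up front so each feed can become its own replicable document. The update shape `[rowId, changes, feedId]` is an assumption for illustration:

```javascript
// Hypothetical: sort a flat stream of updates into per-feed buckets, so
// each bucket could back a separate scuttlebutt and be replicated on its own.
function groupByFeed(updates) {
  var feeds = {};
  updates.forEach(function (u) {
    var id = u[2]; // assumed: third element names the feed/document
    (feeds[id] = feeds[id] || []).push(u);
  });
  return feeds;
}
```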

@mbrevoort
Collaborator

I too have been struggling to wrap my head around how to model similar use cases, where I presume I have a single replicated data set but want to replicate subsets of it to/from different clients. The distinction that a scuttlebutt is replicated in full, and that it's oriented around what would be appropriate for a typical document in a document-based key/value store, is a very good depiction of the constraint. It's also an interesting way to think about authorization: you either have read or write privilege on the entire scuttlebutt. This is similar to how the Google Drive realtime API works.

I started experimenting today with an approach that would have 100's of scuttlebutts from as many nodes replicated to a sort of aggregate node that would stream those updates over a mux/demux stream. I'll let you know how that goes.
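The mux/demux idea, reduced to its essence: many per-node scuttlebutts share one transport by framing each update with its feed id (the mux-demux library does this for real streams; this sketch shrinks it to plain functions, and the frame format is an assumption):

```javascript
// Hypothetical framing: tag each update with the feed it belongs to, so a
// single aggregate connection can carry hundreds of scuttlebutts' updates.
function mux(feedId, update) {
  return JSON.stringify({ feed: feedId, update: update });
}

// On the other end, route each frame to the handler for its feed,
// silently dropping frames for feeds this node doesn't subscribe to.
function demux(frame, handlers) {
  var msg = JSON.parse(frame);
  var handler = handlers[msg.feed];
  if (handler) handler(msg.update);
}
```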

@Floby
Author

Floby commented Jul 3, 2013

my use case is for synchronising an Ember.js model. the question arose when
I thought about making this a general solution.

@dominictarr
Owner

@mbrevoort how large are the datasets? Just today I've been working on a new replication algorithm that will be amenable to replicating overlapping subsets of data.

https://github.com/dominictarr/level-merkle

(I have only imagined the subsets feature currently, but it will work fine with the properties of merkle trees)

@mbrevoort
Collaborator

@dominictarr that looks interesting. In my current adventure the datasets are pretty small. In something I was previously considering they could be fairly large: hundreds of thousands or millions of rows, and rapidly changing. The filtered subsets would be 100-1000ish in size.
