This repository has been archived by the owner on Nov 12, 2019. It is now read-only.

What if Sets were Docs and Docs were Sets ? #23

Open
Floby opened this issue Jul 2, 2013 · 5 comments

Comments

@Floby

Floby commented Jul 2, 2013

I finally found time to give a go at the scuttlebutt family.

I'm still thinking through exactly what I'm going to build, but basically I'm going to use CRDT docs as sets of models (each song in my playlist is a row in a set).

I don't understand much about the science behind all this, but would it be doable to call createStream() on a set instead of the whole document? That would make Sets and Docs kind of the same thing. It would allow some cool use cases like

  • synchronize only a subset of a document with some peers
  • operations on set: intersect, join, difference, etc.
  • probably some extension of the first two...
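To make the first bullet concrete, here is a hypothetical sketch of what filtering a document down to one set might look like. scuttlebutt's createStream() today emits every update in the doc, so a per-set stream could be imitated by filtering updates on the way out. The `[rowId, changes]` update shape and the `type` field are assumptions for illustration, not the real wire format:

```javascript
// Hypothetical: keep only the updates belonging to one "set" of rows,
// identified here by a `type` field on the row's changes (as in the
// playlist example, where each song row would have type 'song').
function subsetOf(updates, type) {
  return updates.filter(function (update) {
    var changes = update[1]; // assumed shape: [rowId, changes]
    return changes && changes.type === type;
  });
}
```

A real implementation would need this filter inside the replication protocol itself, which is what the rest of the thread discusses.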

I see the whole scuttlebutt stuff has hardly been touched in months. Are you busy with some other mad science project @dominictarr ? :)

@dominictarr
Owner

Yes. I've been working on level-* stuff recently.
The scuttlebutt stuff hasn't changed because they are mostly finished!

If you want just a set, try r-array.

The idea with a scuttlebutt object is that it corresponds to the amount of data that you might store in one document in a document database. Storing many users' data in one document doesn't really make sense.

Also, because of the way the replication protocol works, it's either all or nothing - you can't replicate part of a dataset.
This is something I am thinking about, but I think you'd have to sort your data into "feeds", which is essentially the same as a separate scuttlebutt per feed/document.
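A rough sketch of that "feeds" idea: since replication is all-or-nothing per scuttlebutt, the data would have to be partitioned up front so each feed can become its own replicable document. The update shape `[rowId, changes, feedId]` is an assumption for illustration:

```javascript
// Hypothetical: sort a flat stream of updates into per-feed buckets, so
// each bucket could back a separate scuttlebutt and be replicated on its own.
function groupByFeed(updates) {
  var feeds = {};
  updates.forEach(function (u) {
    var id = u[2]; // assumed: third element names the feed/document
    (feeds[id] = feeds[id] || []).push(u);
  });
  return feeds;
}
```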

@mbrevoort
Collaborator

I too have been struggling to wrap my head around how to model similar use cases, where I presume I have a single replicated data set but want to replicate subsets of it to/from different clients. The distinction that a scuttlebutt is replicated in full, and that it's oriented around what would be appropriate for a typical document in a document-based key/value store, is a very good depiction of the constraint. It's also an interesting way to think about authorization: you either have read or write privilege on the entire scuttlebutt. This is similar to how the Google Drive realtime API works.

I started experimenting today with an approach that would have 100's of scuttlebutts from as many nodes replicated to a sort of aggregate node that would stream those updates over a mux/demux stream. I'll let you know how that goes.
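The mux/demux idea, reduced to its essence: many per-node scuttlebutts share one transport by framing each update with its feed id (the mux-demux library does this for real streams; this sketch shrinks it to plain functions, and the frame format is an assumption):

```javascript
// Hypothetical framing: tag each update with the feed it belongs to, so a
// single aggregate connection can carry hundreds of scuttlebutts' updates.
function mux(feedId, update) {
  return JSON.stringify({ feed: feedId, update: update });
}

// On the other end, route each frame to the handler for its feed,
// silently dropping frames for feeds this node doesn't subscribe to.
function demux(frame, handlers) {
  var msg = JSON.parse(frame);
  var handler = handlers[msg.feed];
  if (handler) handler(msg.update);
}
```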

@Floby
Author

Floby commented Jul 3, 2013

my use case is for synchronising an Ember.js model. the question arose when
I thought about making this a general solution.

@dominictarr
Owner

@mbrevoort how large are the datasets? Just today I've been working on a new replication algorithm that will be amenable to replicating overlapping subsets of data.

https://github.com/dominictarr/level-merkle

(I have only imagined the subsets feature currently, but it will work fine with the properties of merkle trees)

@mbrevoort
Collaborator

@dominictarr that looks interesting. In my current adventure the datasets are pretty small. In something I was previously considering they could be fairly large: hundreds of thousands or millions of rows, and rapidly changing. The filtered subsets would be 100-1000ish in size.
