-
Notifications
You must be signed in to change notification settings - Fork 1
/
index.html
238 lines (183 loc) · 11.9 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
<!DOCTYPE html>
<html>
<head>
<title>nookeeper - An open-source Zookeeper-style distributed coordination system with strong consistency guarantees (built in 48h at NKO 2012)
</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link href='http://fonts.googleapis.com/css?family=Gudea' rel='stylesheet' type='text/css'>
<link type="text/css" rel="stylesheet" href="http://mixu.net/assets/css/gh.css"/>
<style type="text/css">
#title h1 {
background-color: #69245C;
}
.ca-menu li:hover{
border-color: #69245C;
}
.ca-menu li:hover .ca-icon{
color: #69245C;
text-shadow: 0px 0px 1px #69245C;
}
.ca-menu li:hover .ca-main{
color: #69245C;
}
</style>
<link rel="stylesheet" href="http://mixu.net/assets/css/font-awesome.css">
<link type="text/css" rel="stylesheet" href="http://mixu.net/assets/css/prettify.css"/>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<script type="text/javascript" src="http://mixu.net/assets/js/prettify.js"></script>
</head>
<body onload="prettyPrint()">
<div id="wrapper">
<div id="header">
<div id="title">
<h1>nko 2012 - foojs</h1>
<h2>A distributed key-value store and a distributed coordination service</h2>
<iframe src="http://nodeknockout.com/iframe/foojs" frameborder=0 scrolling=no allowtransparency=true width=115 height=25>
</iframe>
</div>
<div id="navi">
<ul class="ca-menu">
<li>
<a href="http://mixu.net/">
<span class="ca-icon"><i class="icon-search"></i></span>
<div class="ca-content">
<h2 class="ca-main">Other stuff</h2>
<h3 class="ca-sub">Check out other projects by me</h3>
</div>
</a>
</li>
</ul>
</div>
</div>
<div class="clear"></div>
<div id="main">
<div id="content" class="post">
<!-- Content -->
<p>Hi there, I'm <a href="http://github.com/mixu">mixu</a>.</p>
<p>My Node Knockout project is a distributed key-value store (and a coordination service).</p>
<p>It's an eventually consistent system where servers are equal peers. Keys are mapped to the servers via a DHT. Reads and writes reach a sloppy quorum of the servers.</p>
<p>I've basically spent quite a while reading up on this stuff, and then started from scratch and wrote a working key-value store and a slighly broken coordination service.</p>
<h3>Two open source distributed systems: kv and nookeeper</h2>
<p>There are two open source projects resulting from this weekend:</p>
<ul class="list">
<li><a href="http://mixu.net/kv/">kv</a>: a distributed, eventually consistent key-value store, modeled after Amazon's Dynamo</li>
<li><a href="http://mixu.net/nookeeper/">nookeeper</a>: a distributed, strongly consistent coordination service, modeled after Apache Zookeeper</li>
</ul>
<p>If the links above don't work, it's because I haven't yet gotten the docs up on my own site. You can find the readmes below or <a href="https://github.com/nko3/foojs">on github</a></p>
<h3>Future npm modules produced as a side product</h3>
<p>I wrote following things in order to write kv and nookeeper. They will probably become npm modules for the following algorithms:</p>
<ul class="list">
<li><a href="https://github.com/nko3/foojs/tree/master/zab">Zookeeper atomic broadcast</a> (similar to Paxos, requires TCP, more strict ordering guarantees)</li>
<li><a href="https://github.com/nko3/foojs/tree/master/partitioning">Consistent hashing</a> with pluggable partioning schemes (basically, strategy 3 in the Amazon Dynamo paper)</li>
<li><a href="https://github.com/nko3/foojs/tree/master/sloppyquorum">Sloppy quorum implementation</a> (given a set of servers, perform a quorum write or read at a given minimum number of participants)</li>
<li><a href="https://github.com/nko3/foojs/tree/master/vectorclock">Vector clocks</a> (a simple library for sorting, comparing and resolving version conflicts automatically where possible using vector clocks)</li>
<li><a href="https://github.com/nko3/foojs/tree/master/failuredetect">Failure detector</a> (basically a tracker that receives events from multiple TCP connections and manages timeouts locally so that failed servers are detected - works with multiple connections)</li>
<li><a href="https://github.com/nko3/foojs/tree/master/sstable">Memtables</a> (and a slightly unfinished write-ahead log library) - for translating a set of operations on a hash into a series of append operations</li>
</ul>
<p>All this stuff is in one repository, since NKO participants were only given one repo. I'll split it apart and document it properly later on.</p>
<p><a href="https://github.com/nko3/foojs">Have a look at the repo on github</a> - I hope I will be able to make it public, otherwise it'll be on my github.</p>
<h2>kv - An open-source Dynamo-style distributed key-value store</h2>
<h2>Features</h2>
<ul class="list">
<li>Written at Node Knockout 2012 in 48h</li>
<li>Basically, a <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">Dynamo</a> clone, but written from scratch and in JS.</li>
<li>Stores keys and values replicated across a group of undifferentiated servers.</li>
<li>Automatic partitioning based on a <a href="http://en.wikipedia.org/wiki/Distributed_hash_table">distributed hash table</a> using <a href="http://en.wikipedia.org/wiki/Consistent_hashing">consistent hashing</a> (with adjustable data placement)</li>
<li>Repartitioning when a node fails or is added</li>
<li>Replication: data is replicated to a sloppy quorum of peers; e.g. the values are <a href="http://www.allthingsdistributed.com/2007/12/eventually_consistent.html">eventually consistent</a></li>
<li>Latency and availability (e.g. <a href="http://web.mit.edu/6.033/2005/wwwdocs/quorum_note.html">the sizes of the read and write quorums</a>) are adjustable per-key</li>
<li>Conflict resolution on read: <a href="http://en.wikipedia.org/wiki/Vector_clock">vector clocks</a> are used to resolve inconsistent reads if possible</li>
</ul>
<h2>Differences from Dynamo</h2>
<ul class="list">
<li>Doesn't use a gossip protocol for propagating information about group membership changes and node failures. Instead, a strongly consistent coordination service modeled after Zookeeper is used (also written at NKO - I call it Nookeeper for now).</li>
<li>Merkle tree based anti-entropy is not implemented</li>
<li>Hinted handoff is not implemented</li>
</ul>
<h2>Installing</h2>
<p>Will be on npm soonish.
</p>
<h2>Starting the server</h2>
<p>Starting the server: you need at least 3 instances of the server running since reads and writes go to multiple servers. Each server can be started with:
</p>
<pre class="prettyprint">node server.js id [host]:[port] [list-of-all-servers]</pre>
<p>For example
</p>
<pre class="prettyprint">node server.js 1 localhost:8000 "1|localhost:8000,2|localhost:8001,3|localhost:8002"
node server.js 2 localhost:8001 "1|localhost:8000,2|localhost:8001,3|localhost:8002"
node server.js 3 localhost:8002 "1|localhost:8000,2|localhost:8001,3|localhost:8002"</pre>
<p>The list of servers will go away once a rendezvous mechanism is provided from the Nookeeper servers.
</p>
<p>You can run <code>node start.js</code> to do this - this has the benefit that all the servers are killed when you kill the launcher.
</p>
<h2>Writing and reading keys and values</h2>
<p>You can write and read keys from the command line:
</p>
<pre class="prettyprint">node client.js write foo bar 3 "1|localhost:8000,2|localhost:8001,3|localhost:8002"</pre>
<p>writes the value <code>bar</code> the key <code>foo</code> to a quorum of 3 servers.
</p>
<p>To read:
</p>
<pre class="prettyprint">node client.js read foo 2 "1|localhost:8000,2|localhost:8001,3|localhost:8002"</pre>
<p>reads the key <code>foo</code> from a quorum of 2 servers.
</p>
<p>There is also a Node API, read the tests for some examples - better documentation coming later.
</p>
<h2>nookeeper - An open-source Zookeeper-style distributed coordination system with strong consistency guarantees</h2>
<h2>Features</h2>
<ul class="list">
<li>a centralized service for: maintaining configuration information, naming, providing distributed synchronization, and providing group services such as group membership and failure detection</li>
<li>simple, filesystem-like API based on files and directories (basically stolen from Zookeeper)</li>
<li>data is kept in memory (for fast operations) and also persisted into a write-ahead log</li>
<li>writes are replicated in a strongly consistent manner using the ZAB, the Zookeeper atomic broadcast protocol; a protocol similar to Paxos</li>
<li>sequential consistency: updates from a client will be applied in the order that they were sent</li>
<li>atomicity: updates either succeed or fail. No partial results.</li>
<li>reliability: once an update has been applied, it will persist from that time forward until a client overwrites it.</li>
<li>fast reads: reads occur on one server (with watches to notify of subsequent changes if necessary)</li>
</ul>
<h2>KV vs nookeeper</h2>
<p>I wrote two systems during the Node Knockout hackathon, here are the differences:
</p>
<ul class="list">
<li>nookeeper is strongly consistent; kv is eventually consistent</li>
<li>kv allows you to specify quorum sizes since inconsistent reads are allowed and reconciled using vector clocks. nookeeper always uses always uses a full quorum.</li>
<li>in a partition, nookeeper chooses consistency; kv chooses availability</li>
<li>kv can survive the loss of all server nodes up to the minimum quorum size for each operation (as long as the values have time to be replicated) because it uses sloppy quorums. nookeeper survives the loss of f nodes, where N > 2f, e.g. with N = 3 servers, f = 1; with N = 5, f = 2 and so on.</li>
</ul>
<h2>Installing</h2>
<p>Will be on npm soonish.
</p>
<h2>Starting the server</h2>
<p>Starting the server: you need at least 3 instances of the server running since reads and writes go to multiple servers. Each server can be started with:
</p>
<pre class="prettyprint">node server.js id [host]:[port] [list-of-all-servers]</pre>
<p>For example
</p>
<pre class="prettyprint">node server.js 1 localhost:9000 "1|localhost:9000,2|localhost:9001,3|localhost:9002"
node server.js 2 localhost:9001 "1|localhost:9000,2|localhost:9001,3|localhost:9002"
node server.js 3 localhost:9002 "1|localhost:9000,2|localhost:9001,3|localhost:9002"</pre>
<p>You can run <code>node start.js</code> to do this - this has the benefit that all the servers are killed when you kill the launcher.
</p>
<h2>Client API</h2>
<p>Nookeeper is based around persistent connections (for example, ephemeral nodes would not work since they are indicators for a connected client), so the client uses a REPL:
</p>
<pre class="prettyprint">node client.js localhost:9000</pre>
<p>REPL commands / client API:
</p>
<ul class="list">
<li><code>client.create(path, value, flags, callback)</code>: creates a node in the filesystem. The callback is <code>(err, name)</code> where name is the full name of the new node.</li>
<li><code>client.setData(path, value, version, callback)</code>: updates a node in the filesystem. The callback is <code>(err)</code>.</li>
<li><code>client.getData(path, watcher, callback)</code>: gets the data stored at a path. The watcher is a callback with no parameters, and gets triggered when a change occurs at the given path. The callback is <code>(err, data, stat)</code>, where data is the content, and stat is metadata about the node.</li>
<li><code>client.exists(path, watcher, callback)</code>: callback is called with <code>true</code> if the node exists and <code>false</code> if it does not.</li>
<li><code>client.remove(path, version, callback)</code>: removes the node at the path. The callback is <code>(err)</code>.</li>
</ul>
<!-- end content -->
</div>
</div>
<div class="clear">
</div>
<div id="footer">
</div>
</div>
</body>
</html>