forked from alinush/6.824-lecture-notes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
quiz1.html
466 lines (296 loc) · 16.6 KB
/
quiz1.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
<h1>Q1 2014</h1>
<h2>1. MapReduce</h2>
<h3><a href="qs/q1-2014/q14-1-1.png">Question 1</a></h3>
<p><strong>Answer:</strong> The Map jobs are ran in parallel and once they are done, the reduce jobs are ran in parallel as well.</p>
<p>You can imagine implementing the <code>wc</code> example by having the <code>Map</code> calls doing <code>Put()</code> and <code>Get()</code> calls to store and increment the counts for the words. <code>Increment(w)</code> would be <code>c = Get(w); Put(w, c+1)</code>. However, this will be bound to have race conditions when two <code>Map()</code> calls increment the count for the same word <code>w</code>:</p>
<pre><code>Map1: c1 = Get(w)
Map2: c1 = Get(w)
// note that they both get the same count
Map1: Put(w, c1+1)
Map2: Put(w, c1+1)
</code></pre>
<p>If the count were <code>c</code> before the two calls executed, then it will be <code>c+1</code> instead of <code>c+2</code> after they finished.</p>
<h2>2. Non-determ. and repl. state mach.</h2>
<h3><a href="qs/q1-2014/q14-2-2.png">Question 2</a></h3>
<p><strong>Answer:</strong> In my lab 3, whenever agreement is reached for a certain paxos instance, a Decided message is broadcast to everyone and waited upon for receipt confirmation by everyone. The timestamp can be included in this message. When to generate it? Before a Prepare call. If the proposer's value was the accepted value, then his timestamp is used. If it was another's proposer's value, then the timestamp included with that value should be used.</p>
<p>The problem with this is that you could have a Put(a, v2) decide at log entry i+1, before a Put(a, v1) had a chance to decide at log entry i, so then a later Put() would have an earlier timestamp than an earlier Put(), which would be incorrect</p>
<p>A different approach would be to start two agreements for a Put(), one with a timestamp and one for the actual Put(). The problem with this approach is the interleaving of these agreements for two Put() requests will mess things up?</p>
<p>It seems that the only way to get this to work is to make sure that everything is decided up to seq. i before you propose and accept the Put() at seq i+1, then you can use the Decided timestamping approach. </p>
<h2>Not quite replicated state machines</h2>
<h3><a href="qs/q1-2014/q14-3-3.pdf">Question 3</a></h3>
<p>Q: What does <code>PutHash(k, v)</code> do?
A: returns old <code>db[k]</code> and sets new <code>db[k] = hash(db[k] + v)</code>. Can be used to chain all values for a key together. Useful when testing apparently</p>
<p>Example:</p>
<pre><code>`Put(a, 'lol')`
`PutHash(a, 'sd') -> lol`
`Get(a)` -> hash('lol' + 'sd')
</code></pre>
<p>Answer:</p>
<pre><code> Primary Backup
c1 - PutHash(a,v1) -> |S1 |S2
| |db[a]=v0 |db[a]=v0
| | -- fwd op --> |
| | | execs op =>
| | |db[a]=h(v0|v1)
| | |dups[reqid]=v0
| | x-- op reply fails -- |
| | due to net fail |
| | |
\ |S1 |S2
| |db[a]=v0 |db[a]=h(v0|v1)
c2 - PutHash(a,v2) -> | |
| | -- fwd op --> | execs op =>
| | |db[a]=h(h(v0|v1)|v2)
| | |
| | <-- reply goes back to S1 |
| | and S1 replies to c2 |
| | |
/ |db[a]=h(h(v0|v1)|v2) |
|----------> |S1 |S2
| | fails!!
| no duplicate info at |
| S1 about c1's requst |
| so it'll be reeexecuted |
| |
| |
</code></pre>
<h2>Flat datacenter storage</h2>
<h3><a href="qs/q1-2014/q14-4-4.png">Question 4</a></h3>
<p><strong>Answer:</strong> This is silly: If Ben's design returns to the client after hearing from JUST one server, then the blob's size is NOT REALLY extended: i.e. there are still servers that don't know about that new size. So when the client will contact them with a write to tract n-1 just after ExtendBlobSize(n), a bunch of these writes will fail because the servers will return an out-of-bounds error.</p>
<p>Literally, ANY application that calls extend and then write will fail.</p>
<p><strong>Their answer:</strong> If two separate clients try to extend the size of the blob by 1 tract (paper doesn't make it clear if you specify new size or additional size), and they talk to different servers and both reply "Extended successfully" back, then when the two clients will write this newly added tract, they will overwrite each other's writes, instead of writing separately to tract <code>n+1</code> and <code>n+2</code> (assuming <code>n</code> was the size of the blob before extending).</p>
<h3><a href="qs/q1-2014/q14-4-5.png">Question 5</a></h3>
<p><strong>Answer:</strong> Seems like it must reply ABORT to a PREPARE request <code>r1</code> when there's another request <code>r0</code> that has successfully prepared.</p>
<h2>Paxos</h2>
<h3><a href="qs/q1-2014/q14-5-6.png">Question 6</a></h3>
<p><strong>Answer:</strong> This is wrong, because acceptors could accept a value that's never been proposed.</p>
<p>Diagram:</p>
<pre><code>px -- means server received prepare(n=1)
axvy -- means server received accept(n=x, v=y)
dx -- means server received decided(v=x)
S1: starts proposal loop w/ n = 1
S1: p1 | offiline for a while ...
S2: p1 | reboot => na = np = 1, va=nil | p2 | a1v<nil> | d<nil>
S3: p1 | reboot => na = np = 1, va=nil | p2 | a1v<nil> | d<nil>
</code></pre>
<p><strong>Their answer:</strong></p>
<pre><code>S1: p1 p2 a2v2
S2: p1 a1v1 p3 (np=3) reboot, restore(np=na=3) p4 (S2 replies w/ na=3,v=1) a4v1
S3: p1 p2 a2v2 p4 (S3 replies w/ na=2,v=2) a4v1
</code></pre>
<h3><a href="qs/q1-2014/q14-5-7.png">Question 7</a></h3>
<p><strong>Answer:</strong> Not linearizable because:</p>
<pre><code>C1 sends a Get() to S1
S1 starts Paxos for that Get() at seq 0
C2 sends a Put() to S2
S1 starts Paxos for that Put(a, "1") at seq 1
S1 gets consensus for the Put(a, "1") at seq 1
S1 returns success to to C2
C2 issues another Put(a, "2"), but it goes to S2
S2 starts Paxos for that Put(a, "2") at seq 0
S2 gets agreement for that Put(a, "2") at seq 0
it won over C1/S1's Get() at seq 0
Now the earlier Put() is at seq 1 and the later Put() is at seq 0
=> not linearizable
</code></pre>
<h3><a href="qs/q1-2014/q14-5-8.png">Question 8</a></h3>
<p><strong>Answer:</strong> No EPaxos this semester.</p>
<h2>Spanner</h2>
<h3><a href="qs/q1-2014/q14-6-9.png">Question 9</a></h3>
<p><strong>Answer:</strong> TODO for Quiz 2</p>
<h2>Harp</h2>
<h3><a href="qs/q1-2014/q14-7-10.png">Question 10</a></h3>
<p><strong>Answer:</strong></p>
<pre><code>A. True
B. False
C. True
D. True
E. False
</code></pre>
<h1>Q2 2014</h1>
<h2>Memory models</h2>
<h3>Question 1</h3>
<p><strong>Answer:</strong> world</p>
<h3>Question 2</h3>
<p><strong>Answer</strong>: </p>
<ul>
<li>hello</li>
<li>world</li>
<li>never ending loop because read to done doesn't have to observe the write in the thread</li>
</ul>
<h3>Question 3</h3>
<p><strong>Answer</strong>: Go's memory model is more similar to release consistency, whereas Ivy's is sequential consistency => if you don't use locks/channels in Go to synchronize reads/writes, then funny things will happen</p>
<h3>Question 4</h3>
<p><strong>Answer:</strong>
In TreadMarks, the program would loop infinitely, because there's no acquire on a lock for the <code>done</code> variable => no one can inform the <code>done</code> reader than done has been set to true.</p>
<h3>Question 5</h3>
<p><strong>Answer:</strong>
Adding a lock around the write and the read to <code>done</code> is enough. Or not clear... does causal consistency give you previous writes that did not contribute to the write to <code>done</code>? If it does not, then you also need to include the write to <code>a</code> under the lock.
For efficiency, something like a semaphore can be used to make sure the write happens before the read.</p>
<h2>Ficus</h2>
<h3>Question 6</h3>
<p><strong>Answer:</strong>
H4 will have the latest VT [1,1,0,0] so it will print 2 for both cat's</p>
<h3>Question 7</h3>
<p><strong>Answer:</strong>
[1,1,0,0]</p>
<h3>Question 8</h3>
<p><strong>Answer:</strong>
[1,0,0,0]</p>
<h3>Question 9</h3>
<p><strong>Answer:</strong></p>
<pre><code>h1: f=1, [1,0,0,0] ->h2 f=3 [2,0,0,0] ->h4
h2: [1,0,0,0] f=2 [1,1,0,0] ->h4
h3:
h4: [1,1,0,0] cat f [prints 2] conflict on f b.c. [2,0,0,0] and [1,1,0,0]
</code></pre>
<h3>Question 10 through 22</h3>
<p><strong>Answer:</strong> TODO: Next quiz!</p>
<h1>Q1 2013</h1>
<h2>Paxos</h2>
<h3>Question 1</h3>
<p>Why is decide phase in Paxos unnecessary?</p>
<p><strong>Answer:</strong></p>
<ul>
<li>Because any other Paxos peer can learn the agreed upon value by starting a proposal loop. He will Prepare(n) and get replies with na's and va's set that will contain the accepted value.</li>
<li>The agreement/consensus of paxos does NOT depend on the decided phase: it's a property of the whole system (i.e. have a majority of acceptors accepted the same value <code>v</code>?)
<ul>
<li>The agreement/consensus can be learned <em>without</em> doing a decided phase</li>
</ul></li>
</ul>
<h3>Question 2</h3>
<p><strong>Answer:</strong></p>
<pre><code>S1 starts proposal w/ n=1 and v=A
S2 starts proposal w/ n=2 and v=B
S1: p1 p2 rejected a1
S2: p1 p2 rejected a1
S3: p1 p2 rejected a1
S1 did not reach agreement after 1st round
S1 restarts proposal w/ n=3 and v=A
S1: p1 p2 rejected a1 | p3 rejected a2
S2: p1 p2 rejected a1 | p3 rejected a2
S3: p1 p2 rejected a1 | p3 rejected a2
S2 does not reach agreement after 1st round
</code></pre>
<h2>Flat datacenter storage</h2>
<h3>Question 3</h3>
<p><strong>Answer:</strong> If the failed tractserver is replaced with just ONE server from the empty pool that is strictly worse than replacing it with multiple already live servers, because it is faster to write to more servers at the same time than writing to a single server.</p>
<h2>Spanner</h2>
<h3>Question 4-6</h3>
<p>TODO: next quiz</p>
<h2>Distributed Shared Memory</h2>
<h3>Question 7</h3>
<p><strong>Answer:</strong> write diffs fixed the write amplification problem, where writing one byte on a page resulted in sending the whole page to a reader. Thus, since there's no such problem in byte-IVY (i.e. writing a byte means sending a byte to a reader), the answer is no.</p>
<h3>Question 8</h3>
<p><strong>Answer:</strong> LRC fixed the false sharing problem, where if two processes wrote different variables on the same page, that page would be bounced back and forth between the two processes, even though the two processes never needed to hear about each other's changes (they did not share those variables)</p>
<p>This question can be answered in two ways:</p>
<ol>
<li><p>If you reasonable assume that the smallest variable is a byte (which is true on today's computers), then when two processes write the same page they are writing the same variable and the sharing is <em>true</em>. There can't be any false sharing by "construction" so to speak. Thus, LRC is not needed.</p></li>
<li><p>If you consider variables that are smaller than a byte (which is cuckoo), then maybe there's an argument for how LRC can help with false sharing within a 1-byte page with 2 4-bit variables or with 4 2-bit variables, etc.</p></li>
</ol>
<h2>CBCAST</h2>
<p>Scenario:</p>
<pre><code>sX - sends X to all
rX - receives message X, does not deliver to app yet
dX - delivers X to app
M1: sX sZ
M2: rX dY sY
M3:
</code></pre>
<h3>Question 9</h3>
<pre><code>X -- [1,0,0]
Y -- [1,1,0]
Z -- [2,0,0] if M1 didn't get Y yet
[2,0,1] if M1 got Y
</code></pre>
<h3>Question 10</h3>
<p>Can M3 get msgs in the following orders?</p>
<pre><code>X Y Z - yes [0,0,0] -> [1,0,0] -> [1,1,0] -> [2,1,0]
X Z Y - yes [0,0,0] -> [1,0,0] -> [2,0,0] -> [2,1,0]
Y X Z - no [0,0,0] cannot go to [1,1,0] must wait for X
Y Z X - no same as before
Z X Y - nope
Z Y X - nope
</code></pre>
<h2>Lab 2</h2>
<pre><code>P:S0 B:S1
S1 has copy of S0 state
S0 and S1 have been processing requests
...
S0 fwds req to S1, but S0's rpclib returns an error indicating no reply from S1
S0 did not hear of any view change from viewserver
</code></pre>
<h3>Question 11</h3>
<p>Can S0 execute the request and return a success to client?</p>
<p><strong>Answer:</strong> That could be a recipe for disaster, because S1 could have failed and the viewserver could be in the middle of realizing this. If S0 executes the op and replies and if S1 really went down and did not have a chance to replicate, then we could create an inconsistent state. If S1 comes back quickly enough and the view remains the same, then S1 missed this op. Once S1 becomes primary, it will not have the op => inconsistency.</p>
<h3>Question 12</h3>
<p><strong>Answer:</strong> According to our lab specs, no. The client calls are NOT allowed to fail, so if S0 tells the client 'sorry boss, no can't do...' then the client will have to shrug its shoulders and pretend like the op succeeded: the spec doesn't allow it to fail.</p>
<p>In a realistic world, the client can return an error code to the caller and the caller can retry the client call. So S0 could return a failure to the client, like a timeout.</p>
<p><strong>Their answer:</strong> Their angle was that the backup could have executed the primary's replicate op, but the reply got lost. And so if the primary tells the client it has failed, that would be incorrect. And now the primary and the backup are not replicas anymore. I guess I didn't even look that far because I assumed informing the client of a failure would be unacceptable since he won't know how to handle it. However, the client could just retry again.</p>
<h3>Question 13</h3>
<p><strong>Answer:</strong> </p>
<pre><code>C1 sends op to S1
S1 starts agreement for the op at seq# i
S1 replies to C1 when it gets agreement for op at seq #i
Reply gets lost
C1 retries, sends op to S2
S2 starts agreement for the op at seq #j, j != i (because maybe S2 did not take part in S1's agreement at seq# i, maybe it was partitioned)
S2 gets agreement
S2 replies to C1
C1 gets reply
...
Now servers have the same op twice in the log at seq #i and #j
If this is an Append() op, then we create an inconsistent state
</code></pre>
<h1>Q2 2013</h1>
<h2>Lab 4</h2>
<h3>Question 1</h3>
<p>TODO for quiz 2</p>
<h2>Bayou</h2>
<h3>Question 2</h3>
<p>TODO for quiz 2</p>
<h3>Question 3</h3>
<p>TODO for quiz 2</p>
<h3>Question 4</h3>
<p>TODO for quiz 2</p>
<h3>Question</h3>
<h2>Dynamo</h2>
<h3>Question 5</h3>
<p>TODO for quiz 2</p>
<h2>Two phase commit or Paxos</h2>
<h3>Question 6</h3>
<p>TODO for quiz 2</p>
<h2>Argus</h2>
<h3>Question 7</h3>
<p>TODO for quiz 2</p>
<h2>MapReduce</h2>
<h3>Question 8</h3>
<p>TODO for quiz 2</p>
<h3>Question 9</h3>
<p>TODO for quiz 2</p>
<h2>Bitcoin</h2>
<h3>Question 10</h3>
<p>TODO for quiz 2</p>
<h2>Memcached at Facebook</h2>
<h3>Question 11</h3>
<p>TODO for quiz 2 </p>
<h3>Question 12</h3>
<p>TODO for quiz 2</p>
<h1>Q1 2012</h1>
<h2>Plan 9</h2>
<p>N/A</p>
<h2>Dinner</h2>
<p><strong>Answer:</strong> Both computers' client code lock <code>m</code> and the server code cannot respond because it cannot acquire the lock</p>
<h1>Q2 2012</h1>
<h2>Harp</h2>
<h2>Paxos</h2>
<h3>Question 9</h3>
<p><strong>Answer:</strong></p>
<pre><code>S1 proposes w/ n = 1, v = A
S2 proposes w/ n = 2, v = B
S3 proposes w/ n = 3, v = C
S1: p1 a1vA p3 a3vA
S2: p2 a2vB p3 a3vA
S3: p1 p2 a2vB p3 a3vA
consensus, but on different values
</code></pre>