<h1>Lab 3: Part A</h1>
<h2>Details</h2>
<ul>
<li>An application calls <code>Make(peers,me)</code> to create a Paxos peer.
<ul>
<li>The <code>peers</code> argument contains the ports of all the peers (including this one). </li>
<li>The <code>me</code> argument is the index of this peer in the peers array. </li>
</ul></li>
<li><code>Start(seq,v)</code> asks Paxos to start agreement on instance <code>seq</code>, with proposed value <code>v</code>;
<ul>
<li><code>Start()</code> should return immediately, without waiting for agreement to complete. </li>
<li>The application calls <code>Status(seq)</code> to find out whether the Paxos peer thinks the instance
has reached agreement,
<ul>
<li>and if so what the agreed value is. </li>
</ul></li>
</ul></li>
<li><code>Status()</code> should consult the local Paxos peer's state and return immediately;
<ul>
<li>it should not communicate with other peers. </li>
<li>the application may call <code>Status()</code> for old instances (but see the discussion of <code>Done()</code> below).</li>
</ul></li>
<li>Your implementation should be able to <strong>make progress on agreement for multiple instances at the same time</strong>.
<ul>
<li>if application peers call <code>Start()</code> with different sequence numbers at about the same time,
your implementation should run the Paxos protocol concurrently for all of them. </li>
<li>you should not wait for agreement to complete for instance <code>i</code> before starting the protocol
for instance <code>i+1</code>. Each instance should have its own separate execution of the Paxos protocol.</li>
</ul></li>
<li>A long-running Paxos-based server must forget about instances that are no longer needed,
<ul>
<li>free the memory storing information about those instances</li>
<li>an instance is needed if the application still wants to be able to call <code>Status()</code> for it
<ul>
<li>or if another Paxos peer may not yet have reached agreement on that instance</li>
</ul></li>
<li>when a particular peer application will no longer need to call <code>Status()</code> for any instance
<code><= x</code>, it should call <code>Done(x)</code>.
<ul>
<li>that Paxos peer can't yet discard the instances, because some other Paxos peer might not yet
have agreed to the instance. </li>
<li>so each Paxos peer should tell each other peer the highest <code>Done()</code> argument supplied by
its local application. </li>
<li>each Paxos peer will then have a <code>Done()</code> value from each other peer. </li>
<li>it should find the minimum, and discard all instances with sequence numbers <code><=</code> that
minimum. </li>
<li>The <code>Min()</code> method returns this minimum sequence number plus one.</li>
<li>it's OK for your Paxos to piggyback the <code>Done()</code> value in the agreement protocol packets; </li>
<li>that is, it's OK for peer P1 to only learn P2's latest <code>Done()</code> value the next time that
P2 sends an agreement message to P1. </li>
<li>If <code>Start()</code> is called with a sequence number less than <code>Min()</code>, the <code>Start()</code> call should
be ignored. </li>
<li>If <code>Status()</code> is called with a sequence number less than <code>Min()</code>, <code>Status()</code>
should return <code>Forgotten</code>.</li>
</ul></li>
</ul></li>
</ul>
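<p>For reference, here is the peer API described above collected in one place. This is a sketch only: the method set follows these notes (and the <code>paxos.Decided</code> value used in the polling loop later on), but the exact signatures in the skeleton may differ.</p>
<pre><code>package paxos

// Fate is what Status() reports for an instance: Pending until agreement,
// Decided afterwards, Forgotten once the instance falls below Min().
type Fate int

const (
    Pending Fate = iota
    Decided
    Forgotten
)

// Peer is the per-process object that Make(peers, me) returns.
type Peer interface {
    Start(seq int, v interface{})       // begin agreement on instance seq; returns immediately
    Status(seq int) (Fate, interface{}) // consults local state only; never contacts other peers
    Done(seq int)                       // the local application is finished with instances <= seq
    Min() int                           // 1 + the minimum Done() value across all peers
}
</code></pre>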
<h2>Plan</h2>
<p>Here's a reasonable plan of attack:</p>
<ol>
<li>Add elements to the Paxos struct in <code>paxos.go</code> to hold the state you'll need, according
to the lecture pseudo-code.
<ul>
<li>You'll need to define a struct to hold information about each agreement instance.</li>
</ul></li>
<li>Define RPC argument/reply type(s) for Paxos protocol messages, based on the lecture pseudo-code.
<ul>
<li>The RPCs must include the sequence number for the agreement instance to which they refer.</li>
<li>Remember the field names in the RPC structures must start with capital letters.</li>
</ul></li>
<li>Write a proposer function that drives the Paxos protocol for an instance, and RPC handlers that
implement acceptors.
<ul>
<li>Start a proposer function in its own thread for each instance, as needed (e.g. in <code>Start()</code>).</li>
</ul></li>
<li>At this point you should be able to pass the first few tests.</li>
<li>Now implement forgetting.</li>
</ol>
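<p>A possible shape for steps 1 and 2, following the lecture pseudo-code. The field names are illustrative assumptions, not the skeleton's:</p>
<pre><code>// Per-instance state held by each Paxos peer (step 1).
type instance struct {
    np      int         // n_p: highest prepare proposal number seen
    na      int         // n_a: highest proposal number accepted
    va      interface{} // v_a: value accepted along with na
    fate    Fate        // Pending until this peer learns the decision
    decided interface{} // the agreed value, once fate == Decided
}

// RPC types for the prepare phase (step 2); accept/decide look similar.
// Field names start with capital letters so net/rpc can marshal them,
// every message carries the instance's sequence number, and Done is
// piggybacked so peers learn each other's Done() values (see above).
type PrepareArgs struct {
    Seq  int // which agreement instance this refers to
    N    int // proposal number
    Done int // sender's highest local Done() argument
}

type PrepareReply struct {
    OK   bool
    Na   int         // acceptor's n_a
    Va   interface{} // acceptor's v_a
    Done int         // acceptor's highest local Done() argument
}
</code></pre>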
<h2>Hints</h2>
<p><strong>Done:</strong> more than one Paxos instance may be executing at a given time, and they may be Start()ed and/or decided out of order (e.g. seq 10 may be decided before seq 5).</p>
<ul>
<li>before acting on log entry #<code>i</code>, wait for log entries <code>< i</code> to be decided</li>
</ul>
<p><strong>Done:</strong> in order to pass tests assuming unreliable network, your paxos should call the local acceptor through a function call rather than RPC.</p>
<p><strong>Hint:</strong> remember that multiple application peers may call Start() on the same instance, perhaps with different proposed values. An application may even call Start() for an instance that has already been decided (maybe because it could race when issuing a NoOp for a hole in the log).</p>
<p><strong>Hint:</strong> think about how your paxos will forget (discard) information about old instances before you start writing code. Each Paxos peer will need to store instance information in some data structure that allows individual instance records to be deleted (so that the Go garbage collector can free / re-use the memory).</p>
<p><strong>Hint:</strong> you do not need to write code to handle the situation where a Paxos peer needs to re-start after a crash. If one of your Paxos peers crashes, it will never be re-started.</p>
<p><strong>Hint:</strong> have each Paxos peer start a thread per un-decided instance whose job is to eventually drive the instance to agreement, by acting as a proposer.</p>
<p><strong>Hint:</strong> a single Paxos peer may be acting simultaneously as acceptor and proposer for the same instance. Keep these two activities as separate as possible.</p>
<p><strong>Hint:</strong> a proposer needs a way to choose a higher proposal number than any seen so far. This is a reasonable exception to the rule that proposer and acceptor should be separate. It may also be useful for the propose RPC handler to return the highest known proposal number if it rejects an RPC, to help the caller pick a higher one next time. The px.me value will be different in each Paxos peer, so you can use px.me to help ensure that proposal numbers are unique.</p>
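<p>One common scheme for this (an assumption, not a requirement of the lab): only generate numbers congruent to <code>px.me</code> modulo the number of peers, so two peers can never pick the same proposal number.</p>
<pre><code>// nextProposal returns the smallest proposal number that is greater than
// nSeen (e.g. the highest number reported back by a rejecting acceptor)
// and unique to this peer: congruent to px.me mod len(px.peers).
func (px *Paxos) nextProposal(nSeen int) int {
    round := nSeen/len(px.peers) + 1
    return round*len(px.peers) + px.me
}
</code></pre>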
<p><strong>Hint:</strong> figure out the minimum number of messages Paxos should use when reaching agreement in non-failure cases and make your implementation use that minimum.</p>
<p><strong>Hint:</strong> the tester calls Kill() when it wants your Paxos to shut down; Kill() sets px.dead. You should check px.dead in any loops you have that might run for a while, and break out of the loop if px.dead is true. It's particularly important to do this in any long-running threads you create.</p>
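<p>Concretely, a proposer loop might look like the sketch below; it assumes <code>px.dead</code> is a plain bool readable here, though the skeleton may wrap it in an atomic or an <code>isdead()</code> helper.</p>
<pre><code>// propose drives instance seq toward agreement, giving up once Kill()
// has set px.dead.
func (px *Paxos) propose(seq int, v interface{}) {
    for !px.dead {
        if fate, _ := px.Status(seq); fate == Decided {
            return
        }
        // ... run one round of prepare / accept / decide ...
        time.Sleep(10 * time.Millisecond) // back off before retrying
    }
}
</code></pre>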
<h1>Lab 3: Part B</h1>
<h2>Notes</h2>
<ul>
<li>kvpaxos replicas should stay identical;
<ul>
<li>the only exception is that some replicas may lag others if they are
not reachable. </li>
</ul></li>
<li>if a replica isn't reachable for a while, but then becomes reachable again, it
should eventually catch up (learn about operations that it missed)</li>
<li>kvpaxos client code <strong>should try different replicas it knows about</strong> until one responds (see the Clerk sketch after this list)</li>
<li><strong>kvpaxos replica that is part of a majority</strong> which can reach each other
<em>should be able to serve client requests</em></li>
<li>kvpaxos must <strong>provide sequential consistency</strong> to applications that use its client interface.
<ul>
<li>completed application calls to the <code>Clerk.Get()</code>, <code>Clerk.Put()</code>, and <code>Clerk.Append()</code>
methods in <code>kvpaxos/client.go</code> must appear to have affected all replicas in the same
order and have <strong>at-most-once semantics</strong></li>
<li>A <code>Clerk.Get()</code> should see the value written by the most recent <code>Clerk.Put()</code> or
<code>Clerk.Append()</code> (in that order) to the same key. </li>
<li>One consequence of this is that each application call to <code>Clerk.Put()</code> or
<code>Clerk.Append()</code> must appear in that order just once (i.e., write the
key/value database just once), even though internally your <code>client.go</code>
may have to send RPCs multiple times until it finds a kvpaxos server replica that replies.</li>
</ul></li>
</ul>
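<p>A minimal sketch of the Clerk retry loop mentioned above, assuming the <code>call()</code> RPC helper from the earlier labs and made-up argument/reply types; the XID is what lets servers detect the duplicate if a timed-out request is re-sent elsewhere:</p>
<pre><code>// Get keeps trying replicas until one of them answers.
func (ck *Clerk) Get(key string) string {
    args := GetArgs{Key: key, CID: ck.id, XID: ck.nextXID()}
    for i := 0; ; i = (i + 1) % len(ck.servers) {
        var reply GetReply
        if call(ck.servers[i], "KVPaxos.Get", &args, &reply) && reply.Err == OK {
            return reply.Value
        }
        time.Sleep(100 * time.Millisecond) // unreachable or not in a majority; try the next replica
    }
}
</code></pre>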
<h2>Plan</h2>
<p>Here's a reasonable plan:</p>
<ol>
<li>Fill in the <code>Op</code> struct in <code>server.go</code> with the "value" information that kvpaxos will
use Paxos to agree on, for each client request.
<ul>
<li><code>Op</code> field names must start with capital letters. </li>
<li>You should use <code>Op</code> structs as the agreed-on values
<ul>
<li>for example, you should pass <code>Op</code> structs to <code>Paxos.Start()</code>.</li>
<li>Go's RPC can marshal/unmarshal <code>Op</code> structs</li>
<li>the call to <code>gob.Register()</code> in <code>StartServer()</code> teaches it how.</li>
</ul></li>
</ul></li>
<li>Implement the <code>PutAppend()</code> handler in <code>server.go</code>.
<ul>
<li>it should enter a <code>Put</code> or <code>Append</code> <code>Op</code> in the Paxos log: use Paxos to allocate a
Paxos instance whose value includes the key and value, so that other kvpaxos replicas learn
about the <code>Put()</code> or <code>Append()</code>. </li>
<li>An <code>Append</code> Paxos log entry should contain the <code>Append</code>'s
arguments, but not the resulting value, since the result might be large.</li>
</ul></li>
<li>Implement a <code>Get()</code> handler.
<ul>
<li>It should enter a <code>Get</code> <code>Op</code> in the Paxos log, and then "interpret"
the log before that point to make sure its key/value database reflects all recent <code>Put()</code>s.</li>
</ul></li>
<li>Add code to cope with duplicate client requests
<ul>
<li>including situations where:
<ul>
<li>the client sends a request to one kvpaxos replica, </li>
<li>client times out waiting for a reply, </li>
<li>client re-sends the request to a different replica. </li>
</ul></li>
<li>the client request should execute just once. </li>
<li>make sure that your scheme for duplicate detection <strong>frees server memory quickly</strong>
<ul>
<li>for example, by having the client tell the servers which RPCs it has heard a reply for</li>
<li>it's OK to piggyback this information on the next client request</li>
</ul></li>
</ul></li>
</ol>
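<p>A possible <code>Op</code> shape for step 1 above; the field names are guesses, not the skeleton's. All fields are exported so gob can marshal the struct when it travels inside an <code>interface{}</code>:</p>
<pre><code>type Op struct {
    Kind  string // "Get", "Put", "Append", or "NoOp"
    Key   string
    Value string // Append's argument, not the resulting value (see step 2)
    CID   int64  // client ID, for duplicate detection (step 4)
    XID   int64  // per-client request ID, for duplicate detection (step 4)
}

// StartServer() already calls gob.Register(Op{}), which is what lets
// these structs be encoded when passed to px.Start() as interface{}.
</code></pre>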
<h2>Hints</h2>
<p><strong>Hint:</strong> your server should try to assign the next available Paxos instance (sequence number) to each incoming client RPC. However, some other kvpaxos replica may also be trying to use that instance for a different client's operation. So the kvpaxos server has to be prepared to try different instances.</p>
<p><strong>Hint:</strong> your kvpaxos servers should not directly communicate; they should only interact with each other through the Paxos log.</p>
<p><strong>Hint:</strong> as in Lab 2, you will need to uniquely identify client operations to ensure that they execute just once. Also as in Lab 2, you can assume that each clerk has only one outstanding <code>Put</code>, <code>Get</code>, or <code>Append</code>.</p>
<p><strong>Hint:</strong> a kvpaxos server should not complete a <code>Get()</code> RPC if it is not part of a majority (so that it does not serve stale data). This means that each <code>Get()</code> (as well as each <code>Put()</code> and <code>Append()</code>) must involve Paxos agreement.</p>
<p><strong>Hint:</strong> don't forget to call the Paxos <code>Done()</code> method when a kvpaxos has processed an instance and will no longer need it or any previous instance.</p>
<p><strong>Hint:</strong> your code will need to wait for Paxos instances to complete agreement. The only way to do this is to periodically call <code>Status()</code>, sleeping between calls. How long to sleep? A good plan is to check quickly at first, and then more slowly:</p>
<pre><code>to := 10 * time.Millisecond
for {
    status, _ := kv.px.Status(seq)
    if status == paxos.Decided {
        // ...
        return
    }
    time.Sleep(to)
    if to < 10*time.Second {
        to *= 2
    }
}
</code></pre>
<p><strong>Hint:</strong> if one of your kvpaxos servers falls behind (i.e. did not participate in the agreement for some instance), it will later need to find out what (if anything) was agreed to. A reasonable way to do this is to call <code>Start()</code>, which will either discover the previously agreed-to value, or cause agreement to happen. Think about what value would be reasonable to pass to <code>Start()</code> in this situation.</p>
<ul>
<li>So we would need a <code>NoOp</code> value for this, in case the fallen-behind node that wanted to discover the previously agreed-to value actually succeeds in proposing its own.</li>
</ul>
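<p>A sketch of that catch-up path, with assumed helper names (<code>waitDecided</code> polls <code>Status()</code> as in the sleep loop above):</p>
<pre><code>// catchUp fills a hole at seq: if something was already agreed there,
// Start() discovers it; otherwise our NoOp may harmlessly win the slot.
func (kv *KVPaxos) catchUp(seq int) Op {
    kv.px.Start(seq, Op{Kind: "NoOp"})
    return kv.waitDecided(seq) // either the prior agreement or the NoOp
}
</code></pre>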
<p><strong>[DONE] Hint:</strong> When a test fails, check the log for a gob error (e.g. "rpc: writing response: gob: type not registered for interface ..."), because Go doesn't consider the error fatal, although it is fatal for the lab.</p>
<h2>Algorithm</h2>
<p>Each KVP server has a copy of the DB and of the log</p>
<p><strong>Note:</strong> We will <em>not</em> reimplement a log, because we should not have to!
The log is already implicitly implemented in the <code>paxos</code> library from Part A. And a server will know how much of the log it can apply to its local DB by calling <code>Min()</code> on its Paxos peer.</p>
<p><strong>Q:</strong> What's the high-level algorithm? <br />
<strong>A:</strong> A sketch:</p>
<ul>
<li>Client sends a request to one of the Paxos servers</li>
<li>Each server has a <code>nextSeq</code> number that it <em>thinks</em> the next request should get
<ul>
<li>it's possible that servers are out of date on this number due to partitioning, delays, etc.
<ul>
<li>S1 and S2 both start with seq #0</li>
<li>S1 gets request from C1</li>
<li>S2 gets request from C2</li>
<li>S1 and S2 will both propose the same seq. number</li>
</ul></li>
</ul></li>
<li>Server <code>S</code> creates an <code>Op</code> <code>op</code> from the request and calls <code>Paxos.Start(nextSeq, op)</code> to get other
peers to agree on this op
<ul>
<li>problem is they might not agree because they are also trying to get their own <code>Op</code> to be agreed
upon</li>
</ul></li>
<li>If all (majority) agree, that's fine and dandy
<ul>
<li><strong>TODO:</strong> Note that once a majority agrees there could still be a minority
that needs to find out. How will they do it?
<ul>
<li>The proposer <code>S</code> that got the majority to agree will not finish its <code>Start()</code> loop
until everyone receives the <code>Decided</code> message, so it seems that we do not need
to do anything?</li>
<li>But what if that proposer <code>S</code> is down and there is still a majority up that can
inform the minority about the agreement?
<ul>
<li>seems like the minority would need to ask about the status</li>
</ul></li>
</ul></li>
</ul></li>
<li>If they do not agree with <code>S</code>'s value (i.e. <code>S</code>'s proposal failed), then <code>S</code> needs to
retry with a different <code>nextSeq</code> number.
<ul>
<li>at this point <code>S</code> can definitely increment its <code>nextSeq</code> number because it knows
a value was agreed on </li>
<li>what <code>nextSeq</code> number should <code>S</code> pick if its proposal fails?
<ul>
<li>seems like it should just be the next one?</li>
</ul></li>
</ul></li>
</ul>
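<p>A minimal sketch of the retry loop described above, with assumed names (<code>nextSeq</code>, <code>waitDecided</code>):</p>
<pre><code>// agree proposes op at increasing sequence numbers until op itself is
// the value decided for some slot, and returns that slot number.
func (kv *KVPaxos) agree(op Op) int {
    for {
        seq := kv.nextSeq
        kv.px.Start(seq, op)
        decided := kv.waitDecided(seq)
        kv.nextSeq++ // something was decided for seq, whether ours or not
        if decided == op {
            return seq // our Op owns this slot
        }
        // another replica's Op won this slot; retry at the next one
    }
}
</code></pre>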
<p><strong>Q:</strong> Do we need to apply the log to the DB? <br />
<strong>A:</strong> Yes, from the memory test cases and from the Piazza answers.</p>
<p><strong>Q:</strong> When do we apply the log to the DB? <br />
<strong>A:</strong> Seems like we can call <code>Paxos.Min()</code> to see which log entries everyone has agreed on <strong>and actually has</strong>, and apply those to our DB?
<ul>
<li>however, we have to keep the Paxos instance/sequence counter monotonic even after applying
those entries, so we don't end up handling older Paxos agreements under the same seq. number</li>
</ul>
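<p>For example, an apply loop along these lines (assumed names; note the counter only ever moves forward, per the caution above):</p>
<pre><code>// applyDecided interprets decided log entries in order, updates the
// local DB, and tells Paxos they can be forgotten.
func (kv *KVPaxos) applyDecided() {
    for {
        fate, v := kv.px.Status(kv.nextApply)
        if fate != paxos.Decided {
            return // a hole or still-pending instance; stop here for now
        }
        kv.apply(v.(Op))         // write Put/Append into the local DB
        kv.px.Done(kv.nextApply) // this replica no longer needs the entry
        kv.nextApply++           // stays monotonic even after entries are discarded
    }
}
</code></pre>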
<p><strong>Q:</strong> Duplicate detection? <br />
<strong>A:</strong> Only one outstanding client call <code>=></code> next client call can refer to previous call's XID and tell the server to get rid of it?</p>
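<p>Since each clerk has at most one outstanding call, a per-client record of the last completed request is enough; a sketch with assumed names:</p>
<pre><code>// lastReply remembers, per client, the newest XID applied and its result.
// A re-sent request with the same XID gets the saved reply instead of
// executing again; once a newer XID arrives, the old record is overwritten,
// which frees server memory quickly.
type lastReply struct {
    XID   int64
    Value string // saved Get result (empty for Put/Append)
}

// in the KVPaxos struct: seen map[int64]lastReply // client ID -> last completed op
</code></pre>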