Leaders

Leaders use $Commander$ and $Scout$ subprocesses to run the Synod protocol.

Leader Invariants

As we shall see, the following invariants hold in the Synod protocol:

  • L1: For any ballot $b$ and slot $s$, at most one command $c$ is selected and at most one commander for $\langle b, s, c \rangle$ is spawned.
  • L2: Suppose that for each $\alpha$ among a majority of acceptors $\langle b, s, c \rangle \in \alpha.accepted$. If $b' > b$ and a commander is spawned for $\langle b', s, c' \rangle$, then $c = c'$.

Invariant L1 implies Invariant A4, because by L1 all acceptors that accept a pvalue for a particular ballot and slot number received the pvalue from the same commander. Similarly, Invariant L2 implies Invariant A5.

Commander

A leader may work on multiple slots at the same time. For each such slot, the leader selects a command and spawns a new process that we call a commander. While we present it as a separate process, the commander is really just a thread running within the leader. The commander runs what is known as $phase$ $2$ of the Synod protocol.

Below you can find the pseudo-code for a Commander:

$\texttt{process} ~ \textit{Commander}(\lambda, \textit{acceptors}, \textit{replicas}, \langle b, s, c \rangle)$
  $\texttt{var} ~ \textit{waitfor} := \textit{acceptors}$;

  $\forall \alpha \in \textit{acceptors}: \textit{send}(\alpha, \langle \textbf{p2a}, \textit{self}(), \langle b, s, c \rangle \rangle)$;
  $\texttt{for ever}$
    $\texttt{switch} ~ \textit{receive}()$
      $\texttt{case} ~ \langle \textbf{p2b}, \alpha, b' \rangle:$
        $\texttt{if} ~ b' = b ~ \texttt{then}$
          $\textit{waitfor} := \textit{waitfor} - \{ \alpha \}$;
          $\texttt{if} ~ |\textit{waitfor}| < |\textit{acceptors}| / 2 ~ \texttt{then}$
            $\forall \rho \in \textit{replicas}:$
              $\textit{send}(\rho, \langle \textbf{decision}, s, c \rangle$);
            $\textit{exit}()$;
          $\texttt{end if}$
        $\texttt{else}$
          $\textit{send}(\lambda, \langle \textbf{preempted}, b' \rangle)$;
          $\textit{exit}()$;
        $\texttt{end if}$
      $\texttt{end case}$
    $\texttt{end switch}$
  $\texttt{end for}$
$\texttt{end process}$

A commander sends a $\langle \textbf{p2a}, \lambda, \langle b, s, c \rangle \rangle$ message to all acceptors, and waits for responses of the form $\langle \textbf{p2b}, \alpha, b' \rangle$. In each such response $b' \ge b$ will hold. There are two cases:

  • If a commander receives $\langle \textbf{p2b}, \alpha, b \rangle$ from all acceptors in a majority of acceptors, then the commander learns that command $c$ has been chosen for slot $s$. In this case, the commander notifies all replicas and exits. To satisfy Invariant R1, we need to enforce that if a commander learns that $c$ is chosen for slot $s$, and another commander learns that $c'$ is chosen for the same slot $s$, then $c = c'$. This is a consequence of Invariant A5: if a majority of acceptors accept $\langle b, s, c \rangle$, then for any later ballot $b'$ and the same slot number $s$, acceptors can only accept $\langle b', s, c \rangle$. Thus if the commander of $\langle b', s, c' \rangle$ learns that $c'$ has been chosen for $s$, it is guaranteed that $c = c'$ and no inconsistency occurs, assuming---of course---that Invariant L2 holds.
  • If a commander receives $\langle \textbf{p2b}, \alpha', b' \rangle$ from some acceptor $\alpha'$, with $b' \ne b$, then it learns that a ballot $b'$, which must be larger than $b$ as guaranteed by acceptors, is active. This means that ballot $b$ may no longer be able to make progress, as there may no longer exist a majority of acceptors that can accept $\langle b, s, c \rangle$. In this case, the commander notifies its leader about the existence of $b'$, and exits.

Under the assumptions that at most a minority of acceptors can crash, that messages are delivered reliably, and that the commander does not crash, the commander will eventually do one or the other.

The leader must enforce Invariants L1 and L2. Because there is only one leader per ballot, Invariant L1 is trivial to enforce by the leader not spawning more than one commander per ballot number and slot number. To enforce Invariant L2, the leader runs what is often called $phase$ $1$ of the Synod protocol or a $view$ $change$ protocol for some ballot before spawning commanders for that ballot. The leader spawns a $scout$ thread to run the view change protocol for some ballot $b$. A leader starts at most one of these for any ballot $b$, and only for its own ballots.

Scout

Below you can find the pseudo-code for a scout. The code is similar to that of a commander, except that it sends and receives phase 1 messages instead of phase 2 messages.

$\texttt{process} ~ \textit{Scout}(\lambda, \textit{acceptors}, b)$
  $\texttt{var} ~ \textit{waitfor} := \textit{acceptors}, ~ pvalues := \emptyset$;

  $\forall \alpha \in \textit{acceptors}: \textit{send}(\alpha,\langle \textbf{p1a}, self(), b \rangle)$;
  $\texttt{for ever}$
    $\texttt{switch} ~ \textit{receive}()$
      $\texttt{case} ~ \langle \textbf{p1b}, \alpha, b', r \rangle:$
        $\texttt{if} ~ b' = b ~ \texttt{then}$
          $pvalues := pvalues \cup r$;
          $\textit{waitfor} := \textit{waitfor} - \{ \alpha \}$;
          $\texttt{if} ~ |\textit{waitfor}| < |\textit{acceptors}| / 2 ~ \texttt{then}$
            $\textit{send}(\lambda, \langle \textbf{adopted}, b, pvalues \rangle)$;
            $\textit{exit}()$;
          $\texttt{end if}$
        $\texttt{else}$
          $\textit{send}(\lambda, \langle \textbf{preempted}, b' \rangle)$;
          $\textit{exit}()$;
        $\texttt{end if}$
      $\texttt{end case}$
    $\texttt{end switch}$
  $\texttt{end for}$
$\texttt{end process}$

A scout completes successfully when it has collected $\langle \textbf{p1b}, \alpha, b, r_\alpha \rangle$ messages from all acceptors in a majority, and returns an $\langle \textbf{adopted}, b, \bigcup r_\alpha \rangle$ message to its leader $\lambda$. As we will see later, the leader uses $\bigcup r_\alpha$, the union of all pvalues accepted by this majority of acceptors, in order to enforce Invariant L2.

Leader State

Leader $\lambda$ maintains three state variables:

  • $\lambda.ballot\_num$: a monotonically increasing ballot number, initially $(0, \lambda)$.
  • $\lambda.\textit{active}$: a boolean flag, initially $\texttt{false}$.
  • $\lambda.\textit{proposals}$: a map of slot numbers to proposed commands in the form of a set of $\langle \textit{slot number}, \textit{command} \,\rangle$ pairs, initially empty. At any time, there is at most one entry per slot number in the set.
Leader Operational Description

Below you can find the pseudo-code for a Leader:

$\texttt{process} ~ \textit{Leader}(\textit{acceptors}, \textit{replicas})$
  $\texttt{var} ~ ballot\_num = (0, self()), \textit{active} = \texttt{false},\textit{proposals} = \emptyset$;

  $\textit{spawn}(\textit{Scout}(self(), \textit{acceptors}, ballot\_num))$;
  $\texttt{for ever}$
    $\texttt{switch} ~ \textit{receive}()$
      $\texttt{case} ~ \langle \textbf{propose}, s, c \rangle:$
        $\texttt{if} ~\not\exists c' : \langle s, c' \rangle \in \textit{proposals} ~ \texttt{then}$
          $\textit{proposals} := \textit{proposals} \cup \{ \langle s, c \rangle \}$;
          $\texttt{if} ~ \textit{active} ~ \texttt{then}$
            $\textit{spawn}(\textit{Commander}(self(), \textit{acceptors}, \textit{replicas}, \langle ballot\_num, s, c \rangle)$;
          $\texttt{end if}$
        $\texttt{end if}$
      $\texttt{end case}$
      $\texttt{case} ~ \langle \textbf{adopted}, ballot\_num, \textit{pvals} \rangle:$
        $\textit{proposals} := \textit{proposals} \lhd \textit{pmax}(\textit{pvals})$;
        $\forall \langle s, c \rangle \in \textit{proposals}:$
          $\textit{spawn}(\textit{Commander}(self(), \textit{acceptors}, \textit{replicas}, \langle ballot\_num, s, c \rangle)$;
        $\textit{active} := \texttt{true}$;
      $\texttt{end case}$
      $\texttt{case} ~ \langle \textbf{preempted}, \langle r', {\lambda'} \rangle \rangle:$
        $\texttt{if} ~ (r', {\lambda'}) > ballot\_num ~ \texttt{then}$
          $\textit{active} := \texttt{false}$;
          $ballot\_num := (r' + 1, self())$;
          $\textit{spawn}(\textit{Scout}(self(), \textit{acceptors}, ballot\_num))$;
        $\texttt{end if}$
      $\texttt{end case}$
    $\texttt{end switch}$
  $\texttt{end for}$
$\texttt{end process}$

The leader starts by spawning a scout for its initial ballot number, and then enters into a loop awaiting messages. There are three types of messages that cause transitions:

  • $\langle \textbf{propose}, s, c \rangle$: A replica proposes command $c$ for slot number $s$.
  • $\langle \textbf{adopted}, ballot\_num, \textit{pvals} \rangle$: Sent by a scout, this message signifies that the current ballot number $ballot\_num$ has been adopted by a majority of acceptors. If an \textbf{adopted} message arrives for an old ballot number, it is ignored. The set $\textit{pvals}$ contains all pvalues accepted by these acceptors prior to $ballot\_num$.
  • $\langle \textbf{preempted}, \langle r', {\lambda'} \rangle \rangle$: Sent by either a scout or a commander, it means that some acceptor has adopted $\langle r', {\lambda'} \rangle$. If $\langle r', {\lambda'} \rangle > ballot\_num$, it may no longer be possible to use ballot $ballot\_num$ to choose a command.

A leader goes between $passive$ and $active$ modes. When passive, the leader is waiting for an $\langle \textbf{adopted}, ballot\_num, \textit{pvals} \rangle$ message from the last scout that it spawned. When this message arrives, the leader becomes active and spawns commanders for each of the slots for which it has a proposed command, but must select commands that satisfy Invariant L2. We will now consider how the leader goes about this.

When active, the leader knows that a majority of acceptors, say $\cal A$, have adopted $ballot\_num$ and thus no longer accept pvalues for ballot numbers less than $ballot\_num$, because of Invariants A1 and A2. In addition, it has all pvalues accepted by the acceptors in $\cal A$ prior to $ballot\_num$. The leader uses these pvalues to update its own proposals variable. There are two cases to consider:

  • If, for some slot $s$, there is no pvalue in $\textit{pvals}$, then, prior to $ballot\_num$, it is not possible that any pvalue has been chosen or will be chosen for slot $s$. After all, suppose that some pvalue $\langle b, s, c \rangle$ were chosen, with $b < ballot\_num$. This would require a majority of acceptors ${\cal A}'$ to accept $\langle b, s, c \rangle$, but we have responses from a majority ${\cal A}$ that have adopted $ballot\_num$ and have not accepted, nor can accept, pvalues with a ballot number smaller than $ballot\_num$, by Invariants A1 and A2. Because both ${\cal A}$ and ${\cal A}'$ are majorities, ${\cal A} \cap {\cal A}'$ is non-empty---some acceptor in the intersection must have violated Invariant A1, A2, or A3, which we assume cannot happen. Because no pvalue has been or will be chosen for slot $s$ prior to $ballot\_num$, the leader can propose any command for that slot without causing a conflict on an earlier ballot, thus enforcing Invariant L2.
  • Otherwise, let $\langle b, s, c \rangle$ be the pvalue with the maximum ballot number for slot $s$. Because of Invariant A4, this pvalue is unique---there cannot be two different commands for the same ballot number and slot number. Also note that $b < ballot\_num$, because acceptors only report pvalues they accepted before adopting $ballot\_num$. Like the leader of $ballot\_num$, the leader of $b$ must have picked $c$ carefully to ensure that Invariant L2 holds, and thus if a pvalue is chosen before or at $b$, the command it contains must be $c$. Since all acceptors in $\cal A$ have adopted $ballot\_num$, no pvalues between $b$ and $ballot\_num$ can be chosen, by Invariants A1 and A2. Thus, by using $c$ as a command, $\lambda$ enforces Invariant L2.

This inductive argument is the crux for the correctness of the Synod protocol. It demonstrates that Invariant L2 holds, which in turn implies Invariant A5, which in turn implies Invariant R1 that ensures that all replicas apply the same operations in the same order.

Back to the code, after the leader receives $\langle \textbf{adopted}, ballot\_num, \textit{pvals} \rangle$, it determines for each slot the command corresponding to the maximum ballot number in $\textit{pvals}$ by invoking the function $\textit{pmax}$. Formally, the function $\textit{pmax}(\textit{pvals})$ is defined as follows:

$ \textit{pmax}(\textit{pvals}) \equiv \{ \langle s, c \rangle ~|~ \exists b: \langle b, s, c \rangle \in \textit{pvals} ~\wedge \\ ~~~~~~~~~~\forall b', c': \langle b', s, c' \rangle \in \textit{pvals} \Rightarrow b' \le b ~\} $

The update operator $\lhd$ applies to two sets of proposals. $x \lhd y$ returns the elements of $y$ as well as the elements of $x$ that are not in $y$. Formally:

$ x \lhd y \equiv \{ \langle s, c \rangle ~|~ \langle s, c \rangle \in y ~\vee \\ ~~~~~~~~~~~~~~~~~~~~~~~~~~ (\langle s, c \rangle \in x ~ \wedge \not\exists c': \langle s, c' \rangle \in y) \} $

Thus the line $\textit{proposals} := \textit{proposals} \lhd \textit{pmax}(\textit{pvals});$ updates the set of proposals, replacing for each slot number the command corresponding to the maximum pvalue in $\textit{pvals}$, if any. Now the leader can start commanders for each slot while satisfying Invariant L2.

If a new proposal arrives while the leader is active, the leader checks to see if it already has a proposal for the same slot (and has thus spawned a commander for that slot) in its set \textit{proposals}. If not, the new proposal will satisfy Invariant L2, and thus the leader adds the proposal to $\textit{proposals}$ and spawns a commander.

If either a scout or a commander detects that an acceptor has adopted a ballot number $b$, with $b > ballot\_num$, then it sends the leader a $\texttt{preempted}$ message. The leader becomes passive and spawns a new scout with a ballot number that is higher than $b$.

Below is an example of a leader $\lambda$ spawning a scout to become active, and a client $\kappa$ sending a request to two replicas $\rho_1$ and $\rho_2$, which in turn send proposals to $\lambda$.