Simulations of the fast probabilistic consensus protocol (FPC)

Wind extinguishes a candle and energizes fire. Likewise with randomness, uncertainty, chaos: you want to use them, not hide from them. You want to be the fire and wish for the wind.

Nassim Nicholas Taleb

TL;DR

The fast probabilistic consensus protocol (FPC), introduced recently by Serguei Popov (paper), terminates most of the time in a minimal number of steps and is much faster than the theoretical upper bounds given by Popov. It is claimed that an additional layer of randomness allows the protocol to keep working in critical situations where other protocols fail. Our simulations show that the FPC is indeed very robust under various attack scenarios.

Introduction and set-up

We perform a first simulation-based study of the fast probabilistic consensus protocol proposed by Serguei Popov (paper). Since the protocol is very safe and fast in non-critical situations, this study focuses on attack strategies that try to cause disagreement among the honest nodes, and it considers mostly critical situations. The post is not intended to be a thorough scientific study of the FPC; it just collects various observations in order to illustrate how the protocol behaves.

Question

How can a distributed network find consensus on the value of a bit?

Previous answers

The consensus protocols known until 2018 are essentially the classical protocols, where everybody communicates with everybody, and the Nakamoto consensus. The latter became famous since it lies at the heart of blockchain technology. However, both protocols have serious restrictions. In order to appreciate the importance of the new approach, we name some of the inconvenient properties of these models. The classical protocol has quadratic message complexity, requires precise membership knowledge, and is permissioned. The Nakamoto protocol is robust and does not require precise membership knowledge. However, it has high latency and low throughput, and it is not sustainable. Moreover, neither protocol scales. Recently, other approaches were proposed in the crypto community, e.g., Snowflake or the cellular automaton approach by NKN. However, we want to underline that outside the crypto community this topic was addressed much earlier, and we refer to Distributed Binary Consensus in Networks with Disturbances for a summary and further references. The literature on majority voting without Byzantine disturbances is vast; we content ourselves with giving only one of the most recent references, Communication cost of consensus for nodes with limited memory, and the most common reference, Wikipedia.

All of these protocols (except the Nakamoto protocol) rely essentially on the majority rule and follow what is probably the most natural intuition: a node queries other nodes about their opinions and adopts the opinion of the majority. Several variations of this model have been proposed in order to guarantee convergence of the opinions, in the sense that eventually all nodes share the same opinion. An additional difficulty is that the consensus protocol should be robust against network faults and malicious attackers.

What is new?

The FPC brings in a decentralized random variable that allows theoretical results on the convergence rate and safety of the protocol.
Under the assumptions of the research paper the protocol is proven to be safe! However, the theoretical results seem far from optimal, and the formulas may even seem frightening to people with a sound background in mathematics.

The model

We assume that n nodes want to agree on the value of a bit and that every node has access to every other node in the network. Some of the nodes are Byzantine, i.e., they are controlled by an adversary who intends to either delay the consensus, break it (i.e., make at least two honest nodes come to different conclusions), or change the initial opinions of the honest nodes. The proportion of adversarial nodes is denoted by q.

The aim of the consensus protocol is the following:

  • if, initially, no significant majority of nodes prefer 1, then the final consensus should be 0 with high probability (whp);
  • if, initially, a supermajority of nodes prefer 1, then the final consensus should be 1 whp.

Here, significant majority and supermajority are still loosely defined notions. However, you can think of a significant majority as something statistically different from the 50:50 situation; for example, the proportion of 1-opinions is greater than \alpha for some fixed \alpha>0.5. A supermajority is something already close to consensus, e.g., more than 90% of all nodes holding the same opinion.

The protocol

The protocol depends on various parameters:

  • 0.5 < a \leq b < 1: the threshold limits in the first round;
  • \verb?beta?\in (0,0.5): the threshold limits in the subsequent rounds;
  • \verb?m_0?: the cooling-off period;
  • \verb?l?: the determination length; the number of consecutive rounds (when the cooling-off period is over) with the same opinion after which it becomes final for one node.

Importantly, each node decides locally when its opinion is fixed. Once the opinion is fixed, the node stops sending queries but continues to answer them.
The protocol now goes as follows:

  • Each node decides on the initial value of the bit, according to some reasonable (local) rule.
  • In the first round each node j randomly queries \verb?k? nodes, and these nodes transmit their initial opinions. Each node j then calculates \eta_j, the proportion of 1-opinions among the \verb?k? responses.
  • A realization X_1 of a Unif[a,b]-distributed random variable is revealed to all the nodes. This is where the decentralized random variable kicks in.
  • Each node j uses the following decision rule: if \eta_j\geq X_1, it adopts opinion 1; otherwise it adopts opinion 0. This terminates the first round.

In the subsequent rounds, the dynamics are very similar:

  • In round m\geq 2, each honest node j randomly queries \verb?k? nodes and records the proportion \eta_j of 1-opinions it receives.
  • A realization X_m of a Unif[\verb?beta?,1-\verb?beta?]-distributed random variable is revealed to all the nodes.
  • Each node j that has not yet finalized its opinion uses the following decision rule: if \eta_j\geq X_m, it adopts opinion 1; otherwise it adopts opinion 0.
  • A node finalizes its opinion if it has the same opinion during \verb?l? consecutive rounds after the cooling-off period.
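
To make the dynamics concrete, here is a minimal simulation sketch of the protocol with honest nodes only (no adversary). The parameter names follow the notation of this post, but the function itself and all variable names are our own, not the paper's; for simplicity a query may hit the querying node itself.

    import random

    # Minimal sketch of FPC with honest nodes only, using the parameters
    # (k, a, b, beta, m0, l, max_it) described above; not the paper's code.
    def fpc(opinions, k=20, a=0.55, b=0.65, beta=0.3, m0=2, l=3, max_it=100):
        n = len(opinions)
        streak = [0] * n              # consecutive rounds with an unchanged opinion
        finalized = [False] * n
        for m in range(1, max_it + 1):
            # common random threshold: Unif[a, b] in round 1, Unif[beta, 1 - beta] later
            x = random.uniform(a, b) if m == 1 else random.uniform(beta, 1 - beta)
            new = list(opinions)
            for j in range(n):
                if finalized[j]:
                    continue          # finalized nodes answer queries but stop querying
                votes = [opinions[random.randrange(n)] for _ in range(k)]
                eta = sum(votes) / k  # proportion of 1-opinions among the k responses
                new[j] = 1 if eta >= x else 0
                streak[j] = streak[j] + 1 if new[j] == opinions[j] else 1
                if m > m0 and streak[j] >= l:
                    finalized[j] = True
            opinions = new
            if all(finalized):
                return opinions, m    # all opinions final: return them and the round
        return opinions, max_it

For instance, fpc([1]*700 + [0]*300) runs the sketch on 1000 nodes with an initial 1-proportion of 0.7.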

How a protocol can fail

There are mainly three different ways a consensus protocol may fail:

  1. Agreement failure — protocol terminates but the final opinions of the honest nodes do not agree;
  2. Termination failure — no consensus is reached after a very long period of time;
  3. Integrity failure — the outcome of the protocol does not reflect the initial opinions of the honest nodes.

Adversarial strategies

The original paper distinguishes between two different kinds of adversaries.

  • Cautious adversary: every adversarial node must maintain the same opinion within a round, i.e., it responds with the same value to all the queries it receives in that round.
  • Berserk adversary: an adversarial node may give different responses to different queries in the same round.

The obtained theoretical bounds hold true for any possible strategy of the adversary. In order to perform simulations, however, we have to consider concrete adversarial strategies.
In this study we assume that initially there is a majority of 1’s among the honest nodes and we consider the following different strategies of the adversary:

  1. The adversary always transmits 0, trying to overturn the initial majority of 1’s. In other words, the adversary tries to cause an integrity failure.
  2. The adversary transmits the opinion held by the minority of the honest nodes in round m-1; with this strategy it tries to change the initial majority and to delay the protocol. The adversary is still cautious, since it transmits the same opinion to all queries within a round.
  3. The third strategy is berserk and omniscient. The adversary waits until all honest nodes have received the opinions of the other honest nodes. It then calculates the median of these intermediate \eta‘s and sends 1 to the nodes whose intermediate \eta is larger than the median, and 0 to the others. This strategy tries to split the honest nodes into two equally sized groups with different opinions. It is not optimal, but it is easy and fast to implement; a sketch of this response rule is given below, after the list.
  4. The fourth strategy is cautious but more sophisticated. It tries to stabilize the median around 0.5 in such a way that the variance of the \eta‘s becomes maximal.
  5. The fifth and last strategy is probably the worst-case scenario: it is the berserk version of the fourth strategy.
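
As an illustration, here is a minimal sketch of the response rule of strategy 3, assuming the omniscient adversary knows each honest node's intermediate proportion \eta of received 1-opinions; all names are hypothetical:

    from statistics import median

    # Berserk median-split (strategy 3): answer 1 to the honest nodes whose
    # intermediate eta lies above the median and 0 to the rest, pushing the
    # two halves of the network toward opposite opinions.
    def berserk_median_split(intermediate_eta):
        # intermediate_eta[j]: proportion of 1-opinions node j has received
        # from the other honest nodes in the current round
        med = median(intermediate_eta.values())
        return {j: 1 if eta > med else 0 for j, eta in intermediate_eta.items()}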

It is unlikely that an adversary would be able to adopt the last two strategies, since the optimization involved might take too long, and it seems hard to come up with a significantly better strategy for breaking the safety of the protocol. Let us note here that our main focus lies on adversarial strategies that try to achieve a disagreement. For this reason, strategy 5 is considered here as the worst-case scenario. We do not present results on adversarial strategies aiming at termination failure, since we believe that a disagreement failure is the worst kind of failure.

Sometimes we speak of the safety of the protocol, meaning that in the end all honest nodes have agreed on the same opinion.

Choices of parameters

In this study we fix the following parameters for all simulations:

  • \verb?n?=1000: the number of nodes
  • \verb?a?=0.55 and \verb?b?=0.65 (the choice of these parameters will depend strongly on the precise definition of the “majorities” above)
  • \verb?m_0?=2: the length of the cooling-off period
  • \verb?l?=3: the determination length

The following parameters will vary throughout the simulations:

  • \verb?k?: the number of queries each node sends
  • \verb?max_it?: the maximal number of iterations before the consensus protocol is stopped
  • \verb?beta?: the support of the threshold variable from round 2 onwards
  • \verb?adv_strategy?: the adversarial strategy
  • \verb?q?: the proportion of adversarial nodes

For each setting of the parameters 1000 simulations were performed.
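
In code, the experiment we have in mind looks roughly as follows; the driver reuses the hypothetical fpc() sketch from above and, like it, models honest nodes only:

    import random

    # Sketch of the experiment driver: estimate the agreement rate, i.e. the
    # fraction of runs in which all honest nodes end with the same opinion.
    def agreement_rate(p_0, n=1000, runs=1000, **fpc_params):
        agreed = 0
        for _ in range(runs):
            init = [1 if random.random() < p_0 else 0 for _ in range(n)]
            final, _ = fpc(init, **fpc_params)
            agreed += len(set(final)) == 1
        return agreed / runs

    # e.g. agreement_rate(0.7, k=50, beta=0.3) for one of the settings below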

Simulations

We consider adversarial strategy 2 and set \verb?k?=20, \verb?beta?=0.3 and \verb?q?=0.1. In the following we present histograms of the number of iterations the consensus protocol ran for. Four cases of the initial proportion \verb?p_0? of 1-opinions among the honest nodes are treated.
[Figure: histograms of the number of rounds until termination, one panel per initial proportion p_0]

Under these choices of parameters the consensus protocol is very fast.
We also see that convergence is slowest for \verb?p_0?=0.7. This is not astonishing, since in this case the total proportion of initial 1-opinions (among honest and malicious nodes together) is 0.9 \cdot 0.7 = 0.63, which lies in the interval [a,b].

But is consensus always achieved, and which decision is finally taken?

We first look at the percentage of cases in which there was safety, i.e., all honest nodes ended up with the same opinion:
[Figure: percentage of runs with agreement, by initial proportion p_0]

This may not look too convincing for values of \verb?p_0? close to the interval [a,b]. However, it is worth inspecting the cases where no consensus was obtained. In order to do this, we investigate the mean opinion among the honest nodes after the protocol has terminated. If we round this value to the first decimal we obtain the following histogram.

[Figure: histogram of the mean final opinion among the honest nodes, rounded to the first decimal]

This shows that the protocol (with the above choice of parameters) does not work well for all choices of initial opinions.

Let us check whether an adjustment of the parameter \verb?k? may improve the situation. Here the histograms of the number of rounds needed for \verb?p_0?=0.7 are given for four different choices of \verb?k?.

[Figure: histograms of the number of rounds for p_0 = 0.7 and four choices of k]

We see that an increase in the number of queries leads to a speed-up of the protocol. Even better, the safety increases as well:

Number of queries    Agreement rate
20                   0.952
35                   0.989
50                   0.996
100                  0.999

Breaking the protocol

Let us learn from failure and put more stress on the protocol by increasing the proportion of adversarial nodes to \verb?q?=0.3. We set \verb?k?=50.

[Figure: histograms of the number of rounds for q = 0.3 and k = 50, one panel per initial proportion p_0]

Under these choices of parameters the consensus protocol still terminates in all four cases of initial opinions, but convergence is much slower.

The slower convergence is not the only problem here. In fact, we have a safety issue:

Initial mean opinion p_0    Agreement rate
0.5                         0.727
0.7                         0.724
0.9                         0.738


Fixing the failure

Increasing the number of queries is one key to fixing the above situation.
We set \verb?p_0?=0.7, \verb?q?=0.3, \verb?beta?=0.3 and consider adversarial strategies 1 to 3:

[Figure: effect of the number of queries k under adversarial strategies 1 to 3]

Increasing the number of queries seems to make it possible to handle situations with 30% adversarial nodes. However, querying more than a quarter of all nodes is not a scalable approach. Another solution is to adjust the parameter \verb?l?. We will look at this in more detail at the end of the post.

Evolution of the consensus protocol

So what happens exactly during a consensus protocol? How fast are opinions fixed for given nodes?

In the following graph each line shows the evolution of the mean opinion among the honest nodes. The parameters are \verb?p_0?=0.7, \verb?q?=0.1, \verb?k?=50, \verb?beta?=0.3 and we consider adversarial strategy 2.
[Figure: evolution of the mean opinion among the honest nodes; each line is one run of the protocol]

We see that in almost all cases the mean opinion attains 0 or 1 pretty fast.
The following graph shows the percentage of honest nodes with fixed opinions; again each path represents a realization of the consensus protocol:

[Figure: percentage of honest nodes with finalized opinions per round; each line is one run of the protocol]

Gain of decentralized randomization

One essential difference between the protocol proposed by Popov and the other protocols is that the thresholds are random and that this randomness is decentralized. We study the impact of this randomness by considering adversarial strategies 4 and 5: let us vary the support of the random threshold via \verb?beta?. The other parameters are \verb?p_0?=0.7, \verb?q?=0.1, and \verb?k?=20. We first consider adversarial strategy 4.

[Figure: histograms of the number of rounds for different choices of beta under adversarial strategy 4]

Note that \verb?beta?=0.5 means that the threshold in rounds m \geq 2 is always 0.5 and therefore non-random. A larger \verb?beta? means less randomness.
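
In code, \verb?beta? is simply the width parameter of the sampling interval (a one-line sketch):

    import random

    def sample_threshold(beta):
        # X_m ~ Unif[beta, 1 - beta] for rounds m >= 2; beta = 0.5 collapses
        # the support to the single point 0.5, i.e., a deterministic threshold
        return random.uniform(beta, 1 - beta)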

This clearly illustrates that an increase in randomness slows down the protocol, but also that the non-random setting is not robust against this attack!

Choice of beta    Agreement rate
0.1               0.973
0.3               0.966
0.5               0.538


Against the berserk

We now consider adversarial strategy 5.

[Figure: histograms of the number of rounds under adversarial strategy 5]
Let us take a look at what happens during the protocol. We see how the adversary manages to keep the mean opinion around 0.5 while more and more honest nodes finalize their opinions. We consider \verb?q?=0.1, \verb?k?=20, \verb?p_0?=0.7 and \verb?beta?=0.5.

[Figures: evolution of the mean opinion among the honest nodes and of the percentage of finalized honest nodes under strategy 5]

We compare the safety of the protocol under attack strategies 4 and 5.

[Figure: agreement rates under adversarial strategies 4 and 5]

Again, an increase in the number of queries improves the safety, as the following results for adversarial strategy 5 show. We set \verb?p_0?=0.7, \verb?q?=0.1, \verb?l?=3, and \verb?beta?=0.3.

Number of queries    Agreement rate
20                   0.886
50                   0.980
100                  0.994
150                  0.998

Another possibility is to play with the determination length \verb?l? while keeping \verb?k?=50:

Determination length l    Agreement rate
3                         0.980
5                         0.994
10                        1.000

Let us finally take a look at how the adjusted protocol works ‘pathwise’: \verb?p_0?=0.7, \verb?q?=0.1, \verb?l?=10, and \verb?beta?=0.3.
[Figures: evolution of the mean opinion among the honest nodes and of the percentage of finalized honest nodes for l = 10]

Price of randomness?

Everything has its price, and so does the additional randomness. In the case \verb?p_0?=0.7 the additional randomness guaranteed safety in critical situations. However, this feature comes with a tradeoff. For instance, in the case of an initial opinion \verb?p_0?=0.8, the additional randomness slows down the protocol and makes it less safe. However, at least in this case, the advantages of the additional randomness seem to outweigh the small loss in performance and safety.

[Figure: histograms of the number of rounds for p_0 = 0.8 and different choices of beta]

Choice of beta    Agreement rate
0.1               0.995
0.3               1.000
0.5               1.000

Discussion and outlook

It seems valid that the additional layer of randomness introduced through the decentralized random generator allows the protocol to reach consensus in situations where, without this feature, it would fail. The ‘right’ amount of randomness, for example encoded in the choice of \verb?beta?, seems to depend on the particular situation.

The convergence rates obtained in these simulations are far better than the theoretical bounds obtained by Popov. Moreover, in non-critical situations consensus is almost always achieved in the minimal number of steps. However, the FPC depends on the choice of many different parameters. In the above study we only considered a few of them, and their influence on the outcome of the protocol has yet to be studied in more detail.