Fourier Optimistic Finality
The recent concise proof of the nearly thirty year old Sensitivity Conjecture by Hao Huang inspired me to revisit an old hypothesis of mine: the Fourier analysis of boolean functions can improve optimistic finality in voting systems with adversaries (e.g. PoS protocols). Let’s first take a historical detour before returning to the Sensitivity Conjecture and it’s effect on optimistic voting. We’ll first very briefly cover how the Fourier analysis of boolean functions is related to voting systems via two theorems: Arrow’s and Majority is Stablest. Then, we’ll cover how the Huang’s proof of the Sensitivity Conjecture enhances guarantees for voting systems against adversaries. Finally, we’ll use this to argue that it might be possible for voting functions in PoS systems to be designed such that we can reach optimistic finality while receiving Θ(N)\Theta(\sqrt{N}) messages from validators in a committee (normally, BFT assumptions forces us to receive Ω(N)\Omega(N) messages)
 
A boolean function [0] is a function f:{0,1}n{0,1}f : \{0, 1\}^n \rightarrow \{0, 1\}. We can view it as a voting mechanism in that nn parties each construct a vote vi{0,1}v_i \in \{0,1\}, jointly create a bitstring v{0,1}nv \in \{0,1\}^n, and decide on a winner by evaluating f(v)f(v). How do we analyze ff as a voting mechanism? One way is to try to find worst case behavior for ff, such as proving that there is a dictator — ff only changes when viv_i changes. Another method is to look at the average behavior of ff over all possible inputs votes and to see if the average behavior is ‘usually’ the desired behavior. In this context, ‘desired’ refers to the restriction of some type of fairness metric, such as the average influence of any single voter on the election outcome. The final method is to condition onto a subset of strings — say the strings such that there the number of 11s is never more than 0.6n0.6n — and then compute averages and tail bounds. But in order to do this in a meaningful manner, we will need to figure out how to use traditional, continuous probability tools. This involves porting over some real analysis and taking advantage of the fact that the Central Limit Theorem/Berry-Esseen theorem say that as nn\rightarrow\infty, f(v)f(X)f(v) \approx f(X), where XX is a Gaussian random field [1]. Ryan O’Donnell’s excellent book, Analysis of Boolean Functions, covers how to formalize this and maps all of your favorite tools from probability (e.g. log Sobolev inequalities) over to the discrete realm. The main tool that we will talk about here is the boolean Fourier Transform. While I won’t go into the details, please do trust that the standard L2L^2 Fourier theory carries over to the discrete case.

Let [n]={1,2,,n}[n] = \{1,2,\ldots,n\}. By viewing {0,1}n\{0,1\}^n as the abelian group Gn=(Z/2Z)nG_n = (\mathbb{Z} / 2 \mathbb{Z})^n with 2n2^ncharacters indexed by sets S[n]S \subset [n],  χS:GnC,χS(v)=(1)iSvi\chi_S : G_n \rightarrow \mathbb{C}, \chi_S(v) = (-1)^{\sum_{i\in S} v_i}, we can write the Fourier Transform of ff as:

f(v)=S[n]f^(S)χS(v)f(v) = \sum_{S\subset [n]} \hat{f}(S) \chi_S(v)

Each Fourier coefficient f^(S)R\hat{f}(S) \in \mathbb{R} can be thought of as the sensitivity of ff to the voters in the set SS. If our voting function ff admits a dictatorship for voter ii, then f(v)=χ{i}(v)f(v) = \chi_{\{i\}}(v) and if our voting function is uniformly controlled by a cartel S[n]S \subset [n], then f(v)=χS(v)f(v) = \chi_{S}(v). A very simple measure of how ‘big’ cartels of influence can be is known is the degree of a boolean function:

deg(f)=max{S:S[n],f^(S)0}\mathsf{deg}(f) = \max\{ |S| : S \subset [n],\,\hat{f}(S) \neq 0\}

These tools provide a route to proving that a voting mechanism is bad: we can use the boolean Fourier Transform to try to identify when a dictatorship exists. This is exactly what Gil Kalai did in 2003 when he constructed a Fourier Theoretic proof of Arrow’s impossibility theorem, which can be phrased directly in terms of boolean functions. We need two more notions before stating the theorem: 
  1. A voting mechanism ff unanimous if every one votes for the same candidate, then that candidate wins: f(1,,1)=1,f(1,,1)=1f(1,\ldots,1) = 1, f(-1,\ldots, -1) = -1
  1. If we have nn candidates in an election and (n2)\binom{n}{2} functions fij,i<jf_{ij}, i<j where fij(v)=1f_{ij}(v) = 1 if ii is preferred to jj and 1-1 otherwise. We say that this ranked election has a Condorcet winner if  there is a candidate ii who wins all elections that it is in (e.g. if i<jfij=1,j<i,fji=1\forall i < j \, f_{ij} = 1, \forall j < i, f_{ji} = -1). 
A three party election with candidates A,  B, and C does not have a Condorcet winner if, say, A  beats B, B beats C, and C beats A. Arrow’s theorem then states that the only unanimous mechanism that guarantees a Condorcet winner is dictatorship. Kalai proves this for the case of n=3n=3 by performing a Fourier analysis of the three functions fAB,fBC,fACf_{AB},f_{BC}, f_{AC} and showing that they can only guarantee a Condorcet winner if all three are dictatorships. While this is a negative result, there is a positive probabilistic result associated to this (just as blockchain systems provide probabilistic positive alternatives to the CAP theorem) called Guilbaud’s Formula: The probability of a Condorcet winner if ff is majority voting and as nn\rightarrow\infty is 32πarccos(13)91.7%\frac{3}{2\pi} \arccos \left(-\frac{1}{3}\right) \approx 91.7\%. The appearance of 13-\frac{1}3{} is very closely related to the 33% of the Byzantine Generals Problem.

At this point, you might be wondering how π\pi and arccos\arccos show up in a discrete probability problem. This comes from trying to study the stability of a voting mechanism, which represents how sensitivity the evaluation of a voting function to misread / Byzantine votes. We define the ρ\rho-stability of a boolean function to be Stabρ(f)=EvNρ(x)[f(x)f(y)]\mathsf{Stab}_{\rho}(f) = \mathsf{E}_{v \sim N_{\rho}(x)}[f(x)f(y)] where the expectation is taken over Nρ(x)N_{\rho}(x), which is the set of boolean strings that are ρ\rho-correlated to xx [2]. The Majority is Stablest theorem basically says that the majority vote function is the most stable function [3]. If Majn\mathsf{Maj}_n is the majority vote function on strings of length nn, then

 limnStabρ(Majn)=12πarccosρ\lim_{n\rightarrow\infty} \mathsf{Stab}_{\rho}(\mathsf{Maj}_n) = 1 - \frac{2}{\pi}\arccos \rho

and the theorem says that all other functions have limiting stability less than this bound. From this formula, we cursorily see that the choice of ρ=13\rho = -\frac{1}{3} in Guilbaud’s formula corresponds to completely anti-correlated Byzantine votes taking up, on average, 13\frac{1}{3} of votes. This uncanny similarity to BFT systems is more than skin-deep and we will try to use the Sensitivity Conjecture to dive further into this.
 
The Sensitivity Conjecture (now theorem) is a thirty year old problem that was first posed by Nisan and Szegedy in the context of understanding limits to approximation algorithms. Intuitively, the sensitivity of a voting function is the number of voters, who if they changed their individual votes, would cause the overall vote to change. The boolean Fourier transform illustrates that there can be many types of sensitivity, as each subset of voters can influence the final voting outcome independently and a voting mechanism can vary in response to changes from individual voters or from blocks of voters. At a high level, the sensitivity conjecture measures how much individual vote changes are tied to blocks of vote changes. First, let’s define a few things for a voting function f:{0,1}n{0,1}f : \{0,1\}^n \rightarrow \{0,1\} at x{0,1}nx \in \{0,1\}^n:
  • For S[n]S \subset [n], let xS{0,1}nx^S\in\{0,1\}^n  be xx except with the votes at set SS flipped
  • The sensitivity of ff at xx counts the number of voters where if we flip their vote, we change the entire vote: S(f,x)=card({i:f(xi)f(x)})S(f, x) = \mathsf{card}(\{i:f(x^i) \neq f(x)\}).
  • The block sensitivity of ff at xx is the maximum number of disjoint sets that we can flip to change ff: BL(f,x)=max{B:B2[n],BCBBC=,BBf(xB)f(x)}\mathsf{BL}(f,x) = \max\{ |\mathcal{B}| : \mathcal{B} \subset 2^{[n]}, \forall B\ne C \in \mathcal{B}\, B\cap C = \emptyset, \forall B \in \mathcal{B}\, f(x^B) \neq f(x)\} 
  • Finally define the sensitivity and block sensitivity of ff as:
  • S(f)=maxx{0,1}nS(f,x)S(f) = \max_{x\in\{0,1\}^n} S(f,x)
  • BL(f)=maxx{0,1}nBL(f,x)\mathsf{BL}(f) = \max_{x\in\{0,1\}^n} \mathsf{BL}(f, x)
It is clear that S(f)BL(f)S(f) \leq \mathsf{BL}(f) as a valid partition in the maximum is the singleton partition, Bsingleton={{i}:i[n]}\mathcal{B}_{singleton} = \{\{i\} : i \in [n]\}. But can we bound BL(f)\mathsf{BL}(f) in terms of S(f)S(f)? This would tell us how to relate the effect of cartels of voters who change to that of individuals. The Sensitivity Conjecture states that:

BL(f)S(f)2\mathsf{BL}(f) \leq S(f)^2

Why is the exponent a square? Why can't it be cubic, quartic, or arbitrarily large? The main reason is that the largest known example of a difference between S(f)S(f) and BL(f)\mathsf{BL}(f) is quadratic. This example, Rubinstein’s Function, will be useful when we make the comparison to PoS voting, so we’ll go through it. We are going to define f:{0,1}4k2{0,1}f : \{0,1\}^{4k^2} \rightarrow \{0, 1\} for kNk \in \mathbb{N}. Imagine that the input is a 2k×2k2k \times 2k matrix of zeros and ones. We say that f(x)f(x) is one iff there exists a row with two ones and zero otherwise. One can show by considering inputs with one 1 per row and all zeros, that S(f)=Θ(n),BL(f)=Θ(n)S(f) = \Theta(\sqrt{n}), \mathsf{BL}(f) = \Theta(n), thus suggesting a quadratic split.

In their 1992 paper that defined the conjecture, Nisan and Szegedy proved that BL(f)2deg(f)2\mathsf{BL}(f) \leq 2 \mathsf{deg}(f)^2, which gives the result if deg(f)O(S(f))\mathsf{deg}(f) \in O(S(f)). Gotsman and Linial showed that deg(f)S(f)C\mathsf{deg}(f) \leq S(f)^C iff the following is true:

For any subset S{0,1}nS \subset \{0,1\}^n with S>2n1|S| > 2^{n-1} there must exist a point  sSs \in S that that ss has nCn^C neighbors in {0,1}n\{0,1\}^n (e.g. points that differ at exactly one place)

In July 2019, Hao Huang posted a six page preprint to arXiv which solved this problem using remarkably simple techniques. The main ingredient to the proof was the construction of a magic matrix in Z22n×2n\mathbb{Z}_2^{2^n\times 2^n} that had invariant subspaces of dimension 2n1+1,2n112^{n-1}+1, 2^{n-1}-1 with eigenvalues ±n\pm\sqrt{n}. This matrix was computed by repeated squaring of a signed matrix that was dominated entrywise by the adjacency matrix of {0,1}n\{0,1\}^n (thought of as the Hamming Graph). These eigenvalues, combined with the Cauchy interlacing theorem/Schur’s test, prove the above statement for C=2C=2. Terence Tao and Roman Karasev related this magic matrix to something more pedestrian: convolutions of mixed degree elements of a Clifford Algebra and provided a significant amount of insight into the proof [4]. Knuth and others provided more elementary (and shorter!) proofs that didn’t rely on the interlacing theorem.

Given that we have this conjecture, what does it tell us about Proof of Stake systems?
 
In a variety of PoS systems, there are a number of rounds of voting. These rounds often involve a public randomness beacon selecting subsets of participants to validate (e.g. Algorand, Cosmos, DFINITY,  ETH 2.0, Tezos) or utilize private randomness to repeatedly subsample users (e.g. Avalanche). Most of these protocols choose voting functions that are quite different from Majn\mathsf{Maj}_n:
  • Stake-weighted majority (e.g. weighted linear threshold functions)
  • Recursive voting 
  • There is a tree with 3n3^n leaves that represent the participants. Each node corresponds to Maj3\mathsf{Maj}_3 and we vote by injecting the votes at the leaves and propagating them up through the tree.
  • Independent voting per shard, collated via something like NightShade