Fourier Optimistic Finality
The recent concise proof of the nearly thirty-year-old Sensitivity Conjecture by Hao Huang inspired me to revisit an old hypothesis of mine: the Fourier analysis of boolean functions can improve optimistic finality in voting systems with adversaries (e.g. PoS protocols). Let’s first take a historical detour before returning to the Sensitivity Conjecture and its effect on optimistic voting. We’ll first very briefly cover how the Fourier analysis of boolean functions is related to voting systems via two theorems: Arrow’s and Majority is Stablest. Then, we’ll cover how Huang’s proof of the Sensitivity Conjecture enhances guarantees for voting systems against adversaries. Finally, we’ll use this to argue that it might be possible for voting functions in PoS systems to be designed such that we can reach optimistic finality while receiving $\Theta(\sqrt{N})$ messages from validators in a committee (normally, BFT assumptions force us to receive $\Omega(N)$ messages).

A boolean function [0] is a function $f : \{0, 1\}^n \rightarrow \{0, 1\}$. We can view it as a voting mechanism in that $n$ parties each construct a vote $v_i \in \{0,1\}$, jointly create a bitstring $v \in \{0,1\}^n$, and decide on a winner by evaluating $f(v)$. How do we analyze $f$ as a voting mechanism? One way is to try to find worst case behavior for $f$, such as proving that there is a dictator: $f(v)$ changes only when voter $i$’s vote $v_i$ changes. Another method is to look at the average behavior of $f$ over all possible input votes and to see if the average behavior is ‘usually’ the desired behavior. In this context, ‘desired’ refers to the restriction of some type of fairness metric, such as the average influence of any single voter on the election outcome. The final method is to condition onto a subset of strings — say the strings such that the number of $1$s is never more than $0.6n$ — and then compute averages and tail bounds. But in order to do this in a meaningful manner, we will need to figure out how to use traditional, continuous probability tools. This involves porting over some real analysis and taking advantage of the fact that the Central Limit Theorem/Berry-Esseen theorem say that as $n\rightarrow\infty$, $f(v) \approx f(X)$, where $X$ is a Gaussian random field [1]. Ryan O’Donnell’s excellent book, Analysis of Boolean Functions, covers how to formalize this and maps all of your favorite tools from probability (e.g. log Sobolev inequalities) over to the discrete realm. The main tool that we will talk about here is the boolean Fourier Transform. While I won’t go into the details, please do trust that the standard $L^2$ Fourier theory carries over to the discrete case.
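As a concrete toy example (a minimal Python sketch; the function names are mine), both majority vote and a dictatorship are boolean functions in this sense:

```python
# A boolean function viewed as a voting mechanism: n parties each cast a
# bit, and f decides the outcome. (Illustrative sketch; n is assumed odd
# for majority so there are no ties.)

def majority(votes: list) -> int:
    """Return 1 if more than half of the votes are 1, else 0."""
    return 1 if sum(votes) > len(votes) / 2 else 0

def dictator(votes: list, i: int = 0) -> int:
    """A 'dictator' function: the outcome is just voter i's bit."""
    return votes[i]

print(majority([1, 0, 1, 1, 0]))  # 1: three of five voters chose 1
print(dictator([1, 0, 0, 0, 0]))  # 1: only voter 0's bit matters
```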

Let $[n] = \{1,2,\ldots,n\}$. By viewing $\{0,1\}^n$ as the abelian group $G_n = (\mathbb{Z} / 2 \mathbb{Z})^n$ with $2^n$ characters indexed by sets $S \subset [n]$, $\chi_S : G_n \rightarrow \mathbb{C},\ \chi_S(v) = (-1)^{\sum_{i\in S} v_i}$, we can write the Fourier Transform of $f$ as:

$f(v) = \sum_{S\subset [n]} \hat{f}(S) \chi_S(v)$

Each Fourier coefficient $\hat{f}(S) \in \mathbb{R}$ can be thought of as the sensitivity of $f$ to the voters in the set $S$. If our voting function $f$ admits a dictatorship for voter $i$, then $f(v) = \chi_{\{i\}}(v)$ and if our voting function is uniformly controlled by a cartel $S \subset [n]$, then $f(v) = \chi_{S}(v)$. A very simple measure of how ‘big’ cartels of influence can be is known as the degree of a boolean function:

$\mathsf{deg}(f) = \max\{ |S| : S \subset [n],\,\hat{f}(S) \neq 0\}$
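Both the coefficients and the degree can be computed by brute force for small $n$. A sketch (names mine; outputs are encoded as $\pm 1$, the convention under which a dictator is literally the character $\chi_{\{i\}}$):

```python
from itertools import product

def chi(S, v):
    """Character chi_S(v) = (-1)^(sum of v_i for i in S)."""
    return (-1) ** sum(v[i] for i in S)

def fourier_coefficient(f, n, S):
    """hat{f}(S): the average of f(v) * chi_S(v) over all v in {0,1}^n
    (orthonormality of the characters makes this the coefficient)."""
    return sum(f(v) * chi(S, v) for v in product((0, 1), repeat=n)) / 2 ** n

def degree(f, n):
    """deg(f): the largest |S| with a nonzero Fourier coefficient."""
    subsets = [[i for i in range(n) if mask >> i & 1] for mask in range(1 << n)]
    return max((len(S) for S in subsets
                if abs(fourier_coefficient(f, n, S)) > 1e-12), default=0)

dictator = lambda v: chi([0], v)           # f = chi_{{0}}: degree 1
maj3 = lambda v: 1 if sum(v) <= 1 else -1  # 3-bit majority, +/-1 encoded

print(degree(dictator, 3))                # 1
print(degree(maj3, 3))                    # 3
print(fourier_coefficient(maj3, 3, [0]))  # 0.5
```

For 3-bit majority the expansion works out to $\frac{1}{2}(\chi_{\{1\}} + \chi_{\{2\}} + \chi_{\{3\}}) - \frac{1}{2}\chi_{\{1,2,3\}}$, so every voter has equal weight but the full-set coefficient is nonzero, giving degree 3.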

These tools provide a route to proving that a voting mechanism is bad: we can use the boolean Fourier Transform to try to identify when a dictatorship exists. This is exactly what Gil Kalai did in 2003 when he constructed a Fourier Theoretic proof of Arrow’s impossibility theorem, which can be phrased directly in terms of boolean functions. We need two more notions before stating the theorem:
1. A voting mechanism $f$ is unanimous if, whenever everyone votes for the same candidate, that candidate wins: $f(1,\ldots,1) = 1, f(-1,\ldots, -1) = -1$
1. If we have $n$ candidates in an election, form $\binom{n}{2}$ functions $f_{ij}$ for $i < j$, where $f_{ij}(v) = 1$ if $i$ is preferred to $j$ and $-1$ otherwise. We say that this ranked election has a Condorcet winner if there is a candidate $i$ who wins every pairwise election it appears in (i.e. $f_{ij} = 1$ for all $j > i$ and $f_{ji} = -1$ for all $j < i$).
A three party election with candidates A, B, and C does not have a Condorcet winner if, say, A beats B, B beats C, and C beats A. Arrow’s theorem then states that the only unanimous mechanism that guarantees a Condorcet winner is dictatorship. Kalai proves this for the case of $n=3$ by performing a Fourier analysis of the three functions $f_{AB},f_{BC}, f_{AC}$ and showing that they can only guarantee a Condorcet winner if all three are dictatorships. While this is a negative result, there is a positive probabilistic result associated to it (just as blockchain systems provide probabilistic positive alternatives to the CAP theorem) called Guilbaud’s Formula: if $f$ is majority voting, then the probability of a Condorcet winner tends, as $n\rightarrow\infty$, to $\frac{3}{2\pi} \arccos \left(-\frac{1}{3}\right) \approx 91.2\%$. The appearance of $-\frac{1}{3}$ is very closely related to the 33% threshold of the Byzantine Generals Problem.
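Guilbaud’s number is easy to reproduce empirically. A Monte Carlo sketch under the impartial-culture assumption (every voter draws a uniformly random ranking of the three candidates; all names are mine):

```python
import random
from itertools import permutations

RANKINGS = list(permutations(range(3)))  # the 6 rankings of candidates 0, 1, 2

def condorcet_winner_exists(n_voters, rng):
    """One election under the 'impartial culture' model: each voter draws a
    uniformly random ranking; return True iff some candidate beats both
    rivals in the pairwise majority votes."""
    prefers = [[0] * 3 for _ in range(3)]  # prefers[i][j]: voters ranking i over j
    for _ in range(n_voters):
        r = rng.choice(RANKINGS)
        pos = {c: k for k, c in enumerate(r)}
        for i in range(3):
            for j in range(3):
                if i != j and pos[i] < pos[j]:
                    prefers[i][j] += 1
    return any(all(prefers[i][j] > n_voters / 2 for j in range(3) if j != i)
               for i in range(3))

rng = random.Random(0)
trials = 5_000
p = sum(condorcet_winner_exists(75, rng) for _ in range(trials)) / trials
print(p)  # should land near Guilbaud's limiting value of ~0.912
```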

At this point, you might be wondering how $\pi$ and $\arccos$ show up in a discrete probability problem. This comes from trying to study the stability of a voting mechanism, which represents how sensitive the evaluation of a voting function is to misread / Byzantine votes. We define the $\rho$-stability of a boolean function to be $\mathsf{Stab}_{\rho}(f) = \mathsf{E}_{x,\, y \sim N_{\rho}(x)}[f(x)f(y)]$, where $x$ is drawn uniformly and $y$ is drawn from $N_{\rho}(x)$, the distribution over boolean strings that are $\rho$-correlated to $x$ [2]. The Majority is Stablest theorem basically says that the majority vote function is the most stable function [3]. If $\mathsf{Maj}_n$ is the majority vote function on strings of length $n$, then

$\lim_{n\rightarrow\infty} \mathsf{Stab}_{\rho}(\mathsf{Maj}_n) = 1 - \frac{2}{\pi}\arccos \rho$

and the theorem says that all other functions have limiting stability less than this bound. From this formula, we cursorily see that the choice of $\rho = -\frac{1}{3}$ in Guilbaud’s formula corresponds to completely anti-correlated Byzantine votes taking up, on average, $\frac{1}{3}$ of votes. This uncanny similarity to BFT systems is more than skin-deep and we will try to use the Sensitivity Conjecture to dive further into this.
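This limit is straightforward to check numerically. A Monte Carlo sketch (names mine; $\rho$-correlated pairs are sampled directly from the definition):

```python
import math
import random

def maj(x):
    """Majority on a +/-1 string of odd length."""
    return 1 if sum(x) > 0 else -1

def estimate_stability(f, n, rho, trials, rng):
    """Monte Carlo estimate of Stab_rho(f) = E[f(x) f(y)]: x is uniform on
    {-1,+1}^n and each y_i independently equals x_i with probability
    (1 + rho)/2 and -x_i otherwise."""
    total = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        y = [xi if rng.random() < (1 + rho) / 2 else -xi for xi in x]
        total += f(x) * f(y)
    return total / trials

rng = random.Random(1)
rho = 0.8
est = estimate_stability(maj, 201, rho, 8_000, rng)
limit = 1 - (2 / math.pi) * math.acos(rho)
print(est, limit)  # the estimate approaches the limiting stability as n grows
```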

The Sensitivity Conjecture (now theorem) is a thirty-year-old problem that was first posed by Nisan and Szegedy in the context of understanding limits to approximation algorithms. Intuitively, the sensitivity of a voting function is the number of voters who, by changing their individual votes, would cause the overall vote to change. The boolean Fourier transform illustrates that there can be many types of sensitivity, as each subset of voters can influence the final voting outcome independently and a voting mechanism can vary in response to changes from individual voters or from blocks of voters. At a high level, the sensitivity conjecture measures how much individual vote changes are tied to blocks of vote changes. First, let’s define a few things for a voting function $f : \{0,1\}^n \rightarrow \{0,1\}$ at $x \in \{0,1\}^n$:
• For $S \subset [n]$, let $x^S\in\{0,1\}^n$  be $x$ except with the votes at set $S$ flipped
• The sensitivity of $f$ at $x$ counts the number of voters where if we flip their vote, we change the entire vote: $S(f, x) = \mathsf{card}(\{i:f(x^i) \neq f(x)\})$.
• The block sensitivity of $f$ at $x$ is the maximum number of disjoint sets that we can flip to change $f$: $\mathsf{BL}(f,x) = \max\{ |\mathcal{B}| : \mathcal{B} \subset 2^{[n]}, \forall B\ne C \in \mathcal{B}\, B\cap C = \emptyset, \forall B \in \mathcal{B}\, f(x^B) \neq f(x)\}$
• Finally define the sensitivity and block sensitivity of $f$ as:
• $S(f) = \max_{x\in\{0,1\}^n} S(f,x)$
• $\mathsf{BL}(f) = \max_{x\in\{0,1\}^n} \mathsf{BL}(f, x)$
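These definitions are easy to check by brute force. A small Python sketch (names are mine; the block-sensitivity search is exhaustive, so it only works for tiny $n$):

```python
from itertools import product

def flip(x, S):
    """x^S: x with the bits at the coordinates in S flipped."""
    return tuple(1 - b if i in S else b for i, b in enumerate(x))

def subsets(n):
    for mask in range(1 << n):
        yield {i for i in range(n) if mask >> i & 1}

def sensitivity_at(f, x, n):
    """S(f, x): how many single-coordinate flips change f."""
    return sum(f(flip(x, {i})) != f(x) for i in range(n))

def block_sensitivity_at(f, x, n):
    """BL(f, x): the largest family of pairwise-disjoint blocks B with
    f(x^B) != f(x), found by exhaustive search."""
    sensitive = [frozenset(S) for S in subsets(n) if S and f(flip(x, S)) != f(x)]
    best = 0
    def extend(count, used, rest):
        nonlocal best
        best = max(best, count)
        for k, B in enumerate(rest):
            if not (B & used):
                extend(count + 1, used | B, rest[k + 1:])
    extend(0, frozenset(), sensitive)
    return best

def maj3(v):
    return 1 if sum(v) >= 2 else 0

n = 3
S_f = max(sensitivity_at(maj3, x, n) for x in product((0, 1), repeat=n))
BL_f = max(block_sensitivity_at(maj3, x, n) for x in product((0, 1), repeat=n))
print(S_f, BL_f)  # for Maj_3 the two measures coincide: 2 2
```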
It is clear that $S(f) \leq \mathsf{BL}(f)$, since the singletons of the sensitive coordinates, $\{\{i\} : f(x^i) \neq f(x)\}$, form a valid disjoint family in the maximum. But can we bound $\mathsf{BL}(f)$ in terms of $S(f)$? This would tell us how to relate the effect of cartels of voters who change to that of individuals. The Sensitivity Conjecture states that:

$\mathsf{BL}(f) \leq S(f)^2$

Why is the exponent a square? Why can't it be cubic, quartic, or arbitrarily large? The main reason is that the largest known gap between $S(f)$ and $\mathsf{BL}(f)$ is quadratic. This example, Rubinstein’s Function, will be useful when we make the comparison to PoS voting, so we’ll go through it. We are going to define $f : \{0,1\}^{4k^2} \rightarrow \{0, 1\}$ for $k \in \mathbb{N}$. Imagine that the input is a $2k \times 2k$ matrix of zeros and ones. We say that $f(x)$ is one iff there exists a row containing exactly two ones, sitting in an adjacent pair of positions $(2i, 2i+1)$, and zero otherwise. By considering inputs with one 1 per row (for sensitivity) and the all-zeros input (for block sensitivity), one can show that $S(f) = \Theta(\sqrt{n})$ and $\mathsf{BL}(f) = \Theta(n)$, a quadratic gap.
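A small sketch of this construction (one common formulation of Rubinstein’s function; helper names are mine), checking both kinds of input for $k = 3$:

```python
def rubinstein(x, k):
    """Rubinstein's function on n = 4k^2 bits viewed as a 2k x 2k matrix
    (one common formulation): f(x) = 1 iff some row contains exactly two
    1s and they occupy an adjacent pair of positions (2i, 2i+1)."""
    m = 2 * k
    for r in range(m):
        row = x[r * m:(r + 1) * m]
        if sum(row) == 2 and any(row[2 * i] and row[2 * i + 1] for i in range(k)):
            return 1
    return 0

def flip(x, S):
    """x^S: x with the bits in S flipped."""
    return [1 - b if i in S else b for i, b in enumerate(x)]

k = 3
n = 4 * k * k            # 36 bits arranged as a 6 x 6 matrix
zero = [0] * n

# Block sensitivity at the all-zeros input: each of the n/2 disjoint
# adjacent pairs flips f from 0 to 1, while no single bit flip does.
pairs = [{2 * k * r + 2 * i, 2 * k * r + 2 * i + 1}
         for r in range(2 * k) for i in range(k)]
bl_zero = sum(rubinstein(flip(zero, p), k) for p in pairs)

# Sensitivity at an input with a single 1 at the start of each row: in each
# row, only flipping that 1's neighbor completes a pair, so S(f, x) = 2k.
one_per_row = flip(zero, {2 * k * r for r in range(2 * k)})
s_here = sum(rubinstein(flip(one_per_row, {i}), k) != rubinstein(one_per_row, k)
             for i in range(n))

print(bl_zero, s_here)  # n/2 = 18 disjoint sensitive blocks vs 2k = 6 sensitive bits
```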

In their 1992 paper that defined the conjecture, Nisan and Szegedy proved that $\mathsf{BL}(f) \leq 2 \mathsf{deg}(f)^2$, which gives the result whenever $\mathsf{deg}(f)$ is polynomially bounded by $S(f)$. Gotsman and Linial showed that $\mathsf{deg}(f) \leq S(f)^C$ iff the following is true:

For any subset $S \subset \{0,1\}^n$ with $|S| > 2^{n-1}$, there must exist a point $s \in S$ such that $s$ has at least $n^{1/C}$ neighbors in $S$ (i.e. points of $S$ that differ from $s$ in exactly one coordinate)

In July 2019, Hao Huang posted a six page preprint to arXiv which solved this problem using remarkably simple techniques. The main ingredient of the proof was the construction of a magic matrix $A_n \in \{0, \pm 1\}^{2^n\times 2^n}$ satisfying $A_n^2 = nI$, so that its eigenvalues are $\pm\sqrt{n}$, each eigenspace of dimension $2^{n-1}$. This matrix is built by a recursive doubling construction and its entrywise absolute value is exactly the adjacency matrix of $\{0,1\}^n$ (thought of as the Hamming Graph). These eigenvalues, combined with the Cauchy interlacing theorem and a Schur-test-style bound relating the spectral radius to the maximum degree, prove the above statement for $C=2$. Terence Tao and Roman Karasev related this magic matrix to something more pedestrian: convolutions of mixed degree elements of a Clifford Algebra, and provided a significant amount of insight into the proof [4]. Knuth and others provided more elementary (and shorter!) proofs that didn’t rely on the interlacing theorem.
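A pure-Python sketch of the recursive construction (names mine; the recursion $A_1 = \begin{pmatrix}0&1\\1&0\end{pmatrix}$, $A_n = \begin{pmatrix}A_{n-1}&I\\I&-A_{n-1}\end{pmatrix}$ is the standard presentation of Huang’s matrix), verifying $A_n^2 = nI$ and $\mathrm{tr}(A_n) = 0$ for $n = 4$:

```python
def huang_matrix(n):
    """Huang's signed adjacency matrix of the n-cube, built recursively:
    A_1 = [[0,1],[1,0]] and A_n = [[A_{n-1}, I], [I, -A_{n-1}]].
    Entrywise, |A_n| is exactly the adjacency matrix of the Hamming graph."""
    A = [[0, 1], [1, 0]]
    for _ in range(n - 1):
        m = len(A)
        top = [A[i] + [int(i == j) for j in range(m)] for i in range(m)]
        bot = [[int(i == j) for j in range(m)] + [-A[i][j] for j in range(m)]
               for i in range(m)]
        A = top + bot
    return A

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

n = 4
A = huang_matrix(n)
A2 = matmul(A, A)
size = 2 ** n
# A^2 = n*I forces every eigenvalue to be +/- sqrt(n); trace(A) = 0 then
# forces the two eigenspaces to have equal dimension 2^(n-1).
assert A2 == [[n if i == j else 0 for j in range(size)] for i in range(size)]
assert sum(A[i][i] for i in range(size)) == 0
```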

Given this result, what does it tell us about Proof of Stake systems?

In a variety of PoS systems, there are a number of rounds of voting. These rounds often involve a public randomness beacon selecting subsets of participants to validate (e.g. Algorand, Cosmos, DFINITY,  ETH 2.0, Tezos) or utilize private randomness to repeatedly subsample users (e.g. Avalanche). Most of these protocols choose voting functions that are quite different from $\mathsf{Maj}_n$:
• Stake-weighted majority (e.g. weighted linear threshold functions)
• Recursive voting: there is a tree with $3^n$ leaves that represent the participants. Each internal node computes $\mathsf{Maj}_3$ and we vote by injecting the votes at the leaves and propagating them up through the tree.
• Independent voting per shard, collated via something like NightShade
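The recursive $\mathsf{Maj}_3$ scheme in the list above is simple to sketch (a toy evaluation only, ignoring stake weighting and committee sampling):

```python
def maj3(a, b, c):
    """Majority of three bits."""
    return 1 if a + b + c >= 2 else 0

def recursive_majority(votes):
    """Evaluate a Maj_3 tree over a list of 3^h votes: split the leaves
    into thirds, recurse, and take the majority of the three subtrees."""
    if len(votes) == 1:
        return votes[0]
    third = len(votes) // 3
    return maj3(recursive_majority(votes[:third]),
                recursive_majority(votes[third:2 * third]),
                recursive_majority(votes[2 * third:]))

print(recursive_majority([1, 1, 0,  0, 0, 1,  1, 0, 1]))  # 1
```

Note that recursive majority can disagree with flat majority on the same votes: the subtree boundaries matter, which is one reason its stability differs from $\mathsf{Maj}_n$.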