ECON 300 Statistics & Probability

Introduction

In discussing small-world networks and various economic and financial matters, we need some statistical distributions. The main observation is that real-world variables are distributed in different ways. I want to build up to a discussion of “fat tails”, and for that you must first understand what a “tail” is. We start with the most straightforward cases, the binomial and uniform distributions, and build from there.

Binomial Distribution

The binomial distribution is used to calculate the probability that a certain number of discrete events occur. In the network literature, this is often used to describe the distribution of degrees at nodes in a random network. Some nodes have more connections than others, but recall that random networks do not exhibit significant clustering or hubs.

Shown below, the binomial distribution, written generally as $f(x)$, is known as the probability mass function (PMF), the discrete counterpart of a probability density function (PDF). It gives the probability of each possible outcome.

$$f(x)=P(X=x)=\binom{n}{x}p^x q^{n-x}=\frac{n!}{x!(n-x)!}p^x q^{n-x}$$

Here $p$ is the probability of success, and $q=1-p$ is the probability of failure. The binomial distribution function is defined as the probability of $x$ events occurring in $n$ trials. If $n=1$, this is called the Bernoulli distribution.

As an example, let $p=0.3$ and therefore $q=0.7$. The probabilities of exactly 0 through 5 successes in 5 tries are

$$P(X=0)=\frac{5!}{0!5!}(0.3)^0(0.7)^5=0.17$$
$$P(X=1)=\frac{5!}{1!4!}(0.3)^1(0.7)^4=0.36$$
$$P(X=2)=\frac{5!}{2!3!}(0.3)^2(0.7)^3=0.31$$
$$P(X=3)=\frac{5!}{3!2!}(0.3)^3(0.7)^2=0.13$$
$$P(X=4)=\frac{5!}{4!1!}(0.3)^4(0.7)^1=0.03$$
$$P(X=5)=\frac{5!}{5!0!}(0.3)^5(0.7)^0=0.002$$
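
The six probabilities above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the function name `binomial_pmf` is my own choice for illustration and does not come from any particular package.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Values from the worked example: 5 tries, success probability 0.3.
n, p = 5, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(f"P(X={x}) = {prob:.4f}")

# The probabilities of all possible outcomes must add up to 1.
print(f"total = {sum(pmf):.4f}")
```

Printing four decimal places shows how much rounding went into the two-decimal values listed above.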

The cumulative distribution function, or CDF, is the accumulated sum of the probabilities of all outcomes up to and including a given value. For example, the probability of getting 0, 1, or 2 successes in the experiment above is 0.17 + 0.36 + 0.31 = 0.84, or 84%.
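
As a rough illustration, the CDF can be computed by summing the individual binomial probabilities up to the value of interest. The helper name `binomial_cdf` below is an assumption made for this sketch.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): sum of the binomial probabilities for 0, 1, ..., k successes."""
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q ** (n - x) for x in range(k + 1))


# Probability of 0, 1, or 2 successes in 5 tries with p = 0.3.
print(round(binomial_cdf(2, 5, 0.3), 2))  # 0.84
```

At the largest possible outcome the CDF reaches 1, so `binomial_cdf(5, 5, 0.3)` returns 1 up to floating-point error.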

The mean $\mu$ and standard deviation $\sigma$ of this distribution are

$$\mu=np$$
$$\sigma=\sqrt{npq}$$

For our example this would be 

$$\mu=5\times 0.3=1.5$$
$$\sigma=\sqrt{5\times 0.3\times 0.7}=\sqrt{1.05}\approx 1.02$$

If we looked at this in terms of expected value, where you multiply each value by its probability and add up the results, you would see the same thing for $\mu$.
$$E[X]=\sum_{i=0}^{n} p(x_i)x_i \approx 0.17\times 0+0.36\times 1+0.31\times 2+0.13\times 3+0.03\times 4+0.002\times 5=1.5$$
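
The mean and standard deviation of the example can be checked numerically, both from the closed-form expressions $np$ and $\sqrt{npq}$ and directly from the probabilities. The sketch below assumes nothing beyond the Python standard library, and the variable names are my own.

```python
from math import comb, sqrt

n, p = 5, 0.3
q = 1 - p

# Closed-form mean and standard deviation of the binomial distribution.
mu = n * p
sigma = sqrt(n * p * q)

# The same quantities computed from the probabilities themselves.
pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]
expected = sum(x * prob for x, prob in enumerate(pmf))
variance = sum((x - expected) ** 2 * prob for x, prob in enumerate(pmf))

print(f"mu = {mu:.2f}, E[X] = {expected:.2f}")
print(f"sigma = {sigma:.2f}, sqrt(Var[X]) = {sqrt(variance):.2f}")
```

Note that $npq=1.05$ is the variance; the standard deviation is its square root, roughly 1.02.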

Uniform Distribution

Another common distribution that we come across in this literature is the uniform distribution. When we examine issues like changing the probability that a “link is rewired”, we draw a random number, usually between $0$ and $1$. In the PDF and CDF for the uniform distribution, we are simply calculating a ratio. I provide the PDF and CDF here since this is a “simple” distribution, and it will help you make sense of the others.

$$f(x)=\begin{cases} \frac{1}{b-a}& \text{if } a\leq x \leq b\\ 0& \text{otherwise} \end{cases}$$
$$F(x)=P(X\leq x)=\begin{cases} 0& \text{if } x<a \\ \frac{x-a}{b-a}& \text{if } a\leq x < b\\ 1& \text{if } x \geq b \end{cases}$$
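
To see the "ratio" idea in action, here is a minimal sketch of the uniform PDF and CDF, followed by a small simulation of the rewiring decision. The names `uniform_pdf`, `uniform_cdf`, and `p_rewire`, as well as the value 0.3, are hypothetical and chosen only for illustration. The CDF levels off at 1 once $x$ reaches $b$, because by then all of the probability has been accumulated.

```python
import random


def uniform_pdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    """P(X <= x) for the uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Hypothetical rewiring probability, chosen only for illustration.
p_rewire = 0.3

# A link is "rewired" whenever a uniform draw on [0, 1) falls below p_rewire.
rng = random.Random(42)
draws = [rng.random() for _ in range(100_000)]
share_rewired = sum(u < p_rewire for u in draws) / len(draws)

print(uniform_cdf(p_rewire))    # theoretical share of draws below p_rewire
print(round(share_rewired, 3))  # simulated share, close to the value above
```

The simulated share approaches the CDF value as the number of draws grows, which is why a uniform draw is a convenient way to make a yes/no decision with a given probability.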

Normal Distribution

The PDF for the normal distribution with mean ($\mu$) and standard deviation ($\sigma$) is

$$f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}$$ where $-\infty < x < \infty$

Standardizing this distribution such that $Z=\frac{X-\mu}{\sigma}$ gives

$$f(z)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}$$

To see what standardizing does, imagine you have some data with a mean of 72 and a standard deviation of 6. A value of 76 then gives $Z=\frac{76-72}{6}\approx 0.67$. This is the same distance from the mean as the value 0.67 in a normal distribution with a mean of zero and a standard deviation of 1 (i.e., $N(0,1)$): in both cases the value lies 0.67 standard deviations above the mean. If you compare the exponents in $f(x)$ and $f(z)$, you will see that we are simply replacing nominal values with standardized ones.
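
The standardization step can also be checked numerically. The sketch below uses my own function names (`normal_pdf`, `z_score`) and only the standard library. It computes the standardized value for the example and confirms that the two exponents agree, so that the two densities differ only by the factor $1/\sigma$.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal distribution with mean mu and std. dev. sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies away from the mean."""
    return (x - mu) / sigma


# Example from the text: mean 72, standard deviation 6, observed value 76.
mu, sigma, x = 72.0, 6.0, 76.0
z = z_score(x, mu, sigma)
print(round(z, 2))  # 0.67

# Density under N(72, 6) versus the standard normal density divided by sigma.
print(normal_pdf(x, mu, sigma))
print(normal_pdf(z) / sigma)
```

The last two printed numbers agree up to floating-point error, since replacing $(x-\mu)/\sigma$ with $z$ leaves the exponent unchanged.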