ECON 300 Statistics & Probability
Discussing small-world networks, and various economic and financial matters, requires some statistical distributions. The main observation is that real-world variables are distributed in different ways. I want to build up to a discussion of “fat tails”, and for that you must first understand what a “tail” is. We start with the most straightforward distributions, the “binomial” and the “uniform”, and build from there.
The binomial distribution is used to calculate the probability that a certain number of discrete events occur. In the network literature, this is often used to describe the distribution of degrees at nodes in a random network. Some nodes have more connections than others, but recall that random networks do not exhibit significant clustering or hubs.
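The binomial distribution, written generally as f(x), is shown below. These notes call it the probability density function, or PDF (for a discrete variable such as this one, the stricter term is probability mass function). It gives the probability of a particular outcome.
$$f(x) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, \ldots, n$$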
Where p is the probability of success, and q=1−p is the probability of failure. The binomial distribution function is defined as the probability of x events occurring in n trials. If n=1 this is called the Bernoulli distribution.
For an example, let’s say that p=0.3 and therefore q=0.7. The probabilities of exactly 0 through 5 successes in 5 tries are approximately f(0)=0.168, f(1)=0.360, f(2)=0.309, f(3)=0.132, f(4)=0.028, and f(5)=0.002.
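The example above can be checked with a few lines of Python. This is a minimal sketch, assuming the standard binomial formula f(x) = C(n, x) p^x q^(n−x); the function name `binomial_pmf` is chosen here for illustration and is not part of the course material.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


n, p = 5, 0.3  # values from the worked example
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probabilities):
    print(f"f({x}) = {prob:.4f}")

# The probabilities of all possible outcomes should add up to 1.
print(f"total = {sum(probabilities):.4f}")
```

Changing `n` or `p` shows how the shape of the distribution shifts, which is a useful habit before we get to tails.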
The cumulative distribution function, or CDF, is the accumulated sum of all probabilities up to some value. For example, the probability of getting 0, 1, or 2 successes in the experiment above is f(0)+f(1)+f(2) = 0.168+0.360+0.309 ≈ 0.84, or 84% across these three combined possibilities.
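The cumulative sum can be sketched in Python as follows. The helper name `binomial_cdf` is an illustrative assumption; it simply adds the binomial probabilities for 0 through k successes.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k successes in n independent trials."""
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q ** (n - x) for x in range(k + 1))


# Probability of 0, 1, or 2 successes in 5 tries with p = 0.3.
cdf_at_2 = binomial_cdf(2, 5, 0.3)
print(f"P(X <= 2) = {cdf_at_2:.4f}")
```

Evaluating the function at k = n should return 1, since every possible outcome has then been counted.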
The mean μ and standard deviation σ of this distribution are
$$\mu = np, \qquad \sigma = \sqrt{npq}$$
For our example this would be
$$\mu = 5(0.3) = 1.5, \qquad \sigma = \sqrt{5(0.3)(0.7)} = \sqrt{1.05} \approx 1.02$$
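If we looked at this in terms of expected value, where you multiply each value by its probability and add up the results, you would see the same thing for μ:
$$E[X] = \sum_{x=0}^{5} x\, f(x) = 0(0.16807) + 1(0.36015) + 2(0.3087) + 3(0.1323) + 4(0.02835) + 5(0.00243) = 1.5$$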
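The agreement between the formulas and the expected-value calculation can be verified numerically. This is a sketch under the assumption that the binomial mean and standard deviation are np and √(npq); the variable names are illustrative only.

```python
from math import comb, sqrt

n, p = 5, 0.3
q = 1.0 - p

# Closed-form mean and standard deviation of the binomial distribution.
mean_formula = n * p
sd_formula = sqrt(n * p * q)

# The same quantities computed directly from the probabilities.
pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]
expected_value = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - expected_value) ** 2 * f for x, f in enumerate(pmf))
sd_from_pmf = sqrt(variance)

print(f"mean: formula = {mean_formula:.4f}, from probabilities = {expected_value:.4f}")
print(f"sd:   formula = {sd_formula:.4f}, from probabilities = {sd_from_pmf:.4f}")
```

Both routes should give the same answers up to floating-point rounding.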
Another common distribution that we come across in this literature is the uniform distribution. When we examine questions such as changing the probability that a “link is rewired”, we draw a random number, usually between 0 and 1. In the PDF and CDF for the uniform distribution, we are simply calculating a ratio. I provide the PDF and CDF here since this is a “simple” distribution, and it will help you understand how to make sense of the others. For a variable distributed uniformly between a and b,
$$f(x) = \frac{1}{b-a}, \qquad F(x) = \frac{x-a}{b-a}, \qquad a \le x \le b$$
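A short Python sketch may make the “ratio” idea concrete. It assumes a uniform distribution on an interval [a, b], defaulting to [0, 1]. The function names, the rewiring probability of 0.1, and the number of links are all hypothetical values chosen for illustration, not taken from any particular network model.

```python
import random


def uniform_pdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    """Density of a uniform variable on [a, b]: constant inside, zero outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    """Probability that a uniform variable on [a, b] is at most x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)  # the share of the interval lying below x


# Hypothetical rewiring step: a link is rewired when a random draw
# between 0 and 1 falls below the chosen rewiring probability.
rewire_probability = 0.1  # illustrative value
num_links = 10_000  # illustrative value
rng = random.Random(42)  # fixed seed so the run is repeatable

rewired = sum(rng.random() < rewire_probability for _ in range(num_links))
fraction_rewired = rewired / num_links

print(f"CDF at the threshold: {uniform_cdf(rewire_probability):.2f}")
print(f"fraction of links rewired: {fraction_rewired:.3f}")
```

Because the CDF of a uniform draw on [0, 1] at the value 0.1 is 0.1, roughly one link in ten should end up rewired.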
The PDF for the normal distribution with mean (μ) and standard deviation (σ) is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}, \quad -\infty < x < \infty$$
Standardizing this distribution such that Z = (X−μ)/σ gives
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2 / 2}, \quad -\infty < z < \infty$$
To see what standardizing does, imagine you have some data with a mean of 72 and a standard deviation of 6. In standardized form, a value of 76 gives Z = (76−72)/6 ≈ 0.67. This is the same distance from the mean as the value 0.67 in a normal distribution with a mean of zero and a standard deviation of 1 (i.e., N(0,1)), where a value of 0.67 is exactly that many standard deviations away from the mean. If you compare the exponents in f(x) and f(z), you will see that we are simply replacing nominal values with standardized ones.
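The standardization example can be sketched in Python. This assumes the usual normal PDF with mean μ and standard deviation σ; the function names `normal_pdf` and `standardize` are illustrative choices.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


def standardize(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 72.0, 6.0  # values from the example
z = standardize(76.0, mu, sigma)
print(f"Z = {z:.2f}")

# The exponent is the same in both forms; the densities differ only by
# the factor 1/sigma that rescales the horizontal axis.
density_nominal = normal_pdf(76.0, mu, sigma)
density_standard = normal_pdf(z)
print(f"f(x) = {density_nominal:.5f}, f(z) / sigma = {density_standard / sigma:.5f}")
```

The two printed densities agree, which is the numerical counterpart of swapping nominal values for standardized ones in the exponent.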