Bard College – Computer Science – Data Structures
In our first lab we will explore Java, the UNIX command line, the textbook’s standard library, and begin employing an empirical approach to algorithms that we will adopt throughout the course. Specifically, we will be looking at two common computations: shuffling an array and calculating the standard deviation; both are subtler than you might think.
You will create two Java classes:
Stats — A program that will report descriptive statistics of a set of numbers.
Hist — A program that displays a histogram of a set of numbers.
Both of these programs should be useful in analyzing and presenting the performance of various data structures over the semester.
Shuffling
Shuffling a set of objects is a common task, for example, employed in online poker. The textbook provides a shuffle method(often referred to as the Knuth of Fisher-Yates shuffle) on page 32:
The ShuffleTestclass you are provided runs an experiment you can use to compare those two methods. Read, understand and comment ShuffleTest .
Statistics
Given a set of data [x1,x2,...,xn], the first measure we often calculate is the mean or average; the mean summarizes the data with a single number. Mathematically, we can write it as:
x¯=n1∑i=1nxi
The Average class described on page 39 turns this summation into a loop(a trick we will often see). Implemented as on online, streaming, calculation, i.e., the program doesn’t know the N ahead of time, it works on a variable sized input set.
Shuffling
public static void shuffle(double[] a) {
int n = a.length;
for (int i = 0; i < n; i++) {
int r = i + uniform(n-i); // between i and n-1
double temp = a[i];
a[i] = a[r];
a[r] = temp;
}
}
public static void shuffle(double[] a) {
int n = a.length;
for (int i = 0; i < n; i++) {
int r = uniform(n); // between 0 and n-1
double temp = a[i];
a[i] = a[r];
a[r] = temp;
}
}
Statistics
public class Average {
public static void main(String[] args) {
int count = 0; // number input values
double sum = 0.0; // sum of input values
// read data and compute statistics