Lab 4.5: Evaluating Algorithms
Bard College – Computer Science – Data Structures

We’ll practice the experimental approach for evaluating algorithms outlined in section 1.4 (pg 174–179) of our textbook.

Insertion Sort (Part 1; we’ll do this together)

First, let’s develop a run-time model T(N)T(N) of InsertionSort.  Do we expect linear, quadratic, cubic, linearithmic growth? 

  1. Gather run-time statistics of varying input sizes (e.g.  1K, 2K, 4K, 8K, 16K). 
  • (Best to record many experiments per input size, why?) 

  • You can add StopWatch to the main of Insertion.java or just use BASH’s built-in time
  • In the shell type:
      export CLASSPATH="algs4.jar:."
      time java -Xint edu.princeton.cs.algs4.Insertion < 8Kints.txt
  1. Enter those findings (NN and T(N)T(N)) into the spreadsheet.
  1. Plot the running time of the algorithm (a scatter chart with standard axes and a trend line).
  1. Log-transform the data.
  1. Plot the log-transformed data and compute the best-fit line (i.e., linear trend-line). 
  • You can use the =SLOPE(D2:D16, C2:C16) and =INTERCEPT(D2:D16, C2:C16) formulas for finding the equation of the best-fit line.
  1. Report your estimate of T(N)=aNmT(N)=aN^m and use it to predict the run-times of the various input sizes (pg. 176).
  1. Reflect on how well your model summarizes and predicts (32K) run-times.

CAVEATS: One downside of this log-log line fitting method is that it does not treat all of the points the same; the smaller values, which are pretty noisy, have a disproportionate influence on the fitted model. One way to (somewhat) alleviate this problem, is to only include the data points that take > 0.25 seconds or so, which likely is a different set of inputs for different algorithms. Also, the -Xint flags prevents java from doing some fancy optimizations which will slow our programs but make building a reliable model easier.

Insertion Sort (Part 2)

Build a model of the best-case scenario for insertion sort—when it is asked to sort a set of numbers that is already sorted. 

export CLASSPATH="algs4.jar:."
java edu.princeton.cs.algs4.Insertion < 8Kints.txt > 8Kints_sorted.txt
time java -Xint edu.princeton.cs.algs4.Insertion < 8Kints_sorted.txt

How does this run-time model compare with the average case we saw earlier?

TSP Evaluation

Run this kind a similar analysis for your insertNearest, insertSmallest and insert (LIFO) methods. The TSPTimer  class is useful for this. Create three predictive models of their run-time.

If your TSP still isn’t quite working you can use the Bonus-1 below in its place.

Submission

Use Dropbox paper (or something similar) to describing your findings: include the graphs, tables of the data, and the resulting mathematical models.  Include as a PDF is your Lab 4 (TSP) submission. Also, submit your spreadsheet via Google Classroom.

BONUS-1: Reversing a String

A common programming task is building a larger string from smaller pieces. For example, imagine we want to reverse all the characters in a stream or file.

$ echo "hello" | java -Xint everse 
$ olleh

Our first instinct is to use string concatenation:

public class Reverse {
    public static void main(String[] args) {