Lab 10: IMDB as an Undirected Graph
Bard College – Computer Science – Data Structures

In this lab, we will explore undirected graphs using the book's IMDB actor-movie dataset. Specifically, we will investigate degrees of separation (pp. 548-555) between actors and movies.

Updated (2017) and larger (~50K and 300K) versions of movies.txt can be found on Google Classroom. Those files use == as the separator instead of the / used in the book's file.

Questions

For each question, supply an answer along with an explanation of how you arrived at that answer. 

  1. How many movies has Kevin Bacon acted in? [SymbolGraph.java]
  1. How many actors have co-starred with Kevin Bacon?
  1. Co-Stars. (a) Find all the actors who have co-starred with Kevin Bacon at least twice. (b) What pair of actors has co-starred most often?
  1. Find the Bacon number of every actor in the graph. Who has the largest and second-largest Bacon numbers? [BaconHistogram.java] and [DegreesOfSeparation.java]
  1. "Center of the Hollywood universe. We can measure how good of a center that Kevin Bacon is by computing their Hollywood number. The Hollywood number of Kevin Bacon is the average Bacon number of all the actors. The Hollywood number of another actor is computed the same way, but we make them be the source instead of Kevin Bacon. Compute Kevin Bacon's Hollywood number and find an actor and actress with better Hollywood numbers." (From Sedgewick and Wayne)
  1. Discover something else interesting about Hollywood using this data set.
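
The average in the Hollywood number question is a weighted one: each Bacon number counts once per actor who has it. A minimal sketch of that arithmetic, assuming a made-up two-column histogram file (histogram.txt is hypothetical and is not the actual output format of BaconHistogram.java). Whether to count the source actor (Bacon number 0) and how to treat actors who cannot be reached at all are conventions you should state in your report; here the source is included and unreachable actors are left out.

```shell
# Made-up histogram: "<Bacon number> <how many actors have it>" per line.
cat > histogram.txt <<'EOF'
0 1
1 3
2 6
EOF

# Weighted average: sum(number * count) / sum(count).
hollywood=$(awk '{ total += $1 * $2; actors += $2 }
                 END { printf "%.2f", total / actors }' histogram.txt)
echo "$hollywood"
```

With these made-up counts the sum is 0*1 + 1*3 + 2*6 = 15 over 10 actors.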

Submission

Submit a PDF of your lab report, named lab10.pdf, via Moodle.

Some Unix Tips

Viewing the whole file:
cat movies.txt

Paging through the file one screen at a time (press q to quit):
less movies.txt

Counting the number of lines in a file:
wc -l movies.txt

Looking at the first 7 lines of a file:
head -7 movies.txt

Looking at the last 7 lines of a file:
tail -7 movies.txt

Searching a file for a term:
grep "Footloose (1984)" movies.txt

uniq is another useful command-line tool (type man uniq to learn about it).

Of course, all these tools can be composed together with PIPES (|) or FILE REDIRECTION (< and >). UNIX ROCKS!
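
As a small illustration of composing these tools, the sketch below builds a tiny sample file in the book's layout (one movie per line, fields separated by /) and chains grep, wc, cut, tr, sort, and uniq. The file names sample.txt and counts.txt and every title and name in them are made up for the example; for the updated files the separator is == rather than /, so the cut and tr steps would need adjusting.

```shell
# Build a tiny, made-up sample in the "Movie/Actor/Actor..." layout.
cat > sample.txt <<'EOF'
Movie A (2000)/Bacon, Kevin/Doe, Jane
Movie B (2001)/Doe, Jane/Roe, Richard
Movie C (2002)/Bacon, Kevin/Doe, Jane/Roe, Richard
EOF

# grep selects the lines mentioning a name, wc -l counts them.
# (tr -d ' ' strips the padding some versions of wc add.)
bacon_movies=$(grep "Bacon, Kevin" sample.txt | wc -l | tr -d ' ')
echo "$bacon_movies"

# Count how many movies each name appears in:
#   cut drops the movie title, tr puts one name per line,
#   sort makes equal names adjacent (uniq only merges adjacent lines),
#   uniq -c counts each group, and sort -rn lists the largest count first.
cut -d/ -f2- sample.txt | tr '/' '\n' | sort | uniq -c | sort -rn > counts.txt
cat counts.txt
```

A quick count like this is a handy sanity check on the numbers your Java programs report, though a plain grep will also match any longer name that contains the search text.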

You can send the output of a program to a file, or get the input from a file.
echo "Hello" > out.txt
rev < out.txt

You can send input to a program with echo and a pipe rather than typing it manually:
echo "Hello" | rev