Lab 9: A More Perfect Hash Function
Bard College – Computer Science – Data Structures

In this lab, we will explore hash functions. You are provided with three different kinds of hash functions: modular hashing, Pearson hashing, and Java 1.1 String hashing; and three different files: a dictionary, a novel, and DNA strings (you are encouraged to try your own file). Your task is to find which hash function(s) work well for those applications.
 
The below images show how different keys are distributed over a hash-table. The second is more uniformly distributed, with fewer hash collisions, and thus preferred. Neither is perfect.

Hash Functions

Modular Hashing
The first kind of hash function we will consider is modular hashing (what r works well?):

    private static int mhash(String k, int r){
        int h = 0;
        for (int i = 0; i < k.length(); i ++){
            h = (h * r) + k.charAt(i);
        }
        return h;
    }

Pearson Hashing
Next, we'll consider something known as Pearson hashing [see the reading handed out in class]:
 
    private static int phash(String k){
        int h = 0;
        for (int i = 0; i < k.length(); i++) {
            int index = h ^ k.charAt(i);
            h = T[index];
        }
        return h;
    }

Pearson hashing assumes a table of 256 permutations is available, for example:

    final static int[] T = {
        98,  6, 85,150, 36, 23,112,164,135,207,169,  5, 26, 64,165,219, //  1
        61, 20, 68, 89,130, 63, 52,102, 24,229,132,245, 80,216,195,115, //  2
        90,168,156,203,177,120,  2,190,188,  7,100,185,174,243,162, 10, //  3
        237, 18,253,225,  8,208,172,244,255,126,101, 79,145,235,228,121, // 4
        123,251, 67,250,161,  0,107, 97,241,111,181, 82,249, 33, 69, 55, // 5
        59,153, 29,  9,213,167, 84, 93, 30, 46, 94, 75,151,114, 73,222, //  6
        197, 96,210, 45, 16,227,248,202, 51,152,252,125, 81,206,215,186, //  7
        39,158,178,187,131,136,  1, 49, 50, 17,141, 91, 47,129, 60, 99, //  8
        154, 35, 86,171,105, 34, 38,200,147, 58, 77,118,173,246, 76,254, //  9
        133,232,196,144,198,124, 53,  4,108, 74,223,234,134,230,157,139, // 10
        189,205,199,128,176, 19,211,236,127,192,231, 70,233, 88,146, 44, // 11
        183,201, 22, 83, 13,214,116,109,159, 32, 95,226,140,220, 57, 12, // 12
        221, 31,209,182,143, 92,149,184,148, 62,113, 65, 37, 27,106,166, // 13
        3, 14,204, 72, 21, 41, 56, 66, 28,193, 40,217, 25, 54,179,117, // 14
        238, 87,240,155,180,170,242,212,191,163, 78,218,137,194,175,110, // 15