Bard College – Computer Science – Data Structures
In this lab, we will explore hash functions. You are provided with three different kinds of hash functions: modular hashing, Pearson hashing, and Java 1.1 String hashing; and three different files: a dictionary, a novel, and DNA strings(you are encouraged to try your own file). Your task is to find which hash function(s) work well for those applications.
The below images show how different keys are distributed over a hash-table. The second is more uniformly distributed, with fewer hash collisions, and thus preferred. Neither is perfect.
Hash Functions
Modular Hashing
The first kind of hash function we will consider is modular hashing(whatr works well?):
privatestaticintmhash(String k, int r){
int h = 0;
for (int i = 0; i < k.length(); i ++){
h = (h * r) + k.charAt(i);
}
return h;
}
Pearson Hashing
Next, we'll consider something known as Pearson hashing[see the reading handed out in class]:
privatestaticintphash(String k){
int h = 0;
for (int i = 0; i < k.length(); i++) {
int index = h ^ k.charAt(i);
h = T[index];
}
return h;
}
Pearson hashing assumes a table of 256 permutations is available, for example:
Hash Functions
private static int mhash(String k, int r){
int h = 0;
for (int i = 0; i < k.length(); i ++){
h = (h * r) + k.charAt(i);
}
return h;
}
private static int phash(String k){
int h = 0;
for (int i = 0; i < k.length(); i++) {
int index = h ^ k.charAt(i);
h = T[index];
}
return h;
}
final static int[] T = {
98, 6, 85,150, 36, 23,112,164,135,207,169, 5, 26, 64,165,219, // 1
61, 20, 68, 89,130, 63, 52,102, 24,229,132,245, 80,216,195,115, // 2
90,168,156,203,177,120, 2,190,188, 7,100,185,174,243,162, 10, // 3
237, 18,253,225, 8,208,172,244,255,126,101, 79,145,235,228,121, // 4
123,251, 67,250,161, 0,107, 97,241,111,181, 82,249, 33, 69, 55, // 5
59,153, 29, 9,213,167, 84, 93, 30, 46, 94, 75,151,114, 73,222, // 6
197, 96,210, 45, 16,227,248,202, 51,152,252,125, 81,206,215,186, // 7
39,158,178,187,131,136, 1, 49, 50, 17,141, 91, 47,129, 60, 99, // 8
154, 35, 86,171,105, 34, 38,200,147, 58, 77,118,173,246, 76,254, // 9
133,232,196,144,198,124, 53, 4,108, 74,223,234,134,230,157,139, // 10
189,205,199,128,176, 19,211,236,127,192,231, 70,233, 88,146, 44, // 11
183,201, 22, 83, 13,214,116,109,159, 32, 95,226,140,220, 57, 12, // 12
221, 31,209,182,143, 92,149,184,148, 62,113, 65, 37, 27,106,166, // 13
3, 14,204, 72, 21, 41, 56, 66, 28,193, 40,217, 25, 54,179,117, // 14
238, 87,240,155,180,170,242,212,191,163, 78,218,137,194,175,110, // 15