Bard College – Computer Science – Data Structures
In this lab, we will explore 8-bit hash functions and hash tables. 8-bit hash functions have applications in resource constrained environments, and are also easier to fully consider than the 32-bit hash functions we discussed in class.[2^8(256) is smaller than 2^32(4294967296)].
Your task is to find hash function(s) that work well. You are provided with four different domains: random strings, random words from a dictionary, random words from a short story, and random DNA strings; but, you are encouraged to think of your own domain.
The below images show how different keys are distributed over a hash-table of size 256. The second is more uniformly distributed, with less hash collisions, and thus preferred.
Hash Functions
Modular Hashing
The first kind of hash function we will consider is modular hashing(similar to the 32-bit variable length hash function in the text, but limited to 8 bits):
privatestaticintmhash(String k, int r){
int h = 0;
for (int i = 0; i < k.length(); i ++){
h = (h * r) + k.charAt(i);
}
return (h & 0xFF); // trim to 8-bits
}
Pearson Hashing
Next, we'll consider something known as Pearson hashing[see the reading handed out in class]:
privatestaticintphash(String k){
int h = 0;
for (int i = 0; i < k.length(); i++) {
int index = h ^ k.charAt(i);
h = T[index];
}
return (h & 0xFF); // trim to 8-bits
}
Pearson hashing assumes a table of 256 permutations is available, for example:
Hash Functions
private static int mhash(String k, int r){
int h = 0;
for (int i = 0; i < k.length(); i ++){
h = (h * r) + k.charAt(i);
}
return (h & 0xFF); // trim to 8-bits
}
private static int phash(String k){
int h = 0;
for (int i = 0; i < k.length(); i++) {
int index = h ^ k.charAt(i);
h = T[index];
}
return (h & 0xFF); // trim to 8-bits
}
final static int[] T = {
98, 6, 85,150, 36, 23,112,164,135,207,169, 5, 26, 64,165,219, // 1
61, 20, 68, 89,130, 63, 52,102, 24,229,132,245, 80,216,195,115, // 2
90,168,156,203,177,120, 2,190,188, 7,100,185,174,243,162, 10, // 3
237, 18,253,225, 8,208,172,244,255,126,101, 79,145,235,228,121, // 4
123,251, 67,250,161, 0,107, 97,241,111,181, 82,249, 33, 69, 55, // 5
59,153, 29, 9,213,167, 84, 93, 30, 46, 94, 75,151,114, 73,222, // 6
197, 96,210, 45, 16,227,248,202, 51,152,252,125, 81,206,215,186, // 7
39,158,178,187,131,136, 1, 49, 50, 17,141, 91, 47,129, 60, 99, // 8
154, 35, 86,171,105, 34, 38,200,147, 58, 77,118,173,246, 76,254, // 9
133,232,196,144,198,124, 53, 4,108, 74,223,234,134,230,157,139, // 10
189,205,199,128,176, 19,211,236,127,192,231, 70,233, 88,146, 44, // 11
183,201, 22, 83, 13,214,116,109,159, 32, 95,226,140,220, 57, 12, // 12
221, 31,209,182,143, 92,149,184,148, 62,113, 65, 37, 27,106,166, // 13