Analysis of the hashing methods.
Lets discuss hashing, hash functions, collision and different collision resolution techniques.
What is hashing?
Hashing is a technique used for storing and retrieving information quickly. It is used to perform optimal searches. It is done for faster access to elements. The efficiency of mapping depends on the efficiency of the hash function used.
Hash functions are used to map each key to a different address space, but practically it is not possible to create such a hash function and the problem is called collision.
What is this collision?
Collision is the condition where two records are stored in the same location or we can say that it is a situation that occurs when two distinct pieces of data have the same hash value.
How to resolve this collision?
The process of finding an alternate location is called collision resolution. Even though hash tables have collision problems, they are more efficient in many cases compared to all other data structures, like search trees. There are a number of collision resolution techniques, and the most popular are direct chaining and open addressing.
1. Direct Chaining :- A collision resolution technique in which the hash table is an array of linked lists. Each list contains all the items with the same hash value.
- Separate Chaining โ> Collision resolution by chaining combines linked representation with hash table. When two or more records hash to the same location, the records are constituted into a list called chain.
2. Open Addressing --> It is an array-based implementation. In open addressing all keys are stored in the hash table itself. This approach is also known as closed hashing and this procedure is based on probing. A collision is resolved by probing. Techniques used under this addressing are:
a.) Linear Probing (linear search) : In linear probing, we search the hash table sequentially, starting from the original hash location. If a location is occupied, we check the next location. The function for rehashing is the following:
rehash(key) = (n + 1)% tablesize
b.) Quadratic Probing (nonlinear search) : In quadratic probing, we start from the original hash location - i. If a location is occupied, we check the locations : (i + 12), (i +22 ), (i + 32), (i + 42).The function for rehashing is the following:
rehash(key) = (n + k2)% tablesize
c.) Double Hashing (use two hash functions) : It is a computer programming technique used in conjunction with open-addressing in hash tables to resolve hash collisions. In double hashing, the interval between probes is computed by another hash function. Double hashing reduces clustering in a better way than linear and quadric probing.
A basic overview (Comparison Chart)-