Rabin Karp Algo C++

Rabin Karp Algorithm (Rolling Hash)

Rabin-Karp string searching algorithm calculates a numerical (hash) value for the pattern p, and for each m-character substring of text t. Then it compares the numerical values instead of comparing the actual symbols. If any match is found, it compares the pattern with the substring by naive approach.

Lets understand this with the help of an example:- 

  • Input:  aaaaab
    Output: aab

First will calculate the hashcode using hashfunction and iterate throughout the input string and find if the same hashcode is present if its present then we will check for the actual pattern.

Rabin Karp Algorithm 
Rabin karp algorithm

Algorithm

  • It’s based on rolling hash. Suppose we have a hash function that take string as input  and it gives a unique hash.
  • First we need to create a hash using hashfunction. 
  • Lets understand this using example:- 
  • Main string – abcde
  • Pattern to find – abc 
  • To calculate the hash of the next substring, subtract the last most value and add the hash value of the new charater. 
  • hash(abc) = 97 * 30 + 98*3+ 99 * 3 2    
  • Subtract the first ASCII character 
  • hash(bcd) = 97 * 30 + 98*3+ 99 * 3 2    – 99

                              =  98*3+ 99 * 3 2

  • Divide the whole by the prime number 

          =  (98*3+ 99 * 3 2) / 3 

          =  98*3+ 99 * 3 1

  • Multiply the ASCII of the last character with the power of prime raise to the power length of the substring -1 and then add it to the new hash.

          =  98*3+ 99 * 3 1 + 100 * 3 2

  • And this will be continued to find all new hashes.

C++ Code :-

#include <bits/stdc++.h>
#define ll long long int
#define prime 119
using namespace std;

ll createhash(string str, int n) {
   ll result = 0;
   for (int i = 0; i < n; i++) {
       result += (ll)(str[i] * (ll)pow(prime, i));
   }
   return result;
}
ll recalculate_hash(string str, int oldindex, int newindex, ll oldhash, int pathlength) {
   ll newhash = oldhash - str[oldindex];
   newhash = newhash / prime;
   newhash += (ll)(str[newindex] * (ll)(pow(prime, pathlength - 1)));
   return newhash;
}
bool checkequal(string str1, string str2, int start1, int end1, int start2, int end2) {
   if (end1 - start1 != end2 - start2) return false;
   while (start1 <= end1 and start2 <= end2) {
       if (str1[start1] != str2[start2]) return false;
       start1++;
       start2++;
   }
   return true;
}
int main() {
   string str = "ababcabdab";
   string pat = "abd";
   // this will give the initial hash value of the string and the pattern
   ll pathash createhash(pat, pat.length());
   ll strhash createhash(str, pat.length());
   for (int i = 0; i <= str.length() - pat.length() - 1; i++) {
       if (pathash == strhash and
           checkequal(str, pat, i, i + pat.length() - 1, 0, pat.length() - 1)) {
           cout << i << endl;
       }
       if (i < str.length() - pat.length()) {
           strhash = recalculate_hash(str, i, i + pat.length(), strhash, pat.length());
       }
   }
   return 0;
}

Output:-

5