Edit Distance

Edit Distance

In the edit distance problem, we have to find the minimum number of operations required to convert one string to another. It is a very famous interview problem asked many times in coding interviews as well as in coding rounds. In this article, we provide a c++ solution with an explanation of the problem.

Edit Distance Problem Desciption

Edit Distance Problem Description

We are given two strings A and B. We have to perform minimum number of operations on A to convert it to string B. The allowed operations are –

  • Insert a character.
  • Replace a character by any other character.
  • Delete a character.

Input:

  • First line contains string A.
  • Second line contains string B.

Output:

The minimum number of operations.

Edit Distance Problem Solution Explanation

Solution (Top Down):

To build a top down solution we must follow the following steps –

  • Break Down the Problem – The problem can be break down as follows. Consider a function F(i,j) that returns the minimum number of operations required to convert A[0…i] to B[0…j]. There are two options –

if(A[i]==B[j])

F(i,j) = F(i-1,j-1) // No operation is needed

else

Option 1 =1+ F(i,j-1) // Insert B[j] in A

Option 2 =1+ F(i-1,j) // Remove A[i]

Option 3 = 1 + F(i-1,j-1) // Replace A[i] with B[j]

Now we simply return minimum of all these options.

  • Find the base case – The base case in this problem is quite simple. It is when we have no characters left in any one string.
  • Check for optimal substructure and overlapping sub problems – It is easy to see that the problem can be broken down into smaller problems. The optimal substructure and overlapping sub problems properties can be visualized below.
Edit Distance subproblem

Bottom up Solution

In previous solution we call the function f(n-1,m-1) {n length of first string, m length of second string}.

Here instead of breaking problem into smaller problems we first solve smaller problems and iteratively use them to solve bigger problems.

Code

#include <bits/stdc++.h>
using namespace std;
int memo[105][105];
int f(string s1string s2int nint m)
{
    // Since the first string is empty, we insert all charaters of second string
    if (n == 0)
        return m;
    // Since the second string is empty, we remove all charaters of first string
    if (m == 0)
        return n;
    // We have a marching charator we ignore these positions
    if (s1[n – 1] == s2[m – 1])
        return f(s1s2n – 1m – 1);
    // When charaters are not matching
    int op1 = 1 + f(s1s2nm – 1);     // Insert Operation
    int op2 = 1 + f(s1s2n – 1m);     // Delete Operation
    int op3 = 1 + f(s1s2n – 1m – 1); // Replace
    return min(op1min(op2op3));
}
/*
dp table ->
   ” d e c e m b e r 
” 0  1 2 3 4 5 6 7 8
j  1  1 2 3 4 5 6 7 8
a  2  2 2 3 4 5 6 7 8
n  3  3 3 3 4 5 6 7 8
u  4  4 4 4 4 5 6 7 8
a  5  5 5 5 5 5 6 7 8
r  6  6 6 6 6 6 6 7 7
y  7  7 7 7 7 7 7 7 8
*/
int editDistanceBottomUp(string s1string s2)
{
    int n = s1.length();
    int m = s2.length();
    int dp[n + 1][m + 1];
    for (int i = 0i <= ni++)
    {
        for (int j = 0j <= mj++)
        {
            if (i == 0) // First String is Empty
                dp[i][j] = j;
            else if (j == 0) // Second String is Empty
                dp[i][j] = i;
            else if (s1[i – 1] == s2[j – 1]) // Matching characters
                dp[i][j] = dp[i – 1][j – 1];
            else // Not matching
                dp[i][j] = 1 + min(dp[i][j – 1], min(dp[i – 1][j], dp[i – 1][j – 1]));
        }
    }
    return dp[n][m];
}
int main()
{
    string s1 = “january”;
    string s2 = “december”;
    memset(memo, –1, sizeof memo);
    cout << “Top Down Answer “ << f(s1s2s1.length(), s2.length()) << endl;
    cout << “Bottom Up Answer “ << editDistanceBottomUp(s1s2<< endl;
    return 0;
}

Output

Top Down Answer 8
Bottom Up Answer 8