Longest Common Subsequence

Longest Common Subsequence

In Longest Common Subsequence we have to find the common subsequence of two strings. It a very famous question and it is asked many time in coding tests and interviews. In this article we provide a c++ implementation with Explanation to the problem.

Longest Common Subsequence Problem Description

Longest Common Subsequence Problem Description

In this problem we are given two input strings consisting of English alphabets. We have to return the length of longest common subsequence possible. A subsequence is a string that is formed by removing any number of characters (including 0) from the input string.

Input:

  • First line containing input string1.
  • Second line containing input string2.

Output:

  • Longest length required.

Longest Common Subsequence Solution

Solution 1 (Memoization):

  • Break Down the Problem – In this step we try to break down the problem into smaller problems and assume recursion will solve the smaller problems.

F(i,j) = Length of longest common subsequence of s1 till i and s2 till j.

 Now there are two cases possible:

F(i,j) = 1+ F(i-1,j-1)   { if s1[i]==s2[j]}

Or

F(i,j) = max(F(i-1,j),F(i,j-1)) { if s1[i]!=s2[j]}

Longest Common Subsequence Problem Breakdown
  • Find the base case –  What can be base case? It’s easy, find the solution to smallest known problem. Here we know solution when we reach the end of any strings ie we have no characters to be match we simply return 0.
  • Check for optimal substructure and overlapping sub problems – It is easy to see that the problem can be broken down into smaller problems. The optimal substructure and overlapping sub problems properties can be visualized below.
Longest Common Subsequence sub problems

Code

#include <bits/stdc++.h>
using namespace std;
const int N = 1005;
string ab;
int memo[N][N];
int lcs(int iint j)
{
    if (i == 0 || j == 0) // If we reached the end of any one of string
        return 0;

    if (memo[i][j] != –1) // Checking if we have calculated the subproblem previously
        return memo[i][j];

    if (a[i – 1] == b[j – 1]) // If charactors are same
        return 1 + lcs(i – 1j – 1);
    // If charactors are not same
    int op1 = lcs(i – 1j);
    int op2 = lcs(ij – 1);

    return memo[i][j] = max(op1op2);
}

int main()
{
    cin >> a >> b;
    memset(memo, –1, sizeof memo);
    cout << lcs(a.length(), b.length()) << endl;

    return 0;
}

Output

axbca
aybza
3

Solution (Bottom Up Dynamic Programming)

In above solution we created a function that recursively solves smaller problems. Here we create a dp table and solve for each i,j starting from (0,0) to (s1.length(),s2.length()). Refer to code for more details.

Code

#include <bits/stdc++.h>
using namespace std;
const int N = 1005;
string ab;
/*
    dp table
    ‘ ‘  a  x  b  c  a
 ‘ ‘ 0   0  0  0  0  0
  a  0   1  1  1  1  1
  y  0
  b  0
  z  0
  a  0
*/
int lcs()
{
    int n = a.length();
    int m = b.length();
    int dp[n + 1][m + 1];
    for (int i = 0i <= ni++)
    {
        for (int j = 0j <= mj++)
        {
            if (i == 0 || j == 0) // If one string is empty
            {
                dp[i][j] = 0;
            }
            else if (a[i – 1] == b[j – 1]) // If charactors are same
            {
                dp[i][j] = 1 + dp[i – 1][j – 1];
            }
            else // If charactors are not same
            {
                dp[i][j] = max(dp[i][j – 1], dp[i – 1][j]);
            }
        }
    }
    return dp[n][m];
}
int main()
{
    cin >> a >> b;
    cout << lcs() << endl;

    return 0;
}

Output

axbca
aybza
3