Background on Dynamic Programming |
|
The Longest Common Subseqence Problem
A subsequence of an n-character sequence, S, is obtained by deleting 0,1,2, or more characters
from S. Here is an example of some subsequences of the word, residency: res, sdncy,
reden, and many more. Given two sequences, A = [a1, a2, ..., am ]
and B = [b1, b2, ..., bn], len(i, j) denotes the longest subsequence
common to [a1, a2, ..., ai ] and [b1, b2, ..., bj], (note the subscripts, j <= m and i <= n).
The dynamic programming formulation of the problem can be shown as a recurrence relation:
Computation of the Longest Common Subsequence
Here is the pseudocode (from [1]) for an O(mn) longest common subsequence script:
for(i = 0 to i = m) do length(i, 0) = 0
for (i = l to i = n) do length(0, j) = 0
for (i = l to i = m) do
{
for (j = l to j = n) do
{
if (a[i] = b[j])
{
length(i, j) = length(i - 1, j - 1) + 1
traceback(i, j) = diag;
}
else if( length(i - 1, j) >= length(i, j - 1)
{
length(i, j) = length(i - 1, j)
traceback(i, j) = vertical
}
else
{
length(i, j) = length(i, j - 1)
traceback(i, j) = horizontal
}
}
}
return length and traceback
Common Subsequences between residency and redundancy
|
|
Sources:
[1] Chao, K.-M., "Dynamic Programming Strategies for Analyzing Biomolecular Sequences." Selected
Topics in Post-Genome Knowledge Discovery. Eds. Wong, L. and Zhang, L., Singapore University Press. 2004.
|
|