Subsequence
|
In mathematics, a subsequence of some sequence is a new sequence which is formed from the original sequence by deleting some of the elements without disturbing the relative positions of the remaining elements.
Formally, suppose that X is a set and that (ak)k ∈ K is a sequence in X, where K = {1,2,3,...,n} if (ak) is a finite sequence and K = N if (ak) is an infinite sequence. Then, a subsequence of (ak) is a sequence of the form
- <math> (a_{n_r}) <math>
where (nr) is a strictly increasing sequence in the index set K.
Example
As an example,
- <math> < B,C,D,B > <math>
is a subsequence of
- <math> < A,C,B,D,E,G,C,E,D,B,G > <math>,
with corresponding index sequence <3,7,9,10>.
Given two sequences X and Y, a sequence G is said to be a common subsequence of X and Y, if G is a subsequence of both X and Y. For example, if
- <math> X = < A,C,B,D,E,G,C,E,D,B,G > <math> and
- <math> Y = < B,E,G,C,F,E,U,B,K > <math>
then common subsequence of X and Y could be
- <math> G = < B,E,E > <math>
This would not be the longest common subsequence, since G only has length 3, and the common subsequence < B,E,E,B > has length 4. The longest common subsequence of X and Y is < B,E,G,C,E,B >
Applications
Subsequences have applications to computer science, especially in the discipline of Bioinformatics, where computers are used to compare, analyze, and store DNA strands.
Take two strands of DNA, say
ORG1 = ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA
ORG2 = CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA
Subsequences are used to determine how similar the two strands of DNA are, using the DNA bases: adenine, guanine, cytosine and thymine.