
My name is 江奕賢

Tuesday, January 16, 2007

pattern search

Pattern search tool (seems to have no limit):

http://bioportal.cgb.indiana.edu/cgi-bin/emboss/fuzzpro

[arndcqeghilkmfpstwyv]

>d1bwwa_ b.1.1.1 (A:) Immunoglobulin light chain kappa variable domain, VL-kappa {Human (Homo sapiens), cluster 1}
tpdiqmtqspsslsasvgdrvtitcqasqdiikylnwyqqkpgkapklliyeasnlqagv
psrfsgsgsgtdytftisslqpediatyycqqyqslpytfgqgtklqit
>d1eeqa_ b.1.1.1 (A:) Immunoglobulin light chain kappa variable domain, VL-kappa {Human (Homo sapiens), cluster 2}
divltqspdslavslgeratinckssqsvldssnsknylawyqqkpgqppklliywastr
esgvpdrfsgsgsgtdftltisslqaedvavyycqqyyshpysfgqgtkleik
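
The bracketed pattern above lists the 20 standard one-letter amino acid codes, so it should match any single residue. As a rough sanity check, here is a small sketch that uses Python's `re` module as a stand-in (this is an assumption for illustration, not fuzzpro's own pattern engine) and runs the pattern over the first sequence:

```python
import re

# Character class from the post: the 20 standard amino acid codes.
AMINO_ACIDS = re.compile(r"[arndcqeghilkmfpstwyv]")

# Sequence d1bwwa_ from the FASTA record above, joined into one string.
seq = (
    "tpdiqmtqspsslsasvgdrvtitcqasqdiikylnwyqqkpgkapklliyeasnlqagv"
    "psrfsgsgsgtdytftisslqpediatyycqqyqslpytfgqgtklqit"
)

# Start position of every single-residue match.
hits = [m.start() for m in AMINO_ACIDS.finditer(seq)]

# Every position matches, because the class covers all residues in the sequence.
print(len(hits) == len(seq))
```

A pattern this permissive matches everywhere, which makes it a handy stress test for a search tool but not a useful motif.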


It's not possible to report where the pattern fails to match, because that is not a well-posed question.
Consider searching for 100 in 10110: where does the mismatch occur?
Depending on where you line the pattern up against the text, the mismatch falls at a different position each time.
Usually the question would instead be: what is the minimal edit distance between the two strings?
But that question does not work in our case either,
because dynamic programming effectively considers every possible alignment and computes a one-to-one distance for each pair of strings.
And our case is not comparing one string with another string; we compare one string with one pattern, and that pattern can stand for too many possible sequences.
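
To make the 100 versus 10110 example concrete, here is a minimal sketch in Python. The helper names `mismatch_positions` and `edit_distance` are made up for this illustration. The first lists the mismatching positions for every alignment, and the second is the usual string-to-string dynamic programming (Levenshtein) distance, which only works when both sides are concrete strings:

```python
def mismatch_positions(pattern, text):
    """For each alignment offset, list the pattern indices that disagree with the text."""
    results = {}
    for offset in range(len(text) - len(pattern) + 1):
        window = text[offset:offset + len(pattern)]
        results[offset] = [
            i for i, (p, t) in enumerate(zip(pattern, window)) if p != t
        ]
    return results


def edit_distance(a, b):
    """Levenshtein distance between two concrete strings, row by row."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(mismatch_positions("100", "10110"))  # {0: [2], 1: [0, 1, 2], 2: [1]}
print(edit_distance("100", "10110"))       # 2
```

Offsets 0 and 2 both give exactly one mismatch, but at different pattern positions (2 and 1), so "where is the mismatch" has no single answer even for two plain strings.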

For pattern search algorithms, see:

Pattern Discovery from Biosequences
ISBN 952-10-0792-3 (paperback)

ISBN 952-10-0819-9 (PDF)
http://ethesis.helsinki.fi/julkaisut/mat/tieto/vk/vilo/patternd.pdf

pp. 67-76
