© 1996 by Kazusa DNA Research Institute
Enlarged Similarity of Nucleic Acid Sequences
117312, Bioengineering Center, Russian Academy of Sciences 60 Oktybrya prospect, 7/1, Moscow, Russia and Department of Cybernetics, Moscow Physical Engineering Institute 115409, Kashirskoe chosse, 31, Moscow, Russia
* To whom correspondence should be addressed. Tel. +7-095-430-32-70, Fax. +7-095-135-05-71, E-mail: korotkov{at}biengi.msk.su
The concept of nucleic acid sequence base alterations is presented. The number of base alterations is established for sequences of different lengths. A definition of the "enlarged similarity" of nucleic acid sequences, based on sequence base alterations, is introduced. The mutual information between two compared sequences is used as a quantitative measure of their enlarged similarity. A method for calculating the mutual information that takes into account the correlation of bases in the compared sequences is developed. Definitions of correlated similarity and evolution similarity between compared sequences are given. Results of applying the enlarged similarity approach to the analysis of DNA sequences are discussed.
Key words: DNA sequence; computer analysis; sequence base alteration; mutual information; enlarged similarity
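As an illustration of the quantity named in the abstract, the following minimal sketch estimates the mutual information between two equal-length sequences from position-wise base pair frequencies. It is the standard plug-in estimate, not the calculation method developed in the paper, which additionally accounts for the correlation of bases; the function name `mutual_information` and the example sequences are assumptions made for this sketch.

```python
from collections import Counter
from math import log2


def mutual_information(seq_a: str, seq_b: str) -> float:
    """Plug-in estimate, in bits, of the mutual information between two
    equal-length sequences compared position by position.

    Illustrative only: not the paper's method.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    n = len(seq_a)
    if n == 0:
        return 0.0

    # Counts of base pairs at matching positions and of single bases.
    pair_counts = Counter(zip(seq_a, seq_b))
    a_counts = Counter(seq_a)
    b_counts = Counter(seq_b)

    # I = sum over pairs of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    mi = 0.0
    for (a, b), n_ab in pair_counts.items():
        mi += (n_ab / n) * log2(n_ab * n / (a_counts[a] * b_counts[b]))
    return mi


# A sequence and a copy in which every base is consistently replaced
# (A->T, C->G, G->C, T->A) share no identical positions, yet the
# mutual information is as high as for two identical sequences.
original = "ACGT" * 5
altered = "TGCA" * 5
print(mutual_information(original, original))
print(mutual_information(original, altered))
```

In this sketch a consistent replacement of bases leaves the mutual information unchanged, which is why a measure of this kind can register similarity between sequences that an identity-based comparison would score as unrelated.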