
Original scientific paper

A New Similarity/Diversity Measure for the Characterization of DNA Sequences

Roberto Todeschini ; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Davide Ballabio ; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Viviana Consonni ; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Andrea Mauri ; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy



page 657-664



Abstract

In this paper, a new similarity/diversity measure is proposed for the analysis of sequential
data, where useful information can also be obtained from the ordering relationships
between the sequence elements. The methodology has been applied to characterize DNA sequences
by evaluating their similarity/diversity. The proposed distance (the weighted standardized
Hasse distance) is evaluated between pairs of Hasse matrices derived from the classical
partial ordering rules. It can be naturally standardized, which allows these distances to be
interpreted as absolute values (e.g. percentages) and simple similarity and correlation indices
to be derived. DNA sequences taken from the first exons of the beta-globin genes of eight
different species have been analyzed. A sensitivity analysis has also been performed, showing
that the measure is highly capable of accounting for small modifications of the DNA sequences.
Finally, a comparison with results from the literature is given.
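
As a rough illustration of the idea of comparing two Hasse matrices, the sketch below builds a Hasse-style matrix from the usual dominance (partial ordering) rule and computes a distance scaled to the range 0 to 1. It is not the definition given in the paper: the abstract does not state how a DNA sequence is encoded, nor how the weighting is applied, so the input data, the function names `hasse_matrix` and `standardized_distance`, and the unweighted scaling by the maximum possible difference are all assumptions made for this example.

```python
import numpy as np


def hasse_matrix(X):
    """Build an n x n dominance matrix from an n x p data matrix.

    Entry (i, j) is +1 if row i dominates row j (>= on every column and
    > on at least one), -1 if row j dominates row i, and 0 otherwise
    (equal or incomparable rows). This encoding is an assumption.
    """
    X = np.asarray(X, dtype=float)
    ge = (X[:, None, :] >= X[None, :, :]).all(axis=2)  # i >= j on every column
    gt = (X[:, None, :] > X[None, :, :]).any(axis=2)   # i > j on some column
    dominates = ge & gt
    return dominates.astype(int) - dominates.T.astype(int)


def standardized_distance(H1, H2):
    """Unweighted distance between two Hasse-style matrices, scaled to [0, 1].

    Each of the n(n-1) off-diagonal entries can differ by at most 2,
    so the sum of absolute differences is divided by 2n(n-1).
    The weighting used in the paper is NOT reproduced here.
    """
    H1 = np.asarray(H1)
    H2 = np.asarray(H2)
    n = H1.shape[0]
    if n < 2:
        return 0.0
    return float(np.abs(H1 - H2).sum() / (2 * n * (n - 1)))


# Hypothetical data: three elements described by two numeric criteria.
X_a = np.array([[1, 1], [2, 2], [3, 3]])  # fully ordered
X_b = -X_a                                # same elements, order reversed
d_same = standardized_distance(hasse_matrix(X_a), hasse_matrix(X_a))
d_reversed = standardized_distance(hasse_matrix(X_a), hasse_matrix(X_b))
```

Under these assumptions, identical orderings give a distance of 0 and completely reversed orderings give 1, which is what makes a percentage reading possible. A similarity index could then be taken as one minus the distance.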

Keywords

DNA; partial ordering; Hasse matrix; distances; similarity/diversity; rank correlation

Hrčak ID:

31193

URI

https://hrcak.srce.hr/31193

Publication date:

31.12.2008.
