Croatica Chemica Acta, Vol. 81 No. 4, 2008.
Original scientific paper
A New Similarity/Diversity Measure for the Characterization of DNA Sequences
Roberto Todeschini; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Davide Ballabio; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Viviana Consonni; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Andrea Mauri; Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, Milano, Italy
Abstract
In this paper, a new similarity/diversity measure is proposed for the analysis of sequential
data, where useful information can also be obtained from the ordering relationships
between the sequence elements. The methodology has been applied to characterize DNA sequences
by evaluating their similarity/diversity. The proposed distance (the weighted standardized
Hasse distance) is evaluated between pairs of Hasse matrices derived from the classical
partial ordering rules. It can be naturally standardized, which allows these distances to be
interpreted as absolute values (e.g. percentages) and simple similarity and correlation indices
to be derived. DNA sequences taken from the first exons of the beta-globin genes of eight
different species have been analyzed. A sensitivity analysis has also been performed, showing
that the measure is highly capable of capturing small modifications of the DNA sequences.
Finally, a comparison with results from the literature is given.
Keywords
DNA; partial ordering; Hasse matrix; distances; similarity/diversity; rank correlation
Hrčak ID: 31193
Publication date: 31.12.2008.