Original scientific paper

Bridging the gap between microarray technology and routine clinical diagnostics: a Random Forest approach to the gene expression profile dimensionality reduction

Željko Debeljak


page 150-162


Abstract

Introduction: Although the scientific community recognizes microarray-based gene expression profiling as a valuable tool, it has not entered routine diagnostic use during the last decade. Because the approach is expensive and prone to substantial experimental variation, it is not suited to routine clinical diagnostics at the current state of the technology. To bridge that gap, various computational dimensionality reduction tools have been developed. They work by selecting, from huge gene expression profiles, a limited set of biomarker candidates suitable for routine diagnostic assessment.
Aim: Random forest (RF) has been established as a reliable predictor. However, its ability to select relevant genes has received less attention. The aim of this study was to evaluate the suitability of RF for biomarker selection from gene expression profile datasets. Three datasets taken from the literature, obtained in small-scale clinical experiments, were chosen for that purpose.
Results: The results show that RF could easily identify good univariate classifiers, i.e. single biomarkers, when the problem at hand is of low complexity. For a more complex problem, this approach could also find a reliable two-dimensional classifier candidate. However, when the relationship between diagnosis/prognosis and the gene expression profiling results is highly complex, or the dataset is too small, RF-based dimensionality reduction fails to select a reliable set of biomarker candidates.
Conclusions: Within the limits set by dataset complexity, RF is an appropriate tool for biomarker candidate selection.

Keywords

gene expression; microarray; biomarker screening; random forests; feature selection

Hrčak ID:

9648

URI

https://hrcak.srce.hr/9648

Publication date:

20.12.2006.
