Review article
HOW TO APPROACH DATA ANALYSIS OF TEXTS
Dunja Mladenič
J. Stefan Institute, Ljubljana, Slovenia
Abstract
Analysis of large text data sets is gaining popularity because it gives users insights into their own (potentially very unstructured) data that were difficult to obtain with standard methods. This kind of data analysis differs from standard analysis in three respects: (1) the methods used differ from standard statistical methods, (2) the data being analyzed have different characteristics from standard, structured databases, and (3) the users of the results have different needs and requirements from the usual users of common analytical services (statistics, data mining, OLAP). This paper gives a brief overview of the area that addresses this kind of data analysis, commonly referred to as Text Mining. It is a growing area at the intersection of Information Retrieval (IR), Data Mining (DM), Machine Learning (ML), and Natural Language Processing (NLP). The problems usually addressed in Text Mining are topic detection and tracking, document categorization, visualization of document collections, user profiling, information extraction, construction and updating of hierarchical indices and document collections, and intelligent search.
Keywords
text data analysis; data mining; example applications of text mining; personalized information delivery
Hrčak ID:
78313
Publication date:
15.12.2004.