Preliminary communication
SOME APPROACHES TO TEXT MINING AND THEIR POTENTIAL FOR SEMANTIC WEB APPLICATIONS
Jan Paralič (orcid.org/0000-0002-4603-0411); Technical University of Košice, Košice, Slovakia
Marek Paralič; Technical University of Košice, Košice, Slovakia
Abstract
In this paper we describe some approaches to text mining that are supported by JBowl, an original software system developed in Java for information retrieval and text mining, as well as its possible use in a distributed environment. JBowl is being developed as open source software with the intention of providing an easily extensible, modular framework for pre-processing, indexing and further exploration of large text collections. The overall architecture of the system is described, followed by some typical use case scenarios that have been applied in previous projects. Then, the basic principles and technologies of service-oriented computing, web services and semantic web services are presented. We further discuss how the JBowl system can be adapted to a distributed environment using technologies that are already available, and what benefits such an adaptation can bring. This is particularly important in the context of KP-Lab (Knowledge Practices Laboratory), a new integrated EU-funded project, which is briefly presented together with the role of the proposed text mining services that are currently being designed and developed there.
Keywords
Text mining; semantic web; service-oriented computing; web services; trialogical learning
Hrčak ID: 21474
Publication date: 12.6.2007.