Preliminary communication
https://doi.org/10.31341/jios.46.2.3
Understanding Document Thematic Structure: A Systematic Review of Topic Modeling Algorithms
Seun Osuntoki
Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria
Victor Odumuyiwa
orcid.org/0000-0002-1050-892X
Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria
Oladipupo Sennaike
Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria
Abstract
The growing use of the Internet and other digital platforms has ushered in the era of big data, with an attendant increase in the quantity of unstructured data available for processing and storage. However, the full benefits of analyzing this large quantity of unstructured data will not be realized without proper techniques and algorithms. Topic modeling algorithms have seen major success in this area. Many topic modeling algorithms exist, and each employs either a probabilistic or a linear-algebra approach. Recent reviews of topic modeling algorithms dwell mostly on probabilistic methods without giving proper treatment to linear-algebra-based algorithms. This review explores both linear-algebra-based and probability-based topic models. An overview of how the models generated by each of these algorithms represent document thematic structure is also presented.
Keywords
Topic models; Information Retrieval; Text Mining; NMF; document structure
Hrčak ID:
290997
Publication date:
22.12.2022.