Preliminary communication

https://doi.org/10.31341/jios.46.2.3

Understanding Document Thematic Structure: A Systematic Review of Topic Modeling Algorithms

Seun Osuntoki ; Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria
Victor Odumuyiwa (ORCID: orcid.org/0000-0002-1050-892X) ; Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria
Oladipupo Sennaike ; Department of Computer Science, Faculty of Science, University of Lagos, Lagos, Nigeria


Full text: English, PDF, 634 KB

pp. 305-322

Downloads: 138


Abstract

The increasing use of the Internet and other digital platforms has ushered in the era of big data, with an attendant increase in the quantity of unstructured data available for processing and storage. However, the full benefits of analyzing this large quantity of unstructured data will not be realized without proper techniques and algorithms. Topic modeling algorithms have seen major success in this area. Different topic modeling algorithms exist, and each employs either a probabilistic or a linear-algebra approach. Recent reviews of topic modeling algorithms focus mainly on probabilistic methods without giving proper treatment to the linear-algebra-based algorithms. This review explores linear-algebra-based topic models as well as probability-based topic models. An overview of how the models generated by each of these algorithms represent document thematic structure is also presented.

Keywords

Topic models; Information Retrieval; Text Mining; NMF; document structure

Hrčak ID:

290997

URI

https://hrcak.srce.hr/290997

Publication date:

22.12.2022.

Visits: 312