Original scientific paper

https://doi.org/10.7906/indecs.21.6.6

An Example of the Consistency Analysis of the Classification of Textual Materials by the Analyst and using the Naïve Bayesian Classifier

Josip Ježovita (ORCID: orcid.org/0000-0003-0165-798X) ; Catholic University of Croatia, Zagreb, Croatia *
Mateja Plenković ; Catholic University of Croatia, Zagreb, Croatia
Nika Đuho ; Catholic University of Croatia, Zagreb, Croatia

* Corresponding author.


page 607-622

Abstract

Sentiment analysis is a particular form of content analysis, and its application has become popular with the growth of Internet platforms where a wide range of content is generated. Today, various classifiers are used for sentiment analysis, and in this article we show an example of using a Naïve Bayesian classifier. The aim is to examine the consistency with which analysts and the Bayesian algorithm classify textual materials as positive, negative or neutral in tone. The hypotheses are that agreement between the two ways of classifying textual materials increases as (1) the complexity of the formulations and (2) the size of the learning datasets increase. Based on the results, both hypotheses were accepted, but only for certain groups of messages. Increasing the size of the learning datasets and the complexity of the formulations improved classification accuracy for messages in a positive tone, while classification accuracy for messages in other tones remained equally high regardless of how the parameters were varied. Correlation analysis showed a high positive correlation between the outcomes classified by the Bayesian algorithm and the tones determined by the analyst (r = 0.816).
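
To make the comparison described in the abstract concrete, the following is a minimal sketch of how a naïve Bayes classifier can assign a tone to a message and how its output can be correlated with an analyst's labels. It is not the authors' implementation: the toy messages, the function names (`train_naive_bayes`, `predict`, `pearson_r`), the add-one smoothing, and the numeric coding of tones as -1, 0 and 1 are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(messages, labels):
    """Count documents per tone and words per tone (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(messages, labels):
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the tone with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in the learning dataset
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy learning dataset (invented for this sketch, not taken from the paper).
train_messages = [
    "great service very happy", "love this product",
    "terrible experience very bad", "awful product hate it",
    "the package arrived today", "the store opens at nine",
]
train_labels = ["positive", "positive", "negative", "negative",
                "neutral", "neutral"]
model = train_naive_bayes(train_messages, train_labels)

# Messages labelled by a hypothetical analyst, then by the classifier.
test_messages = ["very happy with this product", "bad service terrible",
                 "the package arrived", "not happy"]
analyst = ["positive", "negative", "neutral", "negative"]
predicted = [predict(model, m) for m in test_messages]

# Assumed numeric coding of tones, used only to compute a correlation.
code = {"negative": -1, "neutral": 0, "positive": 1}
agreement = sum(a == p for a, p in zip(analyst, predicted)) / len(analyst)
r = pearson_r([code[a] for a in analyst], [code[p] for p in predicted])
print(predicted, agreement, round(r, 3))
```

In this toy run the classifier disagrees with the analyst on the last message, because the word "not" never occurs in the learning dataset and the remaining word is associated only with the positive tone. Enlarging the learning dataset is one way such gaps can close, though the sketch says nothing about the size of the effect reported in the article.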

Keywords

content analysis; sentiment analysis; naïve Bayes classifier

Hrčak ID:

312405

URI

https://hrcak.srce.hr/312405

Publication date:

28.12.2023.