
Original scientific paper

https://doi.org/10.17559/TV-20231011001013

An Empirical Study on Document Similarity Comparison Evaluation Between Machine Learning Techniques and Human Experts

Won-Jung Jang ; Catholic Kwandong University, 25601 #502, The Mary Hall, 24, Beomil-ro 576, Gangneung-si, Gangwon-do, South Korea *

* Corresponding author.


Full text: English pdf 1.848 Kb

pp. 1668-1679



Abstract

Current machine-learning training focuses solely on accuracy. Rather than measuring only the accuracy of machine learning, this study examines the weights of other dimensions. By comparing the decision-making of machine learning and humans in various fields, it examines how well an organization's vision is propagated to its lower levels. The results evaluated by humans and by machine learning models were also compared from multiple perspectives. As numerical representations of words, count-based models (Bag of Words, TF-IDF), artificial neural network (ANN) models (Word2Vec, GloVe), and a vision propagation measurement (VPMS) model combining the two methods were used to calculate the similarity between documents, and these similarities were compared with the actual results measured by an expert group. The findings of this study can be used as an evaluation metric for how effectively the vision of the upper organization is being disseminated to lower-level organizations. They could also be used in developing algorithms that work on text data, such as customer segmentation for target marketing. The study makes two key contributions: (i) it provides an extensive empirical comparison of document similarity analysis by different ML techniques versus human experts, and (ii) it proposes a new VPMS model that outperforms existing methods.
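To make the count-based approach named in the abstract concrete, the following is a minimal sketch of TF-IDF weighting followed by cosine similarity between documents. It is not the paper's implementation and not the VPMS model: the function names `tfidf_vectors` and `cosine_similarity`, the whitespace tokenizer, the smoothed IDF variant, and the three sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, one common variant.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample texts: a "vision" statement, a related plan, an unrelated note.
docs = [
    "our vision is sustainable growth through innovation",
    "the team plan targets sustainable growth and innovation",
    "cafeteria menu for next week",
]
vecs = tfidf_vectors(docs)
sim_related = cosine_similarity(vecs[0], vecs[1])
sim_unrelated = cosine_similarity(vecs[0], vecs[2])
```

In this sketch the first two documents share several terms, so `sim_related` is positive, while the third shares none with the first, so `sim_unrelated` is zero. Scores of this kind are what would be set against expert judgments in a comparison such as the one the abstract describes.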

Keywords

ANN model; count-based model; document similarity; ensemble learning model; machine learning

Hrčak ID:

320403

URI

https://hrcak.srce.hr/320403

Publication date:

31.8.2024.
