Technical gazette, Vol. 31 No. 5, 2024.
Original scientific paper
https://doi.org/10.17559/TV-20231011001013
An Empirical Study on Document Similarity Comparison Evaluation Between Machine Learning Techniques and Human Experts
Won-Jung Jang *
Catholic Kwandong University, 25601 #502, The Mary Hall, 24, Beomil-ro 576, Gangneung-si, Gangwon-do, South Korea
* Corresponding author.
Abstract
Current machine-learning training focuses solely on accuracy. This study examines the weights of other dimensions rather than measuring accuracy alone. By comparatively analyzing the decision-making of machine learning and humans in various fields, it examines how well an organization's vision is propagated to its lower levels. The results evaluated by humans and by machine learning models were also compared from multiple perspectives. Words were represented numerically with count-based models (Bag of Words, TF-IDF), artificial neural network (ANN) models (Word2Vec, GloVe), and a vision propagation measurement (VPMS) model combining the two approaches. The document similarities calculated with these models were then compared with the actual results measured by an expert group. The findings can serve as an evaluation metric for how effectively the vision of the upper organization is disseminated to lower-level organizations. They could also be used in developing algorithms that work on text data, such as customer segmentation for target marketing. The study makes two key contributions: (i) it provides an extensive empirical comparison of document similarity analysis by different ML techniques versus human experts, and (ii) it proposes a new VPMS model that outperforms existing methods.
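To make the count-based part of the comparison concrete, the following is a minimal sketch of TF-IDF document similarity using cosine similarity. It is not the VPMS model and not the author's implementation. The function names (`tokenize`, `tfidf_vectors`, `cosine`), the smoothed IDF formula, and the sample sentences are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only (a simplifying assumption).
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF; this particular smoothing is an assumption of the sketch.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in df}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts: an upper-level vision statement and two
# lower-level documents (invented for this example).
vision = "deliver reliable customer service through innovation"
team_a = "our team improves customer service with reliable innovation"
team_b = "quarterly budget spreadsheet for office furniture"

vecs = tfidf_vectors([vision, team_a, team_b])
sim_a = cosine(vecs[0], vecs[1])  # shares several terms with the vision
sim_b = cosine(vecs[0], vecs[2])  # shares no terms with the vision
```

Here `sim_a` exceeds `sim_b` because `team_a` shares vocabulary with the vision statement while `team_b` shares none. Scores of this kind are what would be set against expert ratings in a comparison like the one the abstract describes. An ANN-based variant would replace the sparse vectors with averaged word embeddings.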
Keywords
ANN model; count-based model; document similarity; ensemble learning model; machine learning
Hrčak ID: 320403
Publication date: 31.8.2024.