Tehnički glasnik, Vol. 17 No. 3, 2023.
Preliminary communication
https://doi.org/10.31803/tg-20221228180808
Text Classification of Mixed Model Based on Deep Learning
Sang-Hwa Lee
Department of Webtoon Contents, Seowon University, 377-3 Musimseo-ro, Seowon-gu, Cheongju-si, Chungcheongbuk-do, 28674, Republic of Korea
Abstract
Deep learning is now widely used in many fields, but research on its application to text classification remains relatively limited. This paper exploits the strong feature-learning capability of deep learning to propose a hybrid model and designs a text classifier based on it. The hybrid model combines two common deep learning models, the sparse autoencoder (sparse automatic encoder) and the deep belief network (deep confidence network). It has three parts: the first two layers are built from sparse autoencoders, the middle part is a three-layer deep Convolutional Neural Network (CNN), and Softmax regression serves as the final classification layer. To test the classification performance of the hybrid-model classifier, experiments were conducted on the English 20Newsgroup data set and the Chinese Fudan University Chinese Corpus. In the English text classification experiment, the classifier achieved high classification accuracy. To further verify its performance, a comparison with naive Bayes, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM) classifiers shows that the hybrid-model classifier outperforms all three. In the Chinese text classification experiment on the Fudan University corpus, the classifier also achieved good classification results. The influence of different parameter settings on classification accuracy is also discussed.
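The layered structure described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the author's implementation: the layer sizes, the random initialisation, and all function names are assumptions made for illustration, and the three middle layers are written as plain fully connected layers standing in for the three-layer deep network named in the abstract. Only the Python standard library is used, and no training is performed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer with sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_sparsity_penalty(mean_activations, rho=0.05):
    """KL-divergence term commonly added to a sparse autoencoder loss.

    rho is the target mean activation (a hypothetical default here).
    """
    return sum(
        rho * math.log(rho / a) + (1 - rho) * math.log((1 - rho) / (1 - a))
        for a in mean_activations
    )


def build_model(layer_sizes, seed=0):
    """Randomly initialised (weights, biases) pairs, one per layer."""
    rng = random.Random(seed)
    model = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        model.append((weights, [0.0] * n_out))
    return model


def predict_proba(model, features):
    """Pass features through the hidden layers, then a softmax output layer."""
    h = features
    for weights, biases in model[:-1]:
        h = dense(h, weights, biases)
    weights, biases = model[-1]
    logits = [sum(w * x for w, x in zip(row, h)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Assumed sizes: 50 input features, 2 encoder layers (32, 24),
# 3 middle layers (16 each), 20 output classes (20Newsgroup has 20 groups).
sizes = [50, 32, 24, 16, 16, 16, 20]
model = build_model(sizes)
probs = predict_proba(model, [0.0] * 49 + [1.0])
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

In a full system the two encoder layers would typically be pre-trained with a reconstruction loss plus a sparsity term such as `kl_sparsity_penalty`, and the whole stack would then be fine-tuned with the softmax layer on labelled documents; the sketch only shows how the parts connect.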
Keywords
classification; deep confidence network; deep learning; sparse automatic encoder; softmax
Hrčak ID:
306117
Publication date:
15.9.2023.