
Preliminary communication

https://doi.org/10.31803/tg-20221228180808

Text Classification of Mixed Model Based on Deep Learning

Sang-Hwa Lee ; Department of Webtoon Contents, Seowon University, 377-3 Musimseo-ro, Seowon-gu, Cheongju-si, Chungcheongbuk-do, 28674, Republic of Korea



pp. 367-374




Abstract

Deep learning is now widely used in many fields, but research on its application to text classification remains relatively scarce. This paper exploits the strong learning characteristics of deep learning to propose a hybrid deep learning model, and designs a text classifier based on it. The hybrid model combines two common deep learning models, the sparse automatic encoder and the deep confidence network. It consists of three parts: the first two layers are built from sparse automatic encoders, the middle part is a three-layer deep Convolutional Neural Network (CNN), and Softmax regression serves as the final classification layer. To test the classification performance of the classifier, experiments were conducted on the English data set 20Newsgroup and the Chinese data set Fudan University Chinese Corpus. In the English text classification experiment, the classifier achieves high classification accuracy. To further verify its performance, a comparative experiment against a naive Bayes classifier, a K-Nearest Neighbor (KNN) classifier and a Support Vector Machine (SVM) classifier shows that the hybrid-model classifier outperforms all three. In the Chinese text classification experiment, the classifier is tested on the Fudan University Chinese Corpus and also achieves good classification results. The influence of different parameter settings on classification accuracy is also discussed.
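
The layer ordering described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the layer widths, kernel size, activation functions, the use of 1-D convolutions, and the random untrained weights are all assumptions made for illustration, and the names `build_model` and `forward` are hypothetical. Only the ordering (two encoder layers, three convolutional layers, a Softmax output) and the 20 classes of the 20Newsgroup data set come from the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense_sigmoid(inputs, weights, biases):
    # One encoder layer: sigmoid(W x + b).
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def conv1d_relu(inputs, kernel):
    # "Valid" 1-D convolution (no padding) followed by ReLU.
    k = len(kernel)
    return [max(0.0, sum(kernel[j] * inputs[i + j] for j in range(k)))
            for i in range(len(inputs) - k + 1)]


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def build_model(rng, n_features, n_classes, encoder_sizes=(32, 16), kernel_size=3):
    # All sizes are illustrative assumptions; weights are random, not trained.
    model = {"encoders": [], "kernels": [], "softmax": None}
    width = n_features
    for size in encoder_sizes:  # part 1: two encoder layers
        model["encoders"].append((random_matrix(rng, size, width), [0.0] * size))
        width = size
    for _ in range(3):  # part 2: three convolutional layers
        model["kernels"].append([rng.uniform(-0.5, 0.5) for _ in range(kernel_size)])
        width -= kernel_size - 1  # a valid convolution shortens the vector
    # part 3: Softmax regression as the classification layer
    model["softmax"] = (random_matrix(rng, n_classes, width), [0.0] * n_classes)
    return model


def forward(model, features):
    hidden = features
    for weights, biases in model["encoders"]:
        hidden = dense_sigmoid(hidden, weights, biases)
    for kernel in model["kernels"]:
        hidden = conv1d_relu(hidden, kernel)
    weights, biases = model["softmax"]
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


rng = random.Random(0)
model = build_model(rng, n_features=50, n_classes=20)  # 20Newsgroup has 20 classes
document_vector = [rng.random() for _ in range(50)]    # stand-in for a document feature vector
probabilities = forward(model, document_vector)
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Because the weights are random, `predicted_class` carries no meaning here. The sketch only shows how a document vector would flow through the three parts and come out as a probability distribution over classes.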

Keywords

classification; deep confidence network; deep learning; sparse automatic encoder; softmax

Hrčak ID:

306117

URI

https://hrcak.srce.hr/306117

Publication date:

15.9.2023.
