
Original scientific paper

https://doi.org/10.17559/TV-20210128043310

A Machine Learning Classification Algorithm for Vocabulary Grading in Chinese Language Teaching

Yinbing Zhang ; 1) School of Artificial Intelligence, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing 100875, P. R. China 2) School of Mathematical Science, Huaibei Normal University, No. 100, Dongshan Road, Huaibei, Anhui 235000, P. R. China
Jihua Song* ; School of Artificial Intelligence, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing 100875, P. R. China
Weiming Peng* ; School of Artificial Intelligence, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing 100875, P. R. China
Dongdong Guo ; School of Artificial Intelligence, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing 100875, P. R. China
Tianbao Song ; School of Computer Science and Engineering, Beijing Technology and Business University, No. 11, Fucheng Road, HaiDian District, Beijing 100048, P. R. China


page 845-855

Abstract

Vocabulary grading is of great importance in Chinese vocabulary teaching. This paper first analyzes the lexical attributes that affect lexical complexity. It then explains how lexical attribute information is extracted with the help of a constructed word-formation knowledge base, how a mapping function is built for each lexical attribute, and how the attributes are represented quantitatively to form the basis for vocabulary grading. On this basis, machine learning classification algorithms are applied to the Chinese vocabulary grading problem. Vocabulary grading models built on common machine learning classification algorithms are compared, the importance of the Chinese vocabulary attributes is measured with different feature selection methods, and a vocabulary grading model is constructed from a classification algorithm combined with the attributes that each feature selection algorithm ranks as most important. A comparison of the experimental results showed that the classification model based on the support vector machine (SVM) algorithm and the top six attribute groups ranked by feature importance performed best. To further improve vocabulary grading, the importance values that several feature selection algorithms assigned to the lexical attributes were fused by averaging. A grading experiment was then conducted with the Bagging + SVM integration (ensemble) algorithm and the top six attribute groups ranked by feature importance. The experimental results showed that this combined scheme achieved better results.
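
The averaging step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, the three feature selection methods, and all scores below are hypothetical, and the min-max normalization applied before averaging is an assumption made here so that scores on different scales become comparable.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale one method's scores to [0, 1] (an assumed preprocessing step)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All attributes tie under this method, so none stands out.
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def fuse_importance(per_method_scores):
    """Average each attribute's normalized importance across all methods."""
    normalized = [min_max_normalize(s) for s in per_method_scores.values()]
    attributes = list(normalized[0])
    return {a: mean(n[a] for n in normalized) for a in attributes}


def top_k(fused, k=6):
    """Return the k attribute names with the highest fused importance."""
    return sorted(fused, key=fused.get, reverse=True)[:k]


# Hypothetical importance scores from three hypothetical selection methods.
scores = {
    "method_a": {"frequency": 9.0, "stroke_count": 4.0, "word_length": 3.0,
                 "sense_count": 6.0, "pos_count": 2.0, "morpheme_level": 7.0,
                 "structure_type": 1.0, "character_level": 8.0},
    "method_b": {"frequency": 0.9, "stroke_count": 0.5, "word_length": 0.2,
                 "sense_count": 0.6, "pos_count": 0.1, "morpheme_level": 0.7,
                 "structure_type": 0.3, "character_level": 0.8},
    "method_c": {"frequency": 40, "stroke_count": 25, "word_length": 10,
                 "sense_count": 30, "pos_count": 5, "morpheme_level": 20,
                 "structure_type": 15, "character_level": 35},
}

fused = fuse_importance(scores)
selected = top_k(fused, k=6)
print(selected)
```

The six selected attribute names would then be the input features of the classifier, for example a bagged SVM. Averaging raw scores without normalization would let the method with the largest numeric range dominate the fused ranking, which is why the sketch rescales first.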

Keywords

integration algorithm; machine learning classification algorithm; vocabulary grading; word-formation knowledge base

Hrčak ID:

258210

URI

https://hrcak.srce.hr/258210

Publication date:

6.6.2021.
