Original scientific paper

https://doi.org/10.17559/TV-20190506102016

An Online Word Vector Generation Method Based on Incremental Huffman Tree Merging

Kui Qian* ; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Lei Tian ; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Xiulan Wen ; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Zhenzhong Song ; Shanghai Electromechanical Engineering Institute, No. 3888 Yuanjiang Road, Minhang District, Shanghai, China


page 52-57

Abstract

To meet the real-time processing requirements of large volumes of online text data in natural language processing applications, an online word vector generation method based on incremental Huffman tree merging is proposed. The Huffman tree of inherited words in the existing word vector model is kept unchanged, and a new Huffman tree is constructed for the incoming words such that none of its leaf nodes duplicates a leaf node of the inherited tree. The Huffman tree is then updated by node merging, so that, building on the existing word vector model, each word still has a unique encoding for the hierarchical softmax computation. Finally, the incremental word vector model is generated by a neural network built on the hierarchical softmax model. Experimental results show that the method generates word vector models online through incremental learning, in less time and with better performance.
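
The merging idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the node-merging rule, so the sketch assumes the simplest one, a new root whose left child is the inherited tree and whose right child is the tree of incoming words. The names `Node`, `build_huffman`, `merge_trees`, and `codes`, as well as the word counts, are hypothetical.

```python
import heapq
from itertools import count


class Node:
    """Huffman tree node; leaves carry a word, inner nodes carry children."""

    def __init__(self, freq, word=None, left=None, right=None):
        self.freq = freq
        self.word = word
        self.left = left
        self.right = right


def build_huffman(freqs):
    """Build a Huffman tree from a non-empty {word: count} mapping."""
    tie = count()  # tie-breaker so heapq never has to compare Node objects
    heap = [(f, next(tie), Node(f, word=w)) for w, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), Node(f1 + f2, left=a, right=b)))
    return heap[0][2]


def merge_trees(inherited_root, incoming_root):
    """ASSUMED merge rule: join both trees under one new root.

    The inherited subtree is left untouched, so its inner nodes
    (and any parameters attached to them) could be reused as-is.
    """
    total = inherited_root.freq + incoming_root.freq
    return Node(total, left=inherited_root, right=incoming_root)


def codes(root, prefix=""):
    """Return {word: bit string} for every leaf below root."""
    if root.word is not None:
        return {root.word: prefix or "0"}
    out = codes(root.left, prefix + "0")
    out.update(codes(root.right, prefix + "1"))
    return out


# Hypothetical counts: the existing vocabulary and a new batch of text.
inherited = {"the": 50, "cat": 10, "sat": 7}
incoming = {"the": 5, "dog": 4, "ran": 3}

# Keep only words absent from the inherited tree, so no leaf is duplicated.
new_only = {w: c for w, c in incoming.items() if w not in inherited}

old_root = build_huffman(inherited)
new_root = build_huffman(new_only)
merged_codes = codes(merge_trees(old_root, new_root))
```

Under this assumed rule, every inherited word keeps its old code with a single `0` bit prepended, every incoming word receives a code starting with `1`, and the combined code set stays unique and prefix-free, which is the property the hierarchical softmax computation relies on.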

Keywords

Huffman tree; hierarchical softmax; incremental learning; neural network; online word vector

Hrčak ID:

250714

URI

https://hrcak.srce.hr/250714

Publication date:

5.2.2021.
