Tehnički vjesnik, Vol. 28 No. 1, 2021.
Original scientific paper
https://doi.org/10.17559/TV-20190506102016
An Online Word Vector Generation Method Based on Incremental Huffman Tree Merging
Kui Qian*
; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Lei Tian
; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Xiulan Wen
; School of Automation, Nanjing Institute of Technology, No.1 Hongjing Avenue, Jiangning District, Nanjing, Jiangsu Province, China
Zhenzhong Song
; Shanghai Electromechanical Engineering Institute, No. 3888 Yuanjiang Road, Minhang District, Shanghai, China
Abstract
To meet the real-time processing requirements of large volumes of online text data in natural language processing applications, an online word vector model generation method based on incremental Huffman tree merging is proposed. The Huffman tree of inherited words in the existing word vector model is kept unchanged, and a new Huffman tree is constructed for the incoming words such that it shares no leaf node with the inherited tree. The Huffman tree is then updated by node merging, so that, building on the existing word vector model, each word still has a unique encoding for the hierarchical softmax computation. Finally, the incremental word vector model is generated by a neural network on the basis of the hierarchical softmax model. The experimental results show that the method can generate the word vector model online through incremental learning, in less time and with better performance.
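The merging idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the node-merging step, so the rule used here (joining the inherited tree and the new tree under one fresh root) is only an assumption, as are the names `Node`, `build_huffman`, `merge_trees` and `codes` and the toy word counts. The sketch shows one property the abstract relies on: after merging, every word still has a unique code, and the inherited tree itself is left untouched.

```python
import heapq
from itertools import count


class Node:
    """Huffman tree node; leaves carry a word, internal nodes do not."""

    def __init__(self, freq, word=None, left=None, right=None):
        self.freq = freq
        self.word = word
        self.left = left
        self.right = right


def build_huffman(freqs):
    """Build a Huffman tree from a non-empty {word: count} dict."""
    tie = count()  # tie-breaker so heapq never has to compare Node objects
    heap = [(f, next(tie), Node(f, w)) for w, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), Node(f1 + f2, left=a, right=b)))
    return heap[0][2]


def merge_trees(inherited, incoming):
    """Assumed merge rule: hang both trees under one new root.

    The inherited subtree is reused as-is, so no existing node changes.
    """
    return Node(inherited.freq + incoming.freq, left=inherited, right=incoming)


def codes(root, prefix=""):
    """Return {word: binary code} by walking the tree (0 = left, 1 = right)."""
    if root.word is not None:
        return {root.word: prefix or "0"}
    out = codes(root.left, prefix + "0")
    out.update(codes(root.right, prefix + "1"))
    return out


# Toy data (hypothetical): counts from the existing model and a new batch.
old_counts = {"the": 50, "tree": 20, "word": 10}
new_batch = {"vector": 7, "merge": 3, "tree": 4}

old_root = build_huffman(old_counts)
old_codes = codes(old_root)

# Keep only unseen words so the new tree shares no leaf with the inherited one.
# Updating the count of the already known word "tree" is out of scope here.
fresh = {w: c for w, c in new_batch.items() if w not in old_counts}

merged_root = merge_trees(old_root, build_huffman(fresh))
merged_codes = codes(merged_root)
```

Under this assumed rule, each inherited word keeps its old code with one extra leading bit, so the path it follows through the inherited subtree in hierarchical softmax is the same as before. The merged tree is in general no longer an optimal Huffman tree for the combined counts, which is a trade-off of this particular sketch.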
Keywords
Huffman tree; hierarchical softmax; incremental learning; neural network; online word vector
Hrčak ID:
250714
Publication date:
5 February 2021