Improving Machine Translation Quality with Denoising Autoencoder and Pre-Ordering

Hong-Viet, Tran; Van-Vinh, Nguyen; Hoang-Quan, Nguyen

doi:10.20532/cit.2021.1005316

Journal of computing and information technology, Vol. 29 No. 1, 2021.

Original scientific paper

https://doi.org/10.20532/cit.2021.1005316

Tran Hong-Viet orcid.org/0000-0002-4675-2123 ; University of Economic and Technical Industries, Hanoi, Vietnam
Nguyen Van-Vinh ; University of Engineering Technology, Vietnamese National University, Hanoi, Vietnam
Nguyen Hoang-Quan ; University of Engineering Technology, Vietnamese National University, Hanoi, Vietnam

Full text: English pdf 2.390 Kb

pp. 39-56

Abstract

Many problems in machine translation stem from the characteristics of language families, especially syntactic divergences between languages. In translation, having both the source and target languages in the same language family is a luxury that cannot be relied upon. Trained models must overcome such differences either through manual augmentation or through automatically inferred capacity built into the model design. In this work, we investigated the impact of multiple methods for handling differing word orders during translation and further experimented with assimilating the source language's syntax to the target word order using pre-ordering. We focused on extremely low-resource scenarios. We also conducted experiments on practical data augmentation techniques that support the reordering capacity of the models by varying the target objectives, adding the secondary goal of removing noise or reordering broken input sequences. In particular, we propose methods to improve translation quality with a denoising autoencoder in Neural Machine Translation (NMT) and a pre-ordering method in Phrase-based Statistical Machine Translation (PBSMT). Experiments on a number of English-Vietnamese pairs show improvements in BLEU scores compared with both the NMT and SMT systems.
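As a rough illustration of the denoising objective mentioned in the abstract, the sketch below builds a (noisy input, clean target) training pair by dropping tokens and shuffling them locally. The abstract does not specify the noise scheme, so token dropping and bounded local shuffling are assumptions borrowed from common practice, and the names `add_noise`, `make_denoising_pair`, `drop_prob`, and `max_shuffle_distance` are hypothetical, not taken from the paper.

```python
import random


def add_noise(tokens, drop_prob=0.1, max_shuffle_distance=3, seed=None):
    """Corrupt a token list by random dropping and bounded local shuffling.

    Assumed noise scheme for illustration only; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Step 1: drop each token independently with probability drop_prob.
    kept = [tok for tok in tokens if rng.random() >= drop_prob]
    if not kept and tokens:
        # Avoid returning an empty sequence for a non-empty input.
        kept = [rng.choice(tokens)]

    # Step 2: local shuffle. Each token gets the key (index + random offset);
    # sorting by that key moves a token at most max_shuffle_distance
    # positions (for a non-negative integer distance).
    keys = [i + rng.uniform(0, max_shuffle_distance) for i in range(len(kept))]
    ordered = sorted(zip(keys, kept), key=lambda pair: pair[0])
    return [tok for _, tok in ordered]


def make_denoising_pair(tokens, **noise_kwargs):
    """Return (noisy_input, clean_target) for a denoising training example."""
    return add_noise(tokens, **noise_kwargs), list(tokens)


if __name__ == "__main__":
    sentence = "we focus on extremely low resource translation scenarios".split()
    noisy, clean = make_denoising_pair(sentence, seed=0)
    print("input :", " ".join(noisy))
    print("target:", " ".join(clean))
```

A model trained on such pairs has to restore both the missing words and the original order, which is one plausible way to exercise the reordering capacity the abstract refers to.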

Keywords

Machine translation, Phrase-based statistical machine translation, Neural machine translation, Pre-ordering, Denoising autoencoder

Hrčak ID:

274353

URI

https://hrcak.srce.hr/274353

Publication date:

21.3.2022.
