Original scientific paper
https://doi.org/10.32985/ijeces.13.9.4
An empirical study on English-Mizo Statistical Machine Translation with Bible Corpus
Chanambam Sveta Devi; Department of Computer Science, Assam University, Silchar, Assam, India
Bipul Syam Purkayastha; Department of Computer Science, Assam University, Silchar, Assam, India
Loitongbam Sanayai Meetei; Department of Computer Science, National Institute of Technology, Silchar, Assam, India
Abstract
Machine Translation (MT) is the automatic conversion of text or speech from one natural language into another. This work presents a bidirectional Statistical Machine Translation (SMT) system for the extremely low-resource language pair Mizo-English, built in a low-resource setting. A total of 30,800 sentences were collected from the English Bible dataset and manually translated into Mizo by a native linguistic expert to create the English-Mizo parallel dataset. After several pre-processing steps, the parallel dataset is used to build the MT system with the MOSES toolkit. The framework uses GIZA++ to create the Translation Model (TM) and IRSTLM to estimate the probabilities of the target Language Model (LM). The quality of the MT systems is evaluated with two automatic metrics, BLEU and METEOR, and manually on two criteria: adequacy and fluency.
Keywords
Low resource; Statistical Machine Translation; Language Model; Translation Model; English; Mizo; Moses
Hrčak ID:
286270
Publication date:
6 December 2022