
Original scientific paper

https://doi.org/10.32985/ijeces.13.9.4

An empirical study on English-Mizo Statistical Machine Translation with Bible Corpus

Chanambam Sveta Devi ; Department of Computer Science, Assam University, Silchar, Assam, India
Bipul Syam Purkayastha ; Department of Computer Science, Assam University, Silchar, Assam, India
Loitongbam Sanayai Meetei ; Department of Computer Science, National Institute of Technology, Silchar, Assam, India



page 759-765



Abstract

Machine Translation (MT) is the automatic conversion of text or speech from one natural language into another. This work presents a bidirectional Statistical Machine Translation (SMT) system for the extremely low-resource language pair Mizo-English, built in a low-resource setting. A total of 30,800 sentences were collected from the English Bible dataset and manually translated into Mizo by a native linguistic expert to create the English-Mizo parallel dataset. After several pre-processing steps, the parallel dataset is used to build our MT system with the Moses toolkit. Our framework uses GIZA++ to create the Translation Model (TM) and IRSTLM to build the Language Model (LM), which estimates the probability of target-language sentences. The quality of our MT system is evaluated with two automatic metrics, BLEU and METEOR. The system is also evaluated manually on two criteria: adequacy and fluency.
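
To illustrate the automatic evaluation mentioned in the abstract, here is a minimal Python sketch of corpus-level BLEU, assuming one reference translation per hypothesis and no smoothing. It is not the authors' evaluation script. The function names `ngrams` and `corpus_bleu` are hypothetical, and whitespace tokenization is assumed to have been done beforehand.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis, no smoothing.

    hypotheses, references: parallel lists of token lists (assumed input shape).
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0  # some n-gram order has no match: unsmoothed BLEU is zero
    # Geometric mean of the modified n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty punishes hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu([hyp_tokens], [ref_tokens])` returns a score between 0 and 1. Reported BLEU scores are often this value multiplied by 100, and standard tools may apply different tokenization or smoothing, so numbers from this sketch are not directly comparable to published results.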

Keywords

Low resource; Statistical Machine Translation; Language Model; Translation Model; English; Mizo; Moses

Hrčak ID:

286270

URI

https://hrcak.srce.hr/286270

Publication date:

6.12.2022.
