
Preliminary communication

https://doi.org/10.17559/TV-20150831012553

Document similarity in repeatedly translated corpora

Vladimir Mateljan ; University of Zagreb, Faculty of Humanities and Social Sciences, Ivana Lučića 3, 10000 Zagreb, Croatia
Vedran Juričić ; University of Zagreb, Faculty of Humanities and Social Sciences, Ivana Lučića 3, 10000 Zagreb, Croatia
Dario Ogrizović ; University of Rijeka, Faculty of Maritime Studies, Studentska 2, 51000 Rijeka, Croatia


Full text: Croatian pdf 387 Kb, English pdf 387 Kb

Pages: 599-602


Abstract

The paper analyzes how the relationships between documents in a textual corpus change when the corpus is translated into another language. The authors measured the similarities between documents in the original Croatian corpus and compared them with the similarities between the corresponding documents in the translated English corpus. The changes were analyzed using two measures: the P-value of the chi-square test and a newly proposed measure, the correction coefficient.

Keywords

analysis; document similarity; multilingual; translated corpus; translation

Hrčak ID:

179882

URI

https://hrcak.srce.hr/179882

Publication date:

14.4.2017.
