Computer Designed Corpus of the Old-Croatian Language

Barbarić, Vuk-Tadija; Kapetanović, Amir

Filologija, No. 59, 2012.

Review article

Vuk-Tadija Barbarić orcid.org/0000-0003-1001-437X ; Institut za hrvatski jezik i jezikoslovlje
Amir Kapetanović orcid.org/0000-0002-8013-9330 ; Institut za hrvatski jezik i jezikoslovlje

Full text: Croatian (PDF, 214 KB)

Pages: 1-13

Abstract

Research for the Old-Croatian Dictionary, conducted in the Department of Croatian Language History and Historical Lexicography at the Institute of Croatian Language and Linguistics, consists of three main stages: textological, computational, and lexicographic. The first (textological) stage is explained in detail in Kapetanović’s article Digitization of Old Croatian Texts and Textual Criticism.
This paper explains the second stage: the computational design of the corpus for that dictionary. The analysis focuses on anticipating problems in designing a machine-readable corpus of texts written in Old Croatian and on proposing possible solutions. It describes the computational processing of texts for the corpus, along with the problems that arise from the choice of the TEI standard and the PhiloLogic server application. Priorities in corpus design are set so as to overcome these problems and other potential difficulties. One possible solution for separating the apparatus criticus from the Old Croatian texts is proposed. All proposals aim at more efficient encoding of primary data and metadata with users in mind, while remaining within the capabilities of the project.
The paper also addresses the third, lexicographic stage by giving an overview of the chosen lexicographic tool, TshwaneLex.

Keywords

Old-Croatian language; corpus; TEI standard

Hrčak ID:

98083

URI

https://hrcak.srce.hr/98083

Publication date:

12 March 2013
