Filologija, No. 59, 2012.
Review article
On the Corpus of the Dictionary of Church Slavonic Language of Croatian Redaction and its Relation Towards the Croatian Language Corpora
Vida Vukoja, Staroslavenski institut
Abstract
The corpus for the Dictionary of the Croatian Redaction of Church Slavonic (DCRCS) is a corpus of excerpted Croatian Church Slavonic texts, created at the Old Church Slavonic Institute in Zagreb not only for compiling the dictionary but also for paleographic, grammatical, philological and semantic research. Among the corpora of the Croatian language, it shares the greatest number of characteristics with the corpus for the Dictionary of Old Croatian (DOC), which has been created at the Institute of Croatian Language and Linguistics in Zagreb. The two corpora contain language material from closely related idioms (Croatian Church Slavonic and medieval Croatian, respectively) and from practically the same period (11th/12th to 16th century). Both are reference corpora with characteristics of both static and dynamic corpora, and both consist of complete written texts (many of them in several variants) extracted from manuscripts and incunabula. They partially correspond in their basic internal structure: the corpus for the DCRCS is structured by codex contents and by the contents of particular texts, while the corpus for the DOC is structured primarily by the contents of the texts.
The corpora for the DCRCS and the DOC also differ in a number of characteristics. In contrast to the corpus for the DOC, the corpus for the DCRCS is organized as a paper card file that is not machine-readable (only part of it is available in JPEG format). In addition, it is mostly annotated (the corpus for the DOC is not currently annotated, although annotation is hoped for in the future) and parallel (the corpus for the DOC is monolingual). The texts in the corpus for the DCRCS are transliterated into Old Cyrillic script, and normalized forms appear only in the lemmatization (the texts in the corpus for the DOC are written in Latin script and standardized according to the principles of phonological transcription).
The existing corpus for the DCRCS is, and most probably will remain for a long time, if not permanently, an irreplaceable source of Croatian Church Slavonic language data, even though a valuable corpus of digitized Croatian Church Slavonic texts has been created at the Old Church Slavonic Institute. Hopefully, the corpus for the DCRCS, combined with the corpus for the DOC and the aforementioned corpus of Croatian Church Slavonic texts (both in progress), will one day provide very solid ground for extensive research on linguistic practice in the medieval Croatian lands.
Keywords
corpus; Dictionary of the Croatian Redaction of Church Slavonic; Dictionary of Old Croatian
Hrčak ID: 98094
Publication date: 12 March 2013