ON THE QUESTION OF THE APPLICATION OF STATISTICAL METHODS TO
SEARCH FOR COLLOCATIONS AND COLLIGATIONS IN OLD SLAVONIC
TEXTS (IN GLAGOLITIC MANUSCRIPTS FROM THE CORPUS
»manuscripts.ru«)

БАРАНОВ, Виктор A.

doi:10.31745/s.69.1

Slovo : Journal of Old Church Slavonic Institute, No. 69, 2019.

Original scientific paper

Виктор A. БАРАНОВ orcid.org/0000-0003-1730-6359 ; Izhevsk State Technical University named after M. T. Kalashnikov, Izhevsk (Russia)

Full text: Russian, PDF (481 KB)

Pages: 1-33

Abstract

The paper addresses the methodology for finding fixed collocations in the collection of Glagolitic texts of the historical corpus Manuscript: Slavic written heritage (manuscripts.ru) and for evaluating their stability. It demonstrates how the n-gram module can extract collocations made up of textual word forms or of lemmas, with different numbers of components and different frequencies of occurrence. The analysis covers digrams and trigrams that were extracted using the Mutual Information (MI) statistical measure and that occur in several manuscripts of the collection at once.
Particular attention is given to n-grams with high MI values. Because of the way the measure is defined, the highest values belong to collocations that are rare in the collection. Analysis of such digrams based on word forms made it possible to identify coherent grammatical structures, that is, colligations. Trigrams consisting of textual forms are shown to be not only grammatical but also semantic units, that is, collocations. Digrams whose components are lemmas take different forms: preposition-noun collocations, preposition-possessive pronoun collocations and other attributive constructions, relative verb-noun constructions, etc. Analysis of these groups identified both colligations and collocations. Extracting trigrams on the basis of lemmas was the most productive: most of the first few dozen collocations with the highest MI values are grammatical and semantic units or parts of such units. The paper concludes that statistical methods are effective for extracting collocations and colligations from corpora of medieval Slavonic manuscripts.
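The remark that the highest MI values go to rare collocations can be illustrated with a minimal sketch. The abstract does not state the exact formula used by the n-gram module, so the code below assumes the definition common in corpus linguistics, MI = log2(f(x,y) * N / (f(x) * f(y))), where N is the number of tokens. The function name `bigram_mi` and the placeholder tokens are hypothetical and are not taken from the corpus.

```python
import math
from collections import Counter


def bigram_mi(tokens):
    """Return {(w1, w2): MI} for every pair of adjacent tokens.

    Assumed formula (not confirmed by the source):
        MI = log2( f(w1, w2) * N / (f(w1) * f(w2)) )
    """
    n = len(tokens)
    unigram_freq = Counter(tokens)
    bigram_freq = Counter(zip(tokens, tokens[1:]))
    return {
        (w1, w2): math.log2(f12 * n / (unigram_freq[w1] * unigram_freq[w2]))
        for (w1, w2), f12 in bigram_freq.items()
    }


# Placeholder tokens: "a b" recurs three times, "x y" occurs once.
toy_tokens = ["a", "b", "a", "b", "a", "b", "a", "c", "x", "y"]
toy_scores = bigram_mi(toy_tokens)

# Rank digrams from highest to lowest MI.
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy input the single occurrence of `("x", "y")` scores higher than the thrice-repeated `("a", "b")`, because both components of the rare pair occur nowhere else. This is the property of the measure that the abstract points to, and it is why a frequency threshold or a check across several manuscripts is usually combined with MI.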
A comprehensive solution to this problem requires different types of n-grams: two-component and three-component, based on textual forms and on lemmas, with free and with fixed component order. The presence of grammatical and semantic units that recur across different manuscripts points to the supra-textual nature of such collocations.
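The idea of n-grams that recur in several manuscripts, with fixed or free component order, can be sketched as follows. This is only an illustration under stated assumptions: "free order" is modeled here simply as ignoring the order of adjacent components, which may differ from how the corpus module defines it, and the function name `shared_ngrams`, the manuscript ids and the placeholder tokens are hypothetical. The token lists may hold either textual forms or lemmas.

```python
from collections import defaultdict


def shared_ngrams(manuscripts, n=2, fixed_order=True, min_manuscripts=2):
    """Return {n-gram: set of manuscript ids} for n-grams that occur
    in at least `min_manuscripts` different manuscripts.

    manuscripts: dict mapping a manuscript id to its list of tokens
                 (textual forms or lemmas).
    fixed_order: if False, component order is ignored (an assumption
                 about what "free order" means).
    """
    seen = defaultdict(set)
    for ms_id, tokens in manuscripts.items():
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if not fixed_order:
                gram = tuple(sorted(gram))  # order-insensitive key
            seen[gram].add(ms_id)
    return {g: ids for g, ids in seen.items() if len(ids) >= min_manuscripts}


# Placeholder data, not taken from the corpus.
toy_manuscripts = {
    "ms1": ["a", "b", "c"],
    "ms2": ["b", "a", "d"],
    "ms3": ["a", "b", "d"],
}
fixed = shared_ngrams(toy_manuscripts, n=2, fixed_order=True)
free = shared_ngrams(toy_manuscripts, n=2, fixed_order=False)
```

With fixed order, the pair `("a", "b")` is shared by two of the toy manuscripts; with free order it is shared by all three, since `"b a"` in the second one is counted as the same unit. Comparing the two settings in this way is one simple means of separating order-bound grammatical patterns from looser co-occurrence.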

Keywords

textual corpus; manuscripts.ru; Glagolitic manuscript; linguistic statistics; n-gram module; collocation; colligation

Hrčak ID:

231473

URI

https://hrcak.srce.hr/231473

Publication date:

30.12.2019.
