Exploring the Interplay of Lexis and Grammar Through N-Grams in English and Croatian

Borucinsky, Mirjana

doi:10.31820/f.37.2.3

FLUMINENSIA : časopis za filološka istraživanja, Vol. 37 No. 2, 2025.

Original scientific paper

https://doi.org/10.31820/f.37.2.3

Mirjana Borucinsky (ORCID: orcid.org/0000-0002-1132-9720), Sveučilište u Rijeci, Pomorski fakultet

pp. 427-457

Abstract

This paper examines the intricate relationship between lexis and grammar by studying N-grams in English and Croatian. N-grams, i.e. sequences of N words extracted from corpora, are also referred to as lexical bundles (Biber et al. 1999), clusters (Scott 2008), chains (Stubbs 2001; Stubbs & Barth 2003), recurrent sequences (De Cock 2004), and recurrent word combinations (Altenberg 1998). Lexical bundles (e.g. as well as, in order to, in case of, in terms of) are formulaic sequences that provide “the building blocks of coherent discourse” (Hyland 2008: 6). The lexical bundle approach has focused mainly on register and text-type differences (Hyland 2008), seeking to determine whether there is a discipline-specific lexical repertoire or a core vocabulary. One such attempt is described in Borucinsky and Pritchard (2022). However, since lexical bundles are incomplete structural units that cut across grammatical structures (Biber, Conrad and Cortes 2004), there is room for further applications of this approach, both in linguistics and in language teaching (e.g. in cross-linguistic analyses), as suggested by Römer (2009). One such application is understanding the interplay between lexis and grammar.

Hence, this paper aims to answer the following question: What can we uncover about the interplay of lexis and grammar through corpus-based research by studying multi-word expressions (MWEs), and in particular lexical bundles? The starting point of the analysis is the English language, while a contrastive analysis aims to provide insight into MWEs in Croatian, where these expressions are almost entirely unexplored. We use Sketch Engine (Kilgarriff et al. 2004) to extract N-grams from English and Croatian corpora (enTenTen21 and MaCoCu), and to identify four-word lexical bundles based on the following criteria: (1) a cut-off frequency of at least ten occurrences (cf. Biber et al. 1999); (2) the average reduced frequency (Hlaváčová 2006); and (3) exclusion criteria (cf. Salazar 2014) to eliminate noise resulting from corpus processing. We focus only on NP- and PP-based lexical bundles and study them cross-linguistically.

Furthermore, special attention is paid to grammatical variation or syntactic synonyms, where Ng stands for a nominal group, such as in case of + Ng (e.g. in case of legal dispute), in case + CLAUSE (e.g. in cases that could cause conflict); in the event of + Ng (e.g. in the event of an emergency), in the event + CLAUSE (e.g. in the event that no personal representative has been appointed). These forms show that grammatical structures are motivated by meaning and that the boundary between lexis and grammar is fluid. Contrastive analysis brings this out even more clearly, as illustrated by the following examples: u slučaju + Ng (e.g. u slučaju pogiblji, lit. ‘in case of distress’) and u slučaju + CLAUSE (e.g. u slučaju da članica ne odgovori na upozorenje, lit. ‘in case that the member does not respond to the warning’; u slučaju kada sud donese rješenje, lit. ‘in case when the court passes a decision’).

The contribution of this paper is threefold. Empirically, it provides new data on the use of lexico-grammatical constructions in Croatian. Theoretically, it confirms that the structure of these constructions is motivated by meaning and that it interacts with grammar in different ways in the two languages. Methodologically, the study demonstrates the effectiveness of combining frequency-based and functional analysis in contrastive lexico-grammar. The results may have multiple applications: in language teaching, where raising awareness of frequent patterns can contribute to the development of language competence; in contrastive research that links lexico-grammatical patterns across different languages; and in advancing corpus methodology, particularly with regard to more precise annotation of nominal groups and their postmodifiers.
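The first two selection criteria named in the abstract (a frequency cut-off and the average reduced frequency) can be illustrated with a minimal Python sketch. It is not the Sketch Engine pipeline used in the study: the function names, the whitespace-tokenised input, and the reading of the cut-off as a raw count of ten are assumptions made here for illustration. The ARF computation follows the commonly cited definition, ARF = (1/v) · Σ min(d_i, v) with v = N/f, where the d_i are the distances between consecutive occurrences in a corpus treated as circular.

```python
from collections import defaultdict


def n_grams(tokens, n=4):
    """Yield (start_position, n-gram tuple) for every window of n tokens."""
    for i in range(len(tokens) - n + 1):
        yield i, tuple(tokens[i:i + n])


def average_reduced_frequency(positions, corpus_size):
    """ARF = (1/v) * sum(min(d_i, v)), with v = corpus_size / f.

    d_i are the distances between consecutive occurrences; the corpus is
    treated as circular, so the distances sum to corpus_size.
    """
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f
    ordered = sorted(positions)
    distances = [b - a for a, b in zip(ordered, ordered[1:])]
    distances.append(corpus_size - ordered[-1] + ordered[0])  # wrap-around gap
    return sum(min(d, v) for d in distances) / v


def extract_bundles(tokens, n=4, min_freq=10):
    """Return {bundle: (raw frequency, ARF)} for n-grams meeting the cut-off.

    min_freq is interpreted as a raw count (an assumption of this sketch).
    """
    positions = defaultdict(list)
    for start, gram in n_grams(tokens, n):
        positions[gram].append(start)
    size = len(tokens)
    return {
        " ".join(gram): (len(pos), average_reduced_frequency(pos, size))
        for gram, pos in positions.items()
        if len(pos) >= min_freq
    }
```

An N-gram spread evenly through the corpus has an ARF close to its raw frequency, while one clustered in a single stretch has an ARF close to 1, which is why ARF helps filter out bundles that are frequent only because one document repeats them. The third criterion (exclusion of processing noise) is not modelled here.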

Keywords

lexicogrammar; corpus linguistics; N-grams; lexical bundles; English; Croatian

Hrčak ID:

342893

URI

https://hrcak.srce.hr/342893

Publication date:

31 December 2025
