EXTRACTING ENGLISH WORDS FROM A CORPUS OF CROATIAN

Borucinsky, Mirjana; Bogunović, Irena

doi:10.31820/f.34.2.13

FLUMINENSIA : Journal for philological research, Vol. 34 No. 2, 2022.

Original scientific paper

https://doi.org/10.31820/f.34.2.13


Mirjana Borucinsky orcid.org/0000-0002-1132-9720 ; University of Rijeka, Faculty of Maritime Studies
Irena Bogunović orcid.org/0000-0002-2956-7014 ; University of Rijeka, Faculty of Maritime Studies

Full text: Croatian, PDF (1,656 KB)

Pages 435-461


Abstract

As the lingua franca of the modern age, English has become the dominant donor language for many languages, including Croatian. The influence of English on Croatian is evident across registers and linguistic levels, especially the lexical level. Recently, a growing number of English words have appeared in Croatian in unadapted form (e.g., freelancer, chat, e-mail), especially in the news and on social media. English words can be extracted from corpora in three ways: manually, with existing corpus linguistics tools, or with newly developed tools. This paper examines whether the existing tools for Croatian can yield a list of unadapted English words. To that end, the Croatian web corpus (hrWaC) was analysed using the Sketch Engine platform, and a list of 1217 English words was compiled with this method. The results show that a list of English words and their frequencies can be compiled with the available tools and resources for Croatian, but also that many problems remain, so the results cannot be considered completely reliable. Moreover, the procedure still has to be combined with other manual methods and classifications, and new tools are needed for the automatic extraction of English words from a corpus of Croatian.
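
The general idea of automatic extraction can be illustrated with a minimal sketch. It is not the Sketch Engine procedure used in the paper: it assumes two hypothetical toy word lists (ENGLISH_WORDS and CROATIAN_WORDS), a simple regular-expression tokenizer, and an invented sample sentence, and it keeps only tokens that occur in the English list but not in the Croatian one.

```python
import re
from collections import Counter

# Hypothetical toy word lists (assumptions for illustration only);
# a real study would rely on full lexicons for both languages.
ENGLISH_WORDS = {"freelancer", "chat", "e-mail", "show", "top"}
CROATIAN_WORDS = {"top", "je", "u", "i", "na", "poslao", "radi", "kao"}

# Letters (including č, ć, š, ž, đ), optionally joined by hyphens.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def candidate_english_words(text, english=ENGLISH_WORDS, croatian=CROATIAN_WORDS):
    """Count tokens found in the English list but absent from the Croatian list."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    return Counter(t for t in tokens if t in english and t not in croatian)


# Invented sample sentence, not taken from hrWaC.
sample = "Radi kao freelancer i poslao je e-mail. Freelancer je top."
counts = candidate_english_words(sample)
print(counts.most_common())  # [('freelancer', 2), ('e-mail', 1)]
```

The token "top" is dropped because it is also a Croatian word ("cannon"). Homographs of this kind are one reason a purely list-based filter cannot be considered fully reliable and still needs manual checking.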

Keywords

English words; Croatian language; corpus linguistics

Hrčak ID:

289279

URI

https://hrcak.srce.hr/289279

Publication date:

30.12.2022.
