Original scientific paper
https://doi.org/10.5673/sip.60.2.8
German Loanwords in Digital Environment
Lidija Tepeš Golubić
Zagreb University of Applied Sciences, Croatia
Abstract
This research is based on German loanwords compiled from dictionaries of the Croatian
language, dictionaries of foreign words, and a PhD thesis on German loanwords in the
Croatian language. The list of German loanwords found in contemporary Croatian texts was
analysed using language technologies, supported by subsequent manual data processing.
The compiled list of loanwords allowed for their computational analysis in the hrWaC web
corpus, which comprises texts from the whole .hr domain collected over a period of four
years.
This research has established the presence of German loanwords in contemporary Croatian
texts. Although the contemporary dictionaries of the Croatian language and the other sources
consulted during the research contained a total of 17 988 German loanword lemmas, we were
able to confirm usage in contemporary texts for only 8 400 of them.
Applying the automatic detection method, we found all German loanwords used in contemporary
texts in all of their forms, simultaneously creating a frequency dictionary of German
loanwords. The 8 400 confirmed lemmas show that the lexical wealth of the Croatian language
recorded in dictionaries has not been lost in contemporary Croatian. On the contrary, it has
systematically entered the web corpus of the Croatian language and is found in texts that do
not necessarily belong to the standard language.
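The abstract does not describe the detection method in detail. The following is only a minimal sketch of how a loanword list could be matched against a lemmatised corpus to build a frequency dictionary. It assumes the corpus is available as (word form, lemma) pairs; the function names, the toy corpus, and the three sample words are illustrative and are not taken from the paper's data.

```python
from collections import Counter


def loanword_frequencies(tokens, loanword_lemmas):
    """Count occurrences per loanword lemma and collect attested word forms.

    tokens: iterable of (word_form, lemma) pairs from a lemmatised corpus
            (assumed input format, not the paper's actual pipeline).
    loanword_lemmas: set of lowercase lemmas from the compiled loanword list.
    """
    lemma_counts = Counter()
    forms_by_lemma = {}
    for form, lemma in tokens:
        key = lemma.lower()
        if key in loanword_lemmas:
            lemma_counts[key] += 1
            forms_by_lemma.setdefault(key, set()).add(form.lower())
    return lemma_counts, forms_by_lemma


def coverage(lemma_counts, loanword_lemmas):
    """Share of listed lemmas that are attested at least once in the corpus."""
    if not loanword_lemmas:
        return 0.0
    return len(lemma_counts) / len(loanword_lemmas)


# Toy example with hypothetical data: three listed lemmas, four corpus tokens.
loanwords = {"šalter", "cigla", "šnajder"}
corpus = [
    ("šaltera", "šalter"),
    ("Šalter", "šalter"),
    ("cigle", "cigla"),
    ("kuća", "kuća"),
]
counts, forms = loanword_frequencies(corpus, loanwords)
share = coverage(counts, loanwords)
```

Counting by lemma rather than by surface form is what lets inflected forms be grouped under one dictionary entry. Applied to the figures reported above, the same coverage ratio is 8 400 / 17 988, or about 46.7% of the listed lemmas.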
Keywords
digital environment; Croatian web corpus hrWaC; computational analysis; German loanword; lemma
Hrčak ID:
285805
Publication date:
17.11.2022.