Filologija, No. 84, 2025.
Original scientific paper
https://doi.org/10.21857/y26kecd1n9
Why are language technologies important for the future of the Croatian language?
Marko Tadić
Abstract
The paper defines Language Technologies (LT), describes their composition, and gives an overview of the field. It also presents the results of two large-scale evaluation campaigns that assessed the development status of LT for dozens of European languages, including Croatian. In the META-NET campaign in 2011, Croatian was placed in a group of languages with underdeveloped language technologies; in the European Language Equality campaign in 2021, it was placed in a group of twenty languages with partially developed language technologies, which indicates relative progress. However, large language models (LLMs) have introduced a paradigm shift in the development of language technologies: a large number of the language tools developed so far will have to be rebuilt with a new underlying methodology, i.e. one that uses LLMs. The paper further gives an overview of the basic types of LLMs and explains the difference between artificial intelligence and LLMs. It concludes by pointing out the need for continuous development of LT for Croatian, based on the continuous development of new LLMs for Croatian in line with each new language model architecture that emerges. This requires unprecedented quantities of textual data in Croatian. Otherwise, the Croatian language may experience “digital illiteracy” and remain on the wrong side of the digital divide.
Keywords
Language Technologies, Croatian language, Large Language Models
Hrčak ID: 339779
Publication date: 24.11.2025.