Original scientific paper
https://doi.org/10.17234/SRAZ.70.6
Compiling a Corpus for Term Extraction Aimed at the Socioterminological Systematization of Mechatronics Terminology
Ivana Jurković
Fakultet strojarstva i brodogradnje Sveučilišta u Zagrebu, Zagreb, Croatia
Dalibor Vrgoč
orcid.org/0000-0002-6259-0187
Institut za hrvatski jezik, Zagreb, Croatia
Abstract
This paper explores the methodological framework for compiling a specialized corpus
intended for term extraction, with the objective of supporting the socioterminological
systematization of mechatronics terminology. The primary aim of the study is
to develop empirically grounded strategies for corpus compilation that enhance
the reliability and efficiency of term extraction. It was hypothesized that a corpus-driven
approach, when applied exclusively to a didactic subcorpus (mechatronics
textbooks), allows for a more precise extraction of term candidates than the approach
involving an academic subcorpus (scientific papers in the field of mechatronics). To
test this hypothesis, two subcorpora of mechatronics texts written in English were
compiled (didactic and academic). Terms were extracted from both the didactic and
the academic subcorpus of mechatronics texts and compared to the items extracted
by other extraction functions available in Sketch Engine. The comparative statistical
analysis has demonstrated that the didactic subcorpus allows for the extraction of
more term candidates, while the academic subcorpus produces more noise. Thus, it
is argued that a smaller, balanced didactic corpus is more suitable for term extraction
aimed at the socioterminological systematization of mechatronics terminology than
a larger corpus involving the academic subcorpus. The contribution of the outlined
methodology is reflected in greater efficiency in socioterminological systematization
of mechatronics terminology. It is therefore proposed that this methodology may be
effectively extended to other interdisciplinary domains.
Keywords
corpus-driven approach; mechatronics; socioterminology; term extraction; terminology systematization
Hrčak ID:
345580
Publication date:
18.12.2025.