
Original scientific paper

https://doi.org/10.17234/SRAZ.70.6

Compiling a Corpus for Term Extraction Aimed at the Socioterminological Systematization of Mechatronics Terminology

Ivana Jurković ; Fakultet strojarstva i brodogradnje Sveučilišta u Zagrebu, Zagreb, Croatia
Dalibor Vrgoč (ORCID: https://orcid.org/0000-0002-6259-0187) ; Institut za hrvatski jezik, Zagreb, Croatia


Full text: English, PDF (580 KB)

pp. 99-122


Abstract

This paper explores the methodological framework for compiling a specialized corpus
intended for term extraction, with the objective of supporting the socioterminological
systematization of mechatronics terminology. The primary aim of the study is
to develop empirically grounded strategies for corpus compilation that enhance
the reliability and efficiency of term extraction. It was hypothesized that a corpus-driven
approach, when applied exclusively to a didactic subcorpus (mechatronics
textbooks), allows for a more precise extraction of term candidates than the approach
involving an academic subcorpus (scientific papers in the field of mechatronics). To
test this hypothesis, two subcorpora of mechatronics texts written in English were
compiled (didactic and academic). Terms were extracted from both the didactic and
the academic subcorpus of mechatronics texts and compared to the items extracted
by other extraction functions available in Sketch Engine. The comparative statistical
analysis demonstrated that the didactic subcorpus allows for the extraction of
more term candidates, while the academic subcorpus produces more noise. Thus, it
is argued that a smaller, balanced didactic corpus is more suitable for term extraction
aimed at the socioterminological systematization of mechatronics terminology than
a larger corpus involving the academic subcorpus. The outlined methodology
contributes to a more efficient socioterminological systematization
of mechatronics terminology, and it is proposed that it may be effectively extended
to other interdisciplinary domains.

Keywords

corpus-driven approach; mechatronics; socioterminology; term extraction; terminology systematization

Hrčak ID:

345580

URI

https://hrcak.srce.hr/345580

Publication date:

18 December 2025
