Original scientific paper

https://doi.org/10.17234/SRAZ.70.6

Compiling a Corpus for Term Extraction Aimed at the Socioterminological Systematization of Mechatronics Terminology

Ivana Jurković ; Faculty of Mechanical Engineering and Naval Architecture, University of Zagreb, Zagreb, Croatia
Dalibor Vrgoč (ORCID: 0000-0002-6259-0187) ; Institute for the Croatian Language, Zagreb, Croatia


page 99-122

Abstract

This paper explores the methodological framework for compiling a specialized corpus
intended for term extraction, with the objective of supporting the socioterminological
systematization of mechatronics terminology. The primary aim of the study is
to develop empirically grounded strategies for corpus compilation that enhance
the reliability and efficiency of term extraction. It was hypothesized that a corpus-driven
approach, when applied exclusively to a didactic subcorpus (mechatronics
textbooks), allows for a more precise extraction of term candidates than the approach
involving an academic subcorpus (scientific papers in the field of mechatronics). To
test this hypothesis, two subcorpora of mechatronics texts written in English were
compiled (didactic and academic). Terms were extracted from both the didactic and
the academic subcorpus of mechatronics texts and compared to the items extracted
by other extraction functions available in Sketch Engine. The comparative statistical
analysis has demonstrated that the didactic subcorpus allows for the extraction of
more term candidates, while the academic subcorpus produces more noise. Thus, it
is argued that a smaller, balanced didactic corpus is more suitable for term extraction
aimed at the socioterminological systematization of mechatronics terminology than
a larger corpus involving the academic subcorpus. The outlined methodology
contributes to greater efficiency in the socioterminological systematization
of mechatronics terminology, and it is proposed that the methodology may be
effectively extended to other interdisciplinary domains.

Keywords

corpus-driven approach; mechatronics; socioterminology; term extraction; terminology systematization

Hrčak ID:

345580

URI

https://hrcak.srce.hr/345580

Publication date:

18.12.2025.
