Original scientific paper
https://doi.org/10.31724/rihjj.49.2.12
Lithuanian-English Cybersecurity Termbase: Principles of Data Collection and Structuring
Sigita Rackevičienė
Mykolas Romeris University, Faculty of Human and Social Studies, Institute of Humanities
Andrius Utka
orcid.org/0000-0001-5212-4310
Vytautas Magnus University, Institute of Digital Resources and Interdisciplinary Research
Agnė Bielinskienė
Vytautas Magnus University, Institute of Digital Resources and Interdisciplinary Research
Liudmila Mockienė
orcid.org/0000-0001-7153-7276
Mykolas Romeris University, Faculty of Human and Social Studies, Institute of Humanities
Abstract
The aim of the paper is to present the compilation and structuring principles, the scope and the development possibilities of a bilingual Lithuanian-English cybersecurity termbase. The paper discusses different approaches to terminology management, whose best practices have been used to collect cybersecurity terminology and compile the termbase. Data collection has been based mainly on semasiological and corpus-driven approaches, which involved creating deep learning systems trained to extract terminology from cybersecurity corpora. To make the dataset systematic and comprehensive, onomasiological and corpus-based approaches have also been incorporated into the data collection process. The termbase design decisions (its macrostructure and microstructure) have been based on onomasiological principles, while term variation has been handled by applying the descriptive approach. The termbase has been developed in Terminologue, an open-source, cloud-based terminology management platform. To ensure interoperability, the termbase has been exported to the TBX format and deposited in the CLARIN-LT repository. The paper also discusses the possibilities of publishing the terminological data as linguistic linked open data and linking it with other terminological resources and cybersecurity ontologies. The termbase is expected to be useful for cybersecurity specialists, translators, terminographers, lexicographers and the general public, and to contribute to the development of Lithuanian cybersecurity terminology.
Keywords
cybersecurity; termbase; terminology management; termbase structure; LLOD
Hrčak ID: 311289
Publication date: 13.12.2023.