Preliminary communication

https://doi.org/10.31724/rihjj.46.2.21

“Deep lexicography” – Fad or Opportunity?

Nikola Ljubešić (ORCID: orcid.org/0000-0001-7169-9152); Jožef Stefan Institute



Pages: 839-852



Abstract

In recent years, we have witnessed staggering improvements in various semantic data processing tasks due to developments in deep learning, ranging from image and video processing to speech processing and natural language understanding. In this paper, we discuss the opportunities and challenges that these developments pose for electronic lexicography. We focus primarily on representation learning of the basic elements of language, namely words, and on the applicability of these word representations to lexicography. We first discuss well-known approaches to learning static representations of words, the so-called word embeddings, and their use in lexicography-related tasks such as semantic shift detection and cross-lingual prediction of lexical features such as concreteness and imageability. We conclude with the most recent developments in word representation learning, namely learning dynamic, context-aware representations of words, showcasing some dynamic word embedding examples and discussing improvements on the lexicography-relevant tasks of word sense disambiguation and word sense induction.

Keywords

digital lexicography; deep learning; representation learning

Hrčak ID:

245473

URI

https://hrcak.srce.hr/245473

Publication date:

30.10.2020.
