Original scientific paper
https://doi.org/10.1080/00051144.2021.1928437
Representing word meaning in context via lexical substitutes
Domagoj Alagić
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Jan Šnajder
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Abstract
Representing the meaning of individual words is crucial for most natural language processing (NLP) tasks. This, however, is a challenge because word meaning often depends on the context. Recent approaches to representing word meaning in context rely on lexical substitution (LS), where a word is represented with a set of meaning-preserving substitutes. While this representation has face validity, it is not clear to what extent it corresponds to the more established sense-based representation required for many NLP tasks. We present an empirical study that addresses this question by quantifying the correspondence between substitute-based and sense-based meaning representations. We compile a high-quality dataset annotated with lexical substitutes and sense labels from two well-established sense inventories, and conduct a correlation analysis using a number of substitute-based similarity measures. Furthermore, as recent work has demonstrated the efficacy of system-produced substitutes for word meaning representation, we compare human-produced and system-produced substitutes to determine the performance gap between the two. Lastly, we investigate to what extent the results translate to the fundamental semantic task of word sense induction (WSI). Our experiments show the validity of LS for representing word meaning in context and justify the use of system-produced substitutes for WSI.
Keywords
Natural language processing; machine learning; lexical substitution; word meaning in context; word sense induction
Hrčak ID:
269830
Publication date:
4 June 2021