Integration of Multiword Expression Recognition in Parsers, Mathieu Constant (Univ. Marne-La-Vallee)
Automatic linguistic analysis faces two major problems inherent in natural languages: ambiguity and multiword expressions (MWEs). Whereas the literature abounds in analyzers that deal with ambiguity, few studies have tackled the integration of MWE recognition. Because these expressions are, by definition, non-compositional to some degree (e.g. eau de vie ‘brandy’, perdre la boule ‘to go crazy’), recognizing them is crucial for applications such as Machine Translation.
In this talk, we will focus on the integration of compounds (i.e. a type of contiguous MWE) into parsers. We will tackle this problem with a hybrid approach combining statistical models and symbolic linguistic resources. We will show that such an approach improves not only compound recognition but also overall parsing accuracy. We will consider several strategies for constituency parsing as well as for dependency parsing. In particular, we will experimentally compare joint strategies with pipeline ones.
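As a rough illustration of what a pipeline strategy can look like, the sketch below pre-recognizes compounds with a lexicon lookup and merges each one into a single token before the sentence would be handed to a parser. It is a toy example under stated assumptions, not the method presented in the talk: the function name `merge_compounds`, the underscore-joining convention, and the tiny lexicon are all hypothetical, and the statistical component of the hybrid approach is left out entirely.

```python
from typing import List, Set, Tuple


def merge_compounds(
    tokens: List[str],
    lexicon: Set[Tuple[str, ...]],
    max_len: int = 4,
) -> List[str]:
    """Greedy longest-match lookup: each contiguous span found in the
    lexicon is replaced by one token whose parts are joined with '_'.

    Hypothetical helper for illustration only; lexicon entries are
    assumed to be lowercase tuples of word forms.
    """
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate span first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in lexicon:
                match_len = n
                break
        if match_len:
            merged.append("_".join(tokens[i:i + match_len]))
            i += match_len
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy lexicon (an assumption, standing in for a symbolic resource).
toy_lexicon = {("eau", "de", "vie"), ("pomme", "de", "terre")}
print(merge_compounds("Il boit une eau de vie".split(), toy_lexicon))
```

In a pipeline of this kind the parser only ever sees the merged token, so a wrong lexicon match cannot be undone later. A joint strategy would instead leave the tokens separate and let the parser itself decide whether a span forms a compound.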