Maximin Coavoux

CNRS Researcher (Chargé de recherche)

Laboratoire d'Informatique de Grenoble (LIG), Université Grenoble-Alpes

Title: Structural Biases for Compositional Semantic Prediction

Scientific context

Compositionality is a foundational hypothesis in formal semantics: it states that the semantic interpretation of an utterance is a function of its parts and of how they are combined (i.e. their syntactic structure). In NLP, the current dominant paradigm is to design end-to-end models with no intermediate linguistically interpretable representations, a choice often motivated by the observation that pretrained language models implicitly encode latent syntactic representations. However, recent studies suggest that the syntactic information learned by language models is insufficient and that, in their current form, these models are unable to exploit the syntactic information provided in their input when they need to generate a structured output.

Strikingly, most systems that obtained decent results on compositional generalization benchmarks either (i) include some data augmentation methods that increase the exposure of the model to diverse syntactic structures at training time, or (ii) resort to a natural language parser and hand-crafted rules to derive the semantic representation from the syntactic tree. These two approaches are effective, but they still have limitations that need to be addressed. First, data augmentation bypasses the issue altogether, is tied to a particular dataset or task, and requires additional computation, both for generating new data and for re-training or fine-tuning models. Second, approach (ii) abandons the seq2seq framework for a conceptually more complex one, and often relies on architectures that are tied to specific data or tasks. In contrast, we believe that with proper built-in inductive biases, a seq2seq model might provide a simple yet effective solution to the structural compositionality issue.

PhD Proposal

The goal of this PhD will be to explore inductive biases related to linguistic structures, in an attempt to build small NLP models with compositional skills, i.e. models with built-in knowledge that enables them to infer generalization rules from few data points. Research directions will be defined together with the successful applicant (who is encouraged to bring their own ideas!) and may include:

