
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models

Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen

2025-03-07


Summary

This paper examines how AI language models learn to understand and use multiple languages, focusing on what happens when a model that already knows one language begins learning a second one.

What's the problem?

We don't fully understand how AI models that work with multiple languages actually transfer knowledge between those languages. This makes it hard to improve these models or to use them to study how humans learn languages.

What's the solution?

The researchers trained small AI models on two languages, carefully controlling how much data the models saw for each language and the order in which the languages were introduced. They then used structural priming, a technique usually used to study how humans process grammar, to test whether the models were developing shared grammatical representations across the two languages. They found that some language pairs shared more than others, that the effects were asymmetric depending on direction, and that the order in which the languages were learned mattered.
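
To make the structural priming idea concrete, here is a minimal sketch (not the authors' code) of how a priming effect can be quantified with a causal language model: compare the probability the model assigns to a target sentence after a structurally congruent prime versus an incongruent one. The model name "gpt2" is a placeholder for the small bilingual models trained in the paper, and the example sentences are hypothetical.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")              # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def target_logprob(prime: str, target: str) -> float:
    """Log-probability of `target` given `prime` as preceding context."""
    prime_ids = tokenizer(prime, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prime_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probs for each token, predicted from the preceding position.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    start = prime_ids.shape[1] - 1        # first prediction of a target token
    token_lps = log_probs[start:, :].gather(
        1, input_ids[0, prime_ids.shape[1]:].unsqueeze(1)
    )
    return token_lps.sum().item()

# Hypothetical Spanish primes (prepositional-object vs. double-object dative)
# followed by an English target with prepositional-object structure.
congruent_prime = "La profesora dio un libro a la estudiante."
incongruent_prime = "La profesora le dio a la estudiante un libro."
target = "The teacher gave a book to the student."

# A positive difference means the congruent prime boosts the target's
# probability, i.e., evidence of a crosslingual structural priming effect.
priming_effect = target_logprob(congruent_prime, target) - target_logprob(
    incongruent_prime, target
)
print(f"Structural priming effect (log-prob difference): {priming_effect:.3f}")
```

In the paper, comparisons like this are run across many prime-target pairs and language pairs, which is what reveals the asymmetries between languages and directions described above.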

Why it matters?

This matters because it helps us understand how AI models learn multiple languages, which could lead to better multilingual AI systems. It also offers new insights into how humans might learn languages, particularly the idea that some language pairs may be easier to learn together than others. This research could help improve language learning for both AI systems and humans.

Abstract

While crosslingual transfer is crucial to contemporary language models' multilingual capabilities, how it occurs is not well understood. In this paper, we ask what happens to a monolingual language model when it begins to be trained on a second language. Specifically, we train small bilingual models for which we control the amount of data for each language and the order of language exposure. To find evidence of shared multilingual representations, we turn to structural priming, a method used to study grammatical representations in humans. We first replicate previous crosslingual structural priming results and find that after controlling for training data quantity and language exposure, there are asymmetrical effects across language pairs and directions. We argue that this asymmetry may shape hypotheses about human structural priming effects. We also find that structural priming effects are less robust for less similar language pairs, highlighting potential limitations of crosslingual transfer learning and shared representations for typologically diverse languages.