Revisiting In-Context Learning with Long Context Language Models
Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar
2024-12-24

Summary
This paper examines how In-Context Learning (ICL) behaves with Long Context Language Models (LCLMs). It explores whether the way examples are chosen still affects the model's performance once it can use many examples at once.
What's the problem?
In the past, ICL was limited by the size of the context window, meaning models could only use a few examples to learn from. This made it important to carefully select which examples to show. However, with LCLMs, which can handle more examples, it's unclear if the method of selecting examples still matters as much.
What's the solution?
The authors ran experiments with LCLMs on 18 datasets spanning 4 tasks to see how different example selection methods affected performance. Surprisingly, they found that sophisticated selection techniques didn't improve results much compared to simply choosing examples at random. Instead, the main challenge shifted to collecting enough examples to fill the context window. By adding more examples through a simple data augmentation method, they improved performance by 5%.
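To make the random-selection baseline concrete, here is a minimal sketch of how a many-shot ICL prompt might be assembled by sampling demonstrations uniformly at random. The function name, prompt format, and example count are illustrative assumptions, not the authors' actual pipeline.

```python
import random

def build_many_shot_prompt(train_examples, query, k, seed=0):
    """Build a many-shot ICL prompt from k randomly sampled demonstrations.

    train_examples: list of (input_text, label) pairs.
    k: number of in-context examples, chosen (hypothetically) to roughly
       fill the model's long context window.
    """
    rng = random.Random(seed)
    shots = rng.sample(train_examples, min(k, len(train_examples)))
    lines = [f"Input: {text}\nOutput: {label}" for text, label in shots]
    # Append the query with an empty output slot for the model to complete.
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)
```

The point of the sketch is that, in the many-shot regime the paper studies, this kind of uniform sampling performed about as well as more elaborate selection strategies.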
Why it matters?
This research is important because it helps us understand how to use large language models more effectively. By showing that simpler methods can work well even with many examples, it opens up new possibilities for using ICL in real-world applications where plenty of examples are available.
Abstract
In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, their context window size imposed a limit on the number of examples that can be shown, making example selection techniques crucial for identifying the maximally effective set of examples. However, the recent advent of Long Context Language Models (LCLMs) has significantly increased the number of examples that can be included in context, raising an important question of whether ICL performance in a many-shot regime is still sensitive to the method of sample selection. To answer this, we revisit these approaches in the context of LCLMs through extensive experiments on 18 datasets spanning 4 tasks. Surprisingly, we observe that sophisticated example selection techniques do not yield significant improvements over a simple random sample selection method. Instead, we find that the advent of LCLMs has fundamentally shifted the challenge of ICL from that of selecting the most effective examples to that of collecting sufficient examples to fill the context window. Specifically, in certain datasets, including all available examples does not fully utilize the context window; however, by augmenting the examples in context with a simple data augmentation approach, we substantially improve ICL performance by 5%.
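The abstract mentions "a simple data augmentation approach" for filling the context window but does not spell it out, so the following is only a guess at its general shape: prompt the model to produce additional labeled examples from the ones already available and keep the well-formed ones. The `generate` callable and the "Input:/Output:" format are assumptions made for illustration.

```python
def augment_examples(examples, generate, n_new):
    """Hypothetical sketch: grow the in-context example pool with model-generated examples.

    examples: list of (input_text, label) pairs.
    generate: placeholder callable mapping a prompt string to generated text
              (e.g. a wrapper around any LLM API); assumed, not from the paper.
    """
    augmented = list(examples)
    for i in range(n_new):
        seed_text, seed_label = examples[i % len(examples)]
        prompt = (
            "Write one new example similar to the following, using the format "
            "'Input: <text>' on one line and 'Output: <label>' on the next.\n\n"
            f"Input: {seed_text}\nOutput: {seed_label}\n"
        )
        reply = generate(prompt)
        lines = [line for line in reply.splitlines() if line.strip()]
        # Keep the generated example only if it follows the expected two-line format.
        if len(lines) >= 2 and lines[0].startswith("Input:") and lines[1].startswith("Output:"):
            augmented.append((lines[0][len("Input:"):].strip(),
                              lines[1][len("Output:"):].strip()))
    return augmented
```

Under this kind of scheme, the augmented pool would then be fed into the many-shot prompt builder above; the paper reports that filling the context this way improves ICL performance by about 5% on datasets whose example pools are otherwise too small.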