SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang, Shiqi Chen, Xiangming Gu, Tianyu Pang, Min-Yen Kan

2025-03-20

Summary

This paper is about making AI language models train faster and perform better by scheduling the context window, the amount of text the model sees at once, from short to long over the course of training.

What's the problem?

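Under a fixed training-token budget, AI models pretrained with long context windows are not better than those pretrained with shorter windows. In the authors' pilot study, the short-window models consistently came out ahead, even though a longer window lets a model see more text at once.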

What's the solution?

The researchers created SkyLadder, a method that starts with short text windows and gradually increases them during training, allowing the AI to learn efficiently.

Why does it matter?

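This work matters because it shortens training (up to 22% faster than baselines in the paper's experiments) while keeping standard benchmark performance strong and producing models that can still process long pieces of text, which is useful for tasks like understanding books or summarizing long conversations.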

Abstract

Recent advancements in LLM pretraining have featured ever-expanding context windows to process longer sequences. However, our pilot study reveals that models pretrained with shorter context windows consistently outperform their long-context counterparts under a fixed token budget. This finding motivates us to explore an optimal context window scheduling strategy to better balance long-context capability with pretraining efficiency. To this end, we propose SkyLadder, a simple yet effective approach that implements a short-to-long context window transition. SkyLadder preserves strong standard benchmark performance, while matching or exceeding baseline results on long context tasks. Through extensive experiments, we pre-train 1B-parameter models (up to 32K context) and 3B-parameter models (8K context) on 100B tokens, demonstrating that SkyLadder yields consistent gains of up to 3.7% on common benchmarks, while achieving up to 22% faster training speeds compared to baselines. The code is at https://github.com/sail-sg/SkyLadder.
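
To make the short-to-long idea concrete, here is a minimal sketch of a context window schedule. It is not taken from the SkyLadder repository: the function name `context_window_at`, the linear ramp shape, the starting length of 32 tokens, the ramp fraction, and the rounding step are all assumptions chosen for illustration. The abstract only states that the window moves from short to long during pretraining.

```python
def context_window_at(step, total_steps, start_len=32, final_len=8192,
                      ramp_fraction=0.5, multiple_of=32):
    """Return an illustrative context window (in tokens) for a training step.

    Assumptions (not from the paper): the window grows linearly from
    `start_len` to `final_len` over the first `ramp_fraction` of training,
    is rounded down to a multiple of `multiple_of`, and then stays at
    `final_len` for the remaining steps.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 < ramp_fraction <= 1:
        raise ValueError("ramp_fraction must be in (0, 1]")

    # Number of steps spent growing the window (at least one).
    ramp_steps = max(1, int(total_steps * ramp_fraction))
    if step >= ramp_steps:
        return final_len

    # Linear interpolation between the short and long window.
    raw = start_len + (final_len - start_len) * step / ramp_steps
    # Round down to a hardware-friendly multiple, then clamp to the range.
    rounded = int(raw) // multiple_of * multiple_of
    return max(start_len, min(final_len, rounded))


if __name__ == "__main__":
    total = 1000
    for s in (0, 125, 250, 375, 500, 999):
        print(s, context_window_at(s, total))
```

In a training loop, the returned value would decide how many tokens each sequence may attend to at that step, for example by chunking packed sequences or by restricting the attention mask. How SkyLadder itself applies the window is described in the paper and the linked code, not in this sketch.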