Language Models can Self-Lengthen to Generate Long Texts
Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, Junyang Lin
2024-11-01

Summary
This paper introduces Self-Lengthen, a new training framework that helps large language models (LLMs) generate longer and more coherent texts by allowing them to expand their own responses.
What's the problem?
While LLMs have improved in understanding long texts, they still struggle to create long, well-structured outputs. This is partly because they were trained mostly on short query-response pairs, which limits their ability to handle longer writing tasks. Existing methods for improving long-text generation, such as instruction backtranslation and behavior imitation, often rely on external data or proprietary models, raising data-quality, copyright, and usage concerns.
What's the solution?
The authors propose Self-Lengthen, an iterative training framework built on two roles: the Generator and the Extender. The Generator creates an initial response to a prompt; the Extender then splits this response and expands each part, producing a longer text. The longer responses are used to train both roles, and the process repeats, so with each iteration the models learn to generate progressively longer and more detailed responses. The framework requires no external data or proprietary models, making it easier to adopt.
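The iterative loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `generate`, `extend`, and `finetune` are hypothetical stand-ins for real LLM inference and fine-tuning calls, and the "models" are plain dictionaries.

```python
# Toy sketch of the Self-Lengthen loop. All functions here are
# illustrative stand-ins, not the authors' actual training code.

def generate(model, prompt):
    # Generator role: produce an initial (short) response to the prompt.
    return model["style"] + " response to: " + prompt

def extend(model, text):
    # Extender role: expand an existing response into a longer one.
    # Here we simply double the text to mimic lengthening.
    return text + " " + text

def finetune(model, examples):
    # Stand-in for fine-tuning: record the longest response length
    # the model has been trained on.
    model["max_len"] = max(len(resp) for _, resp in examples)
    return model

def self_lengthen(prompt, iterations=3):
    generator = {"style": "short", "max_len": 0}
    extender = {"style": "short", "max_len": 0}
    response = generate(generator, prompt)
    for _ in range(iterations):
        # The Extender expands the current response into a longer one.
        longer = extend(extender, response)
        # The (prompt, longer response) pair trains both roles,
        # so the next round starts from a stronger base.
        generator = finetune(generator, [(prompt, longer)])
        extender = finetune(extender, [(prompt, longer)])
        response = longer
    return response

out = self_lengthen("write a story", iterations=3)
print(len(out))  # response length grows with each iteration
```

The key design point the sketch captures is that the training signal is self-generated: each round's expanded output becomes the supervision for the next round, so no external long-text corpus is needed.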
Why it matters?
This research is important because it enhances the ability of LLMs to produce high-quality long texts without needing extensive human input or additional datasets. By improving how these models generate longer content, Self-Lengthen can be useful in various applications such as storytelling, report writing, and any task that requires detailed explanations or narratives.
Abstract
Recent advancements in Large Language Models (LLMs) have significantly enhanced their ability to process long contexts, yet a notable gap remains in generating long, aligned outputs. This limitation stems from a training gap where pre-training lacks effective instructions for long-text generation, and post-training data primarily consists of short query-response pairs. Current approaches, such as instruction backtranslation and behavior imitation, face challenges including data quality, copyright issues, and constraints on proprietary model usage. In this paper, we introduce an innovative iterative training framework called Self-Lengthen that leverages only the intrinsic knowledge and skills of LLMs without the need for auxiliary data or proprietary models. The framework consists of two roles: the Generator and the Extender. The Generator produces the initial response, which is then split and expanded by the Extender. This process results in a new, longer response, which is used to train both the Generator and the Extender iteratively. Through this process, the models are progressively trained to handle increasingly longer responses. Experiments on benchmarks and human evaluations show that Self-Lengthen outperforms existing methods in long-text generation, when applied to top open-source LLMs such as Qwen2 and LLaMA3. Our code is publicly available at https://github.com/QwenLM/Self-Lengthen.