InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities
Shuo Cai, Su Lu, Qi Zhou, Kejing Yang, Zhijie Sang, Congkai Xie, Hongxia Yang
2025-08-08
Summary
This paper introduces InfiAlign, a post-training framework designed to improve large language models' reasoning abilities efficiently, using substantially less data and compute than existing approaches.
What's the problem?
Enhancing reasoning in large language models usually requires large amounts of data and expensive compute, and existing methods often rely on complicated or task-specific tricks that make them difficult to scale.
What's the solution?
The solution is InfiAlign, which combines automated selection of high-quality, diverse training examples from open-source datasets with a post-training recipe of supervised fine-tuning (SFT) followed by Direct Preference Optimization (DPO). By prioritizing data quality and diversity over sheer volume, it boosts reasoning ability without needing large amounts of data.
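To make the DPO stage concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. This is illustrative only, not the paper's actual implementation; the function name, argument names, and the default `beta` value are assumptions for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model.
    """
    # Log-ratio of policy vs. reference for each response
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Implicit reward margin between chosen and rejected, scaled by beta
    margin = beta * (chosen_ratio - rejected_ratio)
    # Negative log-sigmoid of the margin: lower loss when the policy
    # prefers the chosen response more than the reference does
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference assign identical log-probabilities, the margin is zero and the loss equals log 2; as the policy learns to favor the chosen response, the loss decreases.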
Why it matters?
This matters because it provides a practical and scalable way to make AI models smarter at reasoning tasks while saving resources, which makes advanced AI more accessible and useful in real-world applications.
Abstract
InfiAlign, a scalable and sample-efficient post-training framework, combines supervised fine-tuning and Direct Preference Optimization to enhance large language models' reasoning abilities with minimal data and computational cost.