Quartet: Native FP4 Training Can Be Optimal for Large Language Models
Roberto L. Castro, Andrei Panferov, Soroush Tabesh, Oliver Sieberling, Jiale Chen, Mahdi Nikdan, Saleh Ashkboos, Dan Alistarh
2025-05-26
Summary
This paper introduces Quartet, a method for training large language models natively in FP4, a compact 4-bit floating-point number format, which makes training these models cheaper and more efficient.
What's the problem?
Training big language models takes enormous amounts of computing power, time, and money. Even reduced-precision formats like FP8, which use 8 bits per number instead of the usual 16 or 32, still leave training expensive.
What's the solution?
The researchers showed that FP4, which uses only half the bits of FP8, can be used to train these models without losing accuracy. They designed hardware-supported methods that make this work, showing that native FP4 training can match the results of higher-precision training while cutting computational costs.
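To make the idea concrete, here is a minimal, illustrative sketch of FP4 quantization. This is not the paper's actual training kernels; it assumes the standard E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit), whose sixteen representable values are just ±{0, 0.5, 1, 1.5, 2, 3, 4, 6}, and uses a single per-tensor scale:

```python
import numpy as np

# The 8 non-negative values representable in FP4 E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray):
    """Round x to the nearest FP4 E2M1 value, sharing one scale per tensor."""
    scale = np.max(np.abs(x)) / E2M1_GRID[-1]  # map the max magnitude to 6.0
    if scale == 0.0:
        return np.zeros_like(x), 1.0
    scaled = np.abs(x) / scale
    # pick the nearest grid point for each magnitude
    idx = np.abs(scaled[..., None] - E2M1_GRID).argmin(axis=-1)
    q = np.sign(x) * E2M1_GRID[idx]
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

x = np.array([0.1, -0.7, 0.33, 1.0])
q, s = quantize_fp4(x)        # q holds only values from the FP4 grid
x_hat = dequantize(q, s)      # approximate reconstruction of x
```

The core challenge the paper addresses is that this grid is extremely coarse: with only 16 distinct values, naive rounding of weights, activations, and gradients would normally destroy training accuracy, which is why specialized methods and hardware support are needed.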
Why does it matter?
This matters because powerful AI models could be trained much faster and more cheaply, making advanced AI technology more accessible and reducing the environmental impact of training these huge models.
Abstract
Quartet is a hardware-supported FP4 training approach for large language models that achieves state-of-the-art accuracy while significantly reducing computational costs compared to FP8 or higher-precision training.