Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

Zhiwei Hao, Jianyuan Guo, Li Shen, Yong Luo, Han Hu, Guoxia Wang, Dianhai Yu, Yonggang Wen, Dacheng Tao

2025-05-06

Summary

This paper surveys different ways to train large language models using lower-precision arithmetic, which can make training faster and reduce the amount of memory and compute it needs.

What's the problem?

Training big AI models usually takes a lot of time and resources because they rely on high-precision calculations, which makes the process expensive and hard for smaller organizations to afford.

What's the solution?

The researchers reviewed and organized techniques that use lower-precision number formats, such as fixed-point and floating-point representations, along with methods like quantization-aware training that help the model keep learning well despite the reduced precision, making the whole process more efficient.
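To give a rough idea of what low-precision training looks like in practice, here is a minimal sketch using PyTorch's mixed-precision tools. This is a generic illustration of the idea (low-precision compute with full-precision master weights), not a method from the paper; the model, data, and settings are made-up placeholders.

```python
# Minimal sketch (illustrative, not the paper's method): mixed-precision training,
# where forward/backward compute runs in float16 while the master weights and
# optimizer state stay in float32.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid fp16 underflow

for step in range(100):
    x = torch.randn(8, 1024, device="cuda")       # dummy input batch
    target = torch.randn(8, 1024, device="cuda")  # dummy targets
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(dtype=torch.float16):  # low-precision compute
        loss = nn.functional.mse_loss(model(x), target)
    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscale grads; skip the step if inf/nan appear
    scaler.update()
```

The key design point is that the cheap, low-precision math is confined to the forward and backward passes, while the weight updates still happen in higher precision so small gradient values are not lost.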

Why it matters?

This matters because it can make advanced AI technology more affordable and accessible, allowing more people and companies to train and use powerful language models without needing super expensive computers.

Abstract

This survey reviews low-precision training techniques for large language models, categorizing them into fixed-point, floating-point, and customized-format methods, and also discussing quantization-aware training approaches.
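Since the abstract also mentions quantization-aware training, below is a minimal sketch of the fake-quantization idea that such methods typically build on. This is an illustrative example with made-up values, not code from the survey: weights are rounded to a low-precision grid in the forward pass, while a straight-through estimator passes gradients through unchanged in the backward pass.

```python
# Minimal sketch of fake quantization with a straight-through estimator
# (illustrative only; not a specific method from the survey).
import torch

class FakeQuantize(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, num_bits=8):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax      # symmetric per-tensor scale
        return torch.round(w / scale).clamp(-qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None                          # straight-through estimator

w = torch.randn(4, 4, requires_grad=True)
w_q = FakeQuantize.apply(w)   # quantized values are used in the forward pass
loss = (w_q ** 2).sum()
loss.backward()               # gradients still flow to the full-precision weights
```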