EfficientLLM: Efficiency in Large Language Models
Zhengqing Yuan, Weixiang Sun, Yixin Liu, Huichi Zhou, Rong Zhou, Yiyang Li, Zheyuan Zhang, Wei Song, Yue Huang, Haolong Jia, Keerthiram Murugesan, Yu Wang, Lifang He, Jianfeng Gao, Lichao Sun, Yanfang Ye
2025-05-20
Summary
This paper introduces EfficientLLM, a project that studies ways to make large language models run faster and use less computing power without losing their ability to solve problems well.
What's the problem?
The problem is that large language models are very powerful, but they often require a lot of time, energy, and expensive hardware to train and use, which makes them hard for many people to run or to apply to different types of tasks.
What's the solution?
To address this, the researchers tested and compared many different efficiency strategies at each stage of building and using these models, from early training to final use. They found that the best way to make a model efficient depends on the kind of task it is doing, and that some techniques help models work well across different types of data, like text and images.
Why it matters?
This matters because finding the best ways to make large language models more efficient means more people can use them and apply them to a wider range of problems, making AI tools more accessible and practical in everyday life.
Abstract
EfficientLLM evaluates efficiency techniques for LLMs across architecture pretraining, fine-tuning, and inference, demonstrating task-dependent optima and cross-modal generalization.