A Survey of Small Language Models
Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, Junda Wu, Ashish Singh, Yu Wang, Jiuxiang Gu, Franck Dernoncourt, Nesreen K. Ahmed, Nedim Lipka, Ruiyi Zhang, Xiang Chen, Tong Yu, Sungchul Kim, Hanieh Deilamsalehy
2024-10-29

Summary
This paper surveys Small Language Models (SLMs), covering their architectures, training techniques, and the compression methods that let them perform language tasks with minimal computational resources.
What's the problem?
Language models typically demand substantial computational resources, which makes them difficult to deploy in resource-constrained settings such as on-device, mobile, and edge environments. While many techniques exist for making models smaller and more efficient, the field lacks a unified taxonomy for organizing these methods and a consolidated overview of how to benchmark and evaluate the resulting models.
What's the solution?
The authors present a comprehensive survey of SLMs, focusing on their architectures, training techniques, and model compression techniques. They propose a novel taxonomy for categorizing the methods used to optimize SLMs, including model compression, pruning, and quantization. They also summarize the datasets useful for benchmarking SLMs along with commonly used evaluation metrics, and highlight key open challenges that remain to be addressed.
Why it matters?
This survey matters because small yet efficient language models enable capable AI in settings where computational resources are limited, such as mobile and edge devices. By organizing architectures, training techniques, compression methods, and evaluation practices in one place, it serves as a valuable resource for researchers and practitioners developing and deploying SLMs.
Abstract
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources, making them ideal for a range of settings including on-device, mobile, and edge deployments. In this article, we present a comprehensive survey of SLMs, focusing on their architectures, training techniques, and model compression techniques. We propose a novel taxonomy for categorizing the methods used to optimize SLMs, including model compression, pruning, and quantization techniques. We summarize the datasets useful for benchmarking SLMs along with the evaluation metrics commonly used. Additionally, we highlight key open challenges that remain to be addressed. Our survey aims to serve as a valuable resource for researchers and practitioners interested in developing and deploying small yet efficient language models.
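To make the compression techniques named in the abstract concrete, here is a minimal illustrative sketch (not taken from the survey; all function names are hypothetical) of two of the methods the taxonomy covers: magnitude-based weight pruning and symmetric 8-bit post-training quantization of a weight matrix.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map float weights onto int8 in [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)

# Quantize then dequantize: the round-trip error is bounded by half a scale step.
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max quantization error:", np.max(np.abs(w - w_hat)))

# Prune half the weights by magnitude: half the entries become exactly zero.
w_sparse = prune_by_magnitude(w, sparsity=0.5)
print("fraction of zeros:", np.mean(w_sparse == 0.0))
```

Real deployments use more refined variants (per-channel scales, structured pruning, quantization-aware training), but the core idea is the same: trade a small amount of accuracy for large savings in memory and compute.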