
The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Shashata Sawmya, Micah Adler, Nir Shavit

2025-05-27


Summary

This paper examines how large language models develop and organize their understanding of ideas and concepts as they grow bigger and more complex, and how these internal features can be tracked over training time, across the model's layers, and across different model sizes.

What's the problem?

The problem is that it's really hard to see exactly how language models learn and represent knowledge inside their networks, especially since these models get more complicated as they grow. Without understanding this, it's tough to know how or why the models make certain decisions.

What's the solution?

The authors used a tool called a sparse autoencoder to spot and analyze clear, understandable features inside the models. They studied how these features, which represent different concepts or categories, appear and change as the model trains, grows in size, and processes information through its layers. They also found that some features reappear at different layers of the model, a phenomenon the authors call spatial feature reactivation.
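To make the method concrete, here is a minimal sketch of the core idea behind a sparse autoencoder: activations from a language model are reconstructed through an overcomplete hidden layer, with an L1 penalty that pushes most hidden units to zero so each active unit tends to capture one interpretable concept. All sizes, initializations, and hyperparameters below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: the hidden layer is overcomplete (4x the input),
# which gives the autoencoder room to assign one unit per concept.
d_model, d_hidden = 16, 64
W_enc = rng.normal(0, 0.1, (d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(0, 0.1, (d_hidden, d_model))

def encode(x):
    # ReLU keeps hidden codes non-negative; combined with the L1
    # penalty below, most units stay at exactly zero for any input.
    return np.maximum(0.0, x @ W_enc + b_enc)

def decode(h):
    # Reconstruct the original activation from the sparse code.
    return h @ W_dec

def loss(x, l1=1e-3):
    # Training objective: reconstruction error plus a sparsity penalty.
    h = encode(x)
    recon = decode(h)
    return np.mean((x - recon) ** 2) + l1 * np.mean(np.abs(h))
```

In practice, a model like this would be trained with gradient descent on activations collected from a specific layer of the language model; the researcher then inspects which inputs most strongly activate each hidden unit to label it with a human-readable concept.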

Why it matters?

This is important because it helps researchers better understand how AI models actually learn and organize knowledge, which can lead to building smarter, more transparent, and more reliable language models in the future.

Abstract

The study examines interpretable categorical features in large language models, using sparse autoencoders to identify how semantic concepts emerge over training time, across layers, and across model sizes, revealing spatial reactivation of features.