Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang

2025-02-21

Summary

This paper describes a discovery called Temporal Heads in AI language models. These are specific attention heads, parts of the model's internal machinery, that help it recall information tied to time, like historical facts or events that happened in specific years.

What's the problem?

AI language models are good at remembering facts, but we don't really know how they handle information that changes over time. It's like having a smart friend who knows a lot of trivia, but we're not sure how they keep track of which facts belong to which time periods.

What's the solution?

Using a technique called circuit analysis, the researchers identified specific attention heads in the AI, which they call Temporal Heads, that are responsible for processing time-related information. These heads appear across different AI models, though their exact locations vary from model to model. Disabling them degrades the model's recall of time-specific facts while leaving its general abilities intact. Temporal Heads respond not only to specific years like '2004' but also to phrases like 'In the year...'. The researchers also showed that they could edit the AI's knowledge about time by adjusting the values these heads produce.
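The head-disabling idea can be illustrated with a toy sketch. This is not the paper's actual method or model; it is a minimal pure-Python multi-head attention where ablating a head simply removes its contribution from the output (names like `head_scales` and the per-head scaling used in place of real projection matrices are invented for illustration):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def head_output(query, keys, values):
    # Scaled dot-product attention for a single head.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, k_vec)) / math.sqrt(d)
              for k_vec in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def multi_head(query, keys, values, head_scales, ablated=frozenset()):
    # Sum the per-head outputs; heads in `ablated` are knocked out,
    # i.e. their contribution is removed entirely.
    dim = len(values[0])
    out = [0.0] * dim
    for h, scale in enumerate(head_scales):
        if h in ablated:
            continue  # knockout: this head contributes nothing
        # Per-head scaling stands in for a real per-head query projection.
        projected_q = [scale * q for q in query]
        contrib = head_output(projected_q, keys, values)
        out = [o + c for o, c in zip(out, contrib)]
    return out

# Toy sequence of three tokens with made-up key/value vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.2]

full = multi_head(query, keys, values, head_scales=[1.0, 4.0])
no_head1 = multi_head(query, keys, values, head_scales=[1.0, 4.0], ablated={1})
print(full, no_head1)  # the ablated run lacks head 1's contribution
```

In the paper, the analogous intervention on a real transformer showed that knocking out Temporal Heads specifically hurts recall of time-dependent facts.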

Why it matters?

This matters because it helps us understand how AI thinks about time, which is crucial for making AI that can accurately discuss historical events or keep up with changing information. It could lead to AI assistants that are better at answering questions about different time periods or updating their knowledge without needing to be completely retrained. This discovery opens up new ways to improve AI's understanding of time and make it more useful for tasks that involve historical or time-sensitive information.

Abstract

While the ability of language models to elicit facts has been widely investigated, how they handle temporally changing facts remains underexplored. We discover Temporal Heads, specific attention heads primarily responsible for processing temporal knowledge through circuit analysis. We confirm that these heads are present across multiple models, though their specific locations may vary, and their responses differ depending on the type of knowledge and its corresponding years. Disabling these heads degrades the model's ability to recall time-specific knowledge while maintaining its general capabilities without compromising time-invariant and question-answering performances. Moreover, the heads are activated not only by numeric conditions ("In 2004") but also by textual aliases ("In the year ..."), indicating that they encode a temporal dimension beyond simple numerical representation. Furthermore, we expand the potential of our findings by demonstrating how temporal knowledge can be edited by adjusting the values of these heads.
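The abstract's editing result can be caricatured with a tiny residual-stream model. Everything here is invented for illustration (the fact names, vectors, and the `alpha` scaling knob); the paper performs the analogous intervention on real transformer activations:

```python
# A "residual stream" vector accumulates contributions from model components.
# Here a hypothetical temporal head writes a direction that marks "2004";
# scaling that write changes which year-specific fact gets recalled.

facts = {
    "capital_1999": [1.0, 0.0],  # made-up vector for the 1999-era fact
    "capital_2004": [0.0, 1.0],  # made-up vector for the 2004-era fact
}

def nearest_fact(stream, fact_vectors):
    # Recall = the fact whose vector best aligns with the stream (dot product).
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(fact_vectors, key=lambda name: dot(stream, fact_vectors[name]))

base_stream = [0.6, 0.0]      # prompt context alone leans toward the 1999 fact
temporal_write = [0.0, 0.8]   # what the temporal head adds for "In 2004"

def recall(alpha):
    # alpha scales the temporal head's contribution:
    # alpha=0 knocks the head out, alpha=1 is normal operation,
    # other values act as an "edit" of the head's output.
    stream = [b + alpha * t for b, t in zip(base_stream, temporal_write)]
    return nearest_fact(stream, facts)

print(recall(0.0))  # capital_1999 -- with the head removed, the year is lost
print(recall(1.0))  # capital_2004 -- the head's write steers recall to 2004
```

The toy makes the abstract's two claims concrete in miniature: removing the head's contribution loses the time-specific fact, and adjusting the head's output value steers which fact is recalled.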