
Fast and Simplex: 2-Simplicial Attention in Triton

Aurko Roy, Timothy Chou, Sai Surya Duvvuri, Sijia Chen, Jiecao Yu, Xiaodong Wang, Manzil Zaheer, Rohan Anil

2025-07-04


Summary

This paper introduces the 2-simplicial Transformer, a variant of the standard Transformer that changes how attention works: instead of scoring pairs of tokens, it scores triplets of tokens. Attending to triplets lets the model capture more complex relationships and reason more effectively.

What's the problem?

Standard Transformers compute attention only between pairs of tokens. This pairwise structure can limit their ability to capture the deeper, higher-order connections needed for reasoning and knowledge-heavy tasks, and the limitation becomes more costly when the model must work within a fixed token budget.

What's the solution?

The researchers created the 2-simplicial Transformer, which uses a mathematical idea called '2-simplicial attention' to score interactions among three tokens at once rather than two. Because scoring every possible triplet is far more expensive than scoring pairs, they keep computation tractable with techniques such as localized (windowed) attention and efficient Triton GPU kernels, allowing the model to learn higher-order relationships at a manageable cost.
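To make the idea concrete, here is a minimal, deliberately naive NumPy sketch of trilinear attention over token triplets. It is illustrative only: the function name and the use of two key and two value projections are assumptions for this sketch, the full triplet loop is O(n^3), and the paper's actual implementation uses optimized Triton kernels with locality restrictions to avoid exactly this cost.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a flat array."""
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def two_simplicial_attention(Q, K1, K2, V1, V2):
    """Naive 2-simplicial attention sketch (hypothetical helper, not the
    paper's kernel). Each query i attends to *pairs* of tokens (j, k)
    via a trilinear score:
        s[i, j, k] = sum_d Q[i, d] * K1[j, d] * K2[k, d] / sqrt(d)
    and the output mixes element-wise products of two value sets."""
    n, d = Q.shape
    # Trilinear logits over all token pairs (j, k) -- O(n^3) memory/compute
    scores = np.einsum('id,jd,kd->ijk', Q, K1, K2) / d ** 0.5
    out = np.zeros_like(Q)
    for i in range(n):
        # Softmax jointly over all (j, k) pairs for query i
        w = softmax(scores[i].ravel()).reshape(n, n)
        # Weighted sum of element-wise value products V1[j] * V2[k]
        out[i] = np.einsum('jk,jd,kd->d', w, V1, V2)
    return out
```

The pairwise case of a standard Transformer is recovered in spirit when the second key/value set collapses to a single trivial token; the triplet form is what lets a query condition on two context tokens jointly rather than independently.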

Why it matters?

This matters because it makes AI models better at thinking through complex ideas and solving problems more efficiently, especially when working with limited data. It is useful for improving large language models in fields that need strong reasoning and knowledge processing.

Abstract

The 2-simplicial Transformer improves token efficiency over standard Transformers, offering better performance on knowledge and reasoning tasks with a fixed token budget.