LLMs Can Get "Brain Rot"!
Shuo Xing, Junyuan Hong, Yifan Wang, Runjin Chen, Zhenyu Zhang, Ananth Grama, Zhengzhong Tu, Zhangyang Wang
2025-10-17
Summary
This research investigates whether feeding large language models (LLMs) a lot of low-quality information from the internet, specifically from platforms like Twitter/X, can actually make them 'dumber' over time – a phenomenon the authors call 'Brain Rot'.
What's the problem?
LLMs are constantly being updated with new data from the internet to improve their performance. However, a lot of this data is essentially 'junk' – poorly written, nonsensical, or even harmful content. The question is whether continuously training LLMs on this kind of data degrades their abilities, like reasoning, understanding complex information, and avoiding harmful responses. It wasn't clear if the quality of the data used to train these models actually *caused* a decline in their performance, or if other factors were at play.
What's the solution?
The researchers conducted a series of controlled experiments. They created different datasets from Twitter/X: some with high-quality content and others filled with 'junk', defined in two independent ways: by how much engagement the posts received (M1) and by how well-written and substantive they were (M2). They then continually trained several LLMs on these datasets, carefully controlling for other variables such as the amount of data and the training procedure itself. By comparing the performance of models trained on junk data against models trained on good data, they could determine whether the junk data actually caused a decline in abilities. They also tested whether mixing good data back in could repair the damage.
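The M1 "engagement degree" split described above can be approximated with a simple ranking heuristic: posts that are short but widely engaged with go to the junk bucket, and the opposite extreme goes to the control bucket. This is an illustrative sketch only, not the authors' code; the field names (`likes`, `retweets`, `replies`, `text`), the scoring formula, and the cutoff fraction are all hypothetical.

```python
def split_by_engagement(posts, junk_fraction=0.2):
    """Partition posts into (junk, control) by an M1-style heuristic:
    junk = short posts with high engagement; control = the opposite extreme.
    All field names and the scoring rule are illustrative assumptions."""
    def engagement(p):
        return p["likes"] + p["retweets"] + p["replies"]

    # Score each post: high engagement relative to text length pushes a
    # post toward "junk"; long, low-engagement posts sink toward "control".
    scored = sorted(
        posts,
        key=lambda p: engagement(p) / max(len(p["text"]), 1),
        reverse=True,
    )
    k = int(len(posts) * junk_fraction)
    junk, control = scored[:k], scored[-k:]
    return junk, control
```

Posts in the middle of the ranking are discarded, which mirrors the idea of contrasting clearly-junk data with a clearly-clean control set rather than a noisy middle ground.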
Why it matters?
This research is important because it shows that the quality of data used to train LLMs is a critical factor in their performance and safety. It suggests that simply feeding LLMs more and more data isn't enough: the data needs to be carefully curated. Developers therefore need to treat data quality as a safety issue and regularly check the 'cognitive health' of deployed LLMs to make sure they aren't becoming less capable or more prone to harmful outputs. It also highlights that online popularity doesn't equate to quality; in fact, a post's popularity was a key predictor of this decline.
Abstract
We propose and test the LLM Brain Rot Hypothesis: continual exposure to junk web text induces lasting cognitive decline in large language models (LLMs). To causally isolate data quality, we run controlled experiments on real Twitter/X corpora, constructing junk and reversely controlled datasets via two orthogonal operationalizations: M1 (engagement degree) and M2 (semantic quality), with matched token scale and training operations across conditions. Relative to the control group, continual pre-training of 4 LLMs on the junk dataset causes non-trivial declines (Hedges' g > 0.3) in reasoning, long-context understanding, and safety, and inflates "dark traits" (e.g., psychopathy, narcissism). Gradual mixtures of junk and control data also yield a dose-response cognitive decay: for example, under M1, ARC-Challenge with Chain of Thought drops 74.9 → 57.2 and RULER-CWE 84.4 → 52.3 as the junk ratio rises from 0% to 100%. Error forensics reveal several key insights. First, we identify thought-skipping as the primary lesion: models increasingly truncate or skip reasoning chains, which explains most of the error growth. Second, we observe partial but incomplete healing: scaling up instruction tuning and clean-data pre-training improves the declined cognition yet cannot restore baseline capability, suggesting persistent representational drift rather than a format mismatch. Finally, we find that a tweet's popularity, a non-semantic metric, is a better indicator of the Brain Rot effect than its length in M1. Together, the results provide significant, multi-perspective evidence that data quality is a causal driver of LLM capability decay, reframing curation for continual pre-training as a training-time safety problem and motivating routine "cognitive health checks" for deployed LLMs.
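The abstract reports effect sizes as Hedges' g > 0.3. For readers unfamiliar with the metric: Hedges' g is the standardized mean difference between two groups (Cohen's d) multiplied by a small-sample correction factor. A minimal self-contained implementation of the standard formula (not the authors' evaluation code):

```python
import math

def hedges_g(group1, group2):
    """Hedges' g: standardized mean difference with small-sample correction.
    g = d * J, where d = (mean1 - mean2) / pooled_sd and
    J = 1 - 3 / (4 * (n1 + n2) - 9)."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Unbiased (n-1) sample variances for each group
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    # Pooled standard deviation across both groups
    s_pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / s_pooled           # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)    # small-sample correction factor J
    return d * j
```

By convention, |g| around 0.2 is a small effect and 0.5 a medium one, so the paper's g > 0.3 declines are non-trivial shifts in benchmark score distributions.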