RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services
Fei Zhao, Chonggang Lu, Haofu Qian, Fangcheng Shi, Zijie Meng, Jianzhao Huang, Xu Tang, Zheyong Xie, Zheyu Ye, Zhe Xu, Yao Hu, Shaosheng Cao
2025-11-11
Summary
This paper introduces RedOne 2.0, a new large language model specifically designed to work well with social media data, like posts and comments.
What's the problem?
Large language models often struggle with the unique characteristics of social media. Social media text constantly changes with new slang, spans many languages, and reads very differently from formal writing. Simply training a model on social media data (supervised fine-tuning) can make it *better* at understanding social media but *worse* at everything else, especially if the model isn't very large to begin with. It's a balancing act, and it's hard to get right.
What's the solution?
The researchers developed RedOne 2.0 using a three-step process. First, they let the model explore a lot of social media data to get a basic understanding of how people communicate online and identify where it struggles. Second, they specifically trained the model on the areas where it was weak, but also included a little bit of general data to prevent it from 'forgetting' what it already knew. Finally, they used a reinforcement learning approach, giving the model feedback based on how well it performs on social media tasks, to refine its abilities and make sure everything works together smoothly. This approach is designed to adapt quickly and reliably.
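The data-mixing idea in the second step can be sketched as follows. This is a minimal illustration only: the function names and the 10% general-data ratio are assumptions for the example, not values reported in the paper.

```python
import random

def build_targeted_sft_mix(sns_gap_samples, general_samples,
                           general_fraction=0.1, seed=0):
    """Combine diagnosed-gap SNS samples with a small slice of general
    data so the fine-tuned model does not 'forget' general skills.

    `general_fraction` is the share of general data in the final mix
    (an illustrative default, not the paper's setting).
    """
    rng = random.Random(seed)
    # Number of general samples needed so they make up `general_fraction`
    # of the combined dataset.
    n_general = int(len(sns_gap_samples) * general_fraction / (1 - general_fraction))
    n_general = min(n_general, len(general_samples))
    mix = list(sns_gap_samples) + rng.sample(general_samples, n_general)
    rng.shuffle(mix)
    return mix

# Example: 90 SNS gap samples + 10% general data -> 100-sample mix.
mix = build_targeted_sft_mix([f"sns_{i}" for i in range(90)],
                             [f"gen_{i}" for i in range(50)])
```

In a real pipeline the samples would be instruction-response records and the ratio would be tuned; the point is only that a small, controlled fraction of general data is blended back in during targeted fine-tuning.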
Why it matters?
RedOne 2.0 is important because it shows you can build a powerful language model for social media that doesn't require massive amounts of data or computing power. It performs well compared to larger models and is more stable during training. This means it's a more practical and cost-effective way to create AI that understands and interacts with social media content, which is increasingly important for many applications.
Abstract
As a key medium for human interaction and information exchange, social networking services (SNS) pose unique challenges for large language models (LLMs): heterogeneous workloads, fast-shifting norms and slang, and multilingual, culturally diverse corpora that induce sharp distribution shift. Supervised fine-tuning (SFT) can specialize models but often triggers a "seesaw" between in-distribution gains and out-of-distribution robustness, especially for smaller models. To address these challenges, we introduce RedOne 2.0, an SNS-oriented LLM trained with a progressive, RL-prioritized post-training paradigm designed for rapid and stable adaptation. The pipeline consists of three stages: (1) Exploratory Learning on curated SNS corpora to establish initial alignment and identify systematic weaknesses; (2) Targeted Fine-Tuning that selectively applies SFT to the diagnosed gaps while mixing in a small fraction of general data to mitigate forgetting; and (3) Refinement Learning that re-applies RL with SNS-centric signals to consolidate improvements and harmonize trade-offs across tasks. Across various tasks spanning three categories, our 4B-scale model delivers an average improvement of about 2.41 over the sub-optimal 7B baseline. Additionally, RedOne 2.0 achieves an average performance lift of about 8.74 over the base model with less than half the data required by the SFT-centric method RedOne, evidencing superior data efficiency and stability at compact scales. Overall, RedOne 2.0 establishes a competitive, cost-effective baseline for domain-specific LLMs in SNS scenarios, advancing capability without sacrificing robustness.