Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

Yun He, Wenzhe Li, Hejia Zhang, Songlin Li, Karishma Mandyam, Sopan Khosla, Yuanhao Xiong, Nanshu Wang, Selina Peng, Beibin Li, Shengjie Bi, Shishir G. Patil, Qi Qi, Shengyu Feng, Julian Katz-Samuels, Richard Yuanzhe Pang, Sujan Gonugondla, Hunter Lang, Yue Yu, Yundi Qian, Maryam Fazel-Zarandi, Licheng Yu

2025-11-14

Summary

This paper focuses on making large language models, like those powering chatbots, better at following complicated instructions, especially those that involve back-and-forth conversation and specific system guidelines.

What's the problem?

Currently, it's hard to rigorously test how well these models follow complex instructions because there are few high-quality, human-created benchmarks for this skill. It's also difficult to give the models clear feedback on *how* to improve their instruction following: what separates a good response from a bad one isn't easy to quantify into a training signal.

What's the solution?

The researchers created a new benchmark called AdvancedIF, which includes over 1,600 challenging prompts and detailed scoring rubrics written by experts. They also developed a training method called RIFL (Rubric-based Instruction-Following Learning), which uses these rubrics, together with a finetuned rubric verifier, to automatically grade the model's responses and improve its instruction following through reinforcement learning. RIFL essentially teaches the model what a 'good' response looks like, criterion by criterion, based on the rubrics; a rough sketch of the idea follows below.
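To make the rubric-scoring idea concrete, here is a minimal, hypothetical Python sketch. The names (`RubricCriterion`, `rubric_score`) and the hand-written callable checks are illustrative assumptions only; in RIFL itself, each rubric criterion is judged by a finetuned LLM verifier rather than a simple rule.

```python
# Hypothetical sketch of rubric-based scoring, in the spirit of what the summary
# describes. In RIFL the per-criterion check is a finetuned LLM verifier; here each
# criterion carries a simple callable so the example runs end to end.

from dataclasses import dataclass
from typing import Callable


@dataclass
class RubricCriterion:
    description: str
    check: Callable[[str], bool]  # stand-in for the finetuned rubric verifier
    weight: float = 1.0


def rubric_score(response: str, rubric: list[RubricCriterion]) -> float:
    """Weighted fraction of rubric criteria the response satisfies (0.0 to 1.0)."""
    total = sum(c.weight for c in rubric)
    passed = sum(c.weight for c in rubric if c.check(response))
    return passed / total if total else 0.0


# Toy usage: two criteria for a prompt that asks for a short answer in English only.
rubric = [
    RubricCriterion("Response is at most 50 words",
                    lambda r: len(r.split()) <= 50),
    RubricCriterion("Response does not use the word 'bonjour'",
                    lambda r: "bonjour" not in r.lower(), weight=2.0),
]
print(rubric_score("Hello! Here is a short answer.", rubric))  # -> 1.0
```

The per-criterion verdicts are what make the signal interpretable: the model isn't just told a response scored 0.6, it can be trained against which specific requirements it missed.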

Why it matters?

This work is important because it provides both a better way to measure how well AI models follow instructions and a more effective way to train them to do so. By using rubrics, the researchers have created a more reliable and understandable method for improving AI, which could lead to more helpful and trustworthy AI systems in the future.

Abstract

Recent progress in large language models (LLMs) has led to impressive performance on a range of tasks, yet advanced instruction following (IF), especially for complex, multi-turn, and system-prompted instructions, remains a significant challenge. Rigorous evaluation and effective training for such capabilities are hindered by the lack of high-quality, human-annotated benchmarks and reliable, interpretable reward signals. In this work, we introduce AdvancedIF (we will release this benchmark soon), a comprehensive benchmark featuring over 1,600 prompts and expert-curated rubrics that assess LLMs' ability to follow complex, multi-turn, and system-level instructions. We further propose RIFL (Rubric-based Instruction-Following Learning), a novel post-training pipeline that leverages rubric generation, a finetuned rubric verifier, and reward shaping to enable effective reinforcement learning for instruction following. Extensive experiments demonstrate that RIFL substantially improves the instruction-following abilities of LLMs, achieving a 6.7% absolute gain on AdvancedIF and strong results on public benchmarks. Our ablation studies confirm the effectiveness of each component in RIFL. This work establishes rubrics as a powerful tool for both training and evaluating advanced IF in LLMs, paving the way for more capable and reliable AI systems.
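As a rough illustration of the reward-shaping step mentioned in the abstract, the snippet below turns per-rubric verdicts into a single scalar reward for reinforcement learning. The exact shaping used by RIFL is not described here, so the full-pass bonus and the function name `shaped_reward` are assumptions for illustration only.

```python
# Hedged sketch: shaping per-rubric verdicts into a scalar RL reward.
# The bonus for satisfying every criterion is an illustrative assumption,
# not the shaping scheme used in the paper.

def shaped_reward(verdicts: list[bool], full_pass_bonus: float = 0.5) -> float:
    """Dense reward: fraction of rubric items passed, plus a bonus for a full pass."""
    if not verdicts:
        return 0.0
    frac = sum(verdicts) / len(verdicts)
    return frac + (full_pass_bonus if all(verdicts) else 0.0)


# In an RLHF-style loop, this scalar would be fed to a policy-gradient update
# (e.g. PPO or GRPO) for each sampled response.
print(shaped_reward([True, True, False]))  # -> 0.666...
print(shaped_reward([True, True, True]))   # -> 1.5
```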