GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers

Sarkar Snigdha Sarathi Das, Ryo Kamoi, Bo Pang, Yusen Zhang, Caiming Xiong, Rui Zhang

2024-12-16

Summary

This paper introduces GReaTer, a new method that helps smaller language models optimize their prompts more effectively by using gradient information from task-specific reasoning.

What's the problem?

Existing prompt-optimization methods rely on textual feedback from large, computationally expensive language models, because smaller models struggle to produce high-quality feedback on their own. And because these methods operate purely in text space, they miss more direct, finer-grained signals, such as gradients, that could guide prompt design.

What's the solution?

GReaTer optimizes prompts directly with gradients (signals that show how changes in the input affect the task loss), computed over the model's own task-specific reasoning. By following these task-loss gradients, small open-source models can refine their own prompts without relying on a large, closed-source LLM, improving performance across diverse tasks while staying efficient.
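
To make the idea concrete, here is a minimal sketch of gradient-guided prompt-token scoring in PyTorch with Hugging Face Transformers. This is not the authors' implementation (their code is linked in the abstract below): the model name, toy arithmetic example, and single-example loss are placeholder assumptions, and the sketch omits the reasoning-generation step that GReaTer computes gradients over. It only illustrates the core mechanic of ranking candidate prompt tokens by the gradient of the task loss.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; GReaTer targets lightweight open models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Solve the problem step by step."   # current prompt being optimized
task_input = " Q: 12 * 7 = ? A:"             # toy task example (hypothetical)
target = " 84"                               # desired answer

prompt_ids = tok(prompt, return_tensors="pt").input_ids
task_ids = tok(task_input, return_tensors="pt").input_ids
target_ids = tok(target, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, task_ids, target_ids], dim=1)

# Compute the loss only on the target tokens (-100 is the ignore index).
labels = input_ids.clone()
labels[:, : input_ids.shape[1] - target_ids.shape[1]] = -100

# Embed the tokens manually so gradients can flow to the prompt embeddings.
embed = model.get_input_embeddings()
inputs_embeds = embed(input_ids).detach().requires_grad_(True)

loss = model(inputs_embeds=inputs_embeds, labels=labels).loss
loss.backward()

# Gradient of the task loss with respect to each prompt-token embedding.
prompt_len = prompt_ids.shape[1]
grad = inputs_embeds.grad[0, :prompt_len]        # (prompt_len, hidden_size)

# First-order score for swapping in any vocabulary token at each prompt
# position: a higher score predicts a larger reduction in the task loss.
scores = -grad @ embed.weight.T                  # (prompt_len, vocab_size)
top_candidates = scores.topk(5, dim=-1).indices

for pos in range(prompt_len):
    print(tok.decode(prompt_ids[0, pos]), "->",
          [tok.decode(t) for t in top_candidates[pos]])

In the paper's setting, this kind of gradient signal is taken over the model's generated reasoning for a batch of task examples rather than a single answer, so the prompt is refined based on how it steers the reasoning process itself.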

Why it matters?

This research is important because it enables smaller language models to perform at levels comparable to larger ones without the same computational costs. By improving how these models generate and refine prompts, GReaTer can make AI technologies more accessible and effective for a wider range of applications, from education to customer service.

Abstract

The effectiveness of large language models (LLMs) is closely tied to the design of prompts, making prompt optimization essential for enhancing their performance across a wide range of tasks. Many existing approaches to automating prompt engineering rely exclusively on textual feedback, refining prompts based solely on inference errors identified by large, computationally expensive LLMs. Unfortunately, smaller models struggle to generate high-quality feedback, resulting in complete dependence on large LLM judgment. Moreover, these methods fail to leverage more direct and finer-grained information, such as gradients, due to operating purely in text space. To this end, we introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning. By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models without the need for costly closed-source LLMs. This allows high-performance prompt optimization without dependence on massive LLMs, closing the gap between smaller models and the sophisticated reasoning often needed for prompt refinement. Extensive evaluations across diverse reasoning tasks including BBH, GSM8k, and FOLIO demonstrate that GReaTer consistently outperforms previous state-of-the-art prompt optimization methods, even those reliant on powerful LLMs. Additionally, GReaTer-optimized prompts frequently exhibit better transferability and, in some cases, boost task performance to levels comparable to or surpassing those achieved by larger language models, highlighting the effectiveness of prompt optimization guided by gradients over reasoning. Code of GReaTer is available at https://github.com/psunlpgroup/GreaTer.