Evaluating the role of 'Constitutions' for learning from AI feedback
Saskia Redgate, Andrew M. Bean, Adam Mahdi
2024-11-19

Summary
This paper explores how different sets of written guidelines, called 'constitutions', affect the quality of feedback that AI models provide when they are used to train other AI models, using patient-centered communication in medical interviews as a test case.
What's the problem?
As large language models (LLMs) become more advanced, they are often used in place of human feedback for training and assessing other models. This feedback is typically guided by a written 'constitution', but its effectiveness can vary greatly depending on the guidelines chosen, and it is not well understood how the choice of constitution affects feedback quality. This matters especially in sensitive areas like patient communication, where poor feedback can translate into poor performance.
What's the solution?
The authors tested four different constitutions as the basis for AI feedback aimed at improving patient-centered communication in medical interviews. In pairwise comparisons judged by 215 human raters, more detailed constitutions led to better results on emotive qualities, but none of the constitutions outperformed the baseline on the more practical skills of gathering and providing information. This suggests that while detailed guidelines are worth prioritising, AI feedback has limits as a training signal in some areas.
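As a small aside on the evaluation setup, the sketch below shows one simple way pairwise human judgements of this kind could be aggregated into per-constitution win rates. It is an illustration only: the constitution names, the data, and the aggregation choice are placeholders, not the authors' analysis code or results.

```python
# Hedged sketch (not the study's analysis code): turning pairwise human
# judgements into a win rate per constitution.
from collections import Counter

# Each record: (constitution_a, constitution_b, winner), winner in {"a", "b"}.
# Placeholder data for illustration only.
comparisons = [
    ("detailed", "baseline", "a"),
    ("detailed", "baseline", "b"),
    ("brief", "baseline", "a"),
]

wins, totals = Counter(), Counter()
for a, b, winner in comparisons:
    totals[a] += 1
    totals[b] += 1
    wins[a if winner == "a" else b] += 1

for name in totals:
    print(f"{name}: {wins[name] / totals[name]:.2f} win rate")
```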
Why it matters?
This research is significant because it highlights the importance of having well-defined guidelines for AI feedback, especially in sensitive fields like medicine. By understanding how different constitutions influence AI performance, developers can create better training systems that improve communication and ensure that AI-generated responses are both helpful and appropriate.
Abstract
The growing capabilities of large language models (LLMs) have led to their use as substitutes for human feedback for training and assessing other LLMs. These methods often rely on 'constitutions', written guidelines which a critic model uses to provide feedback and improve generations. We investigate how the choice of constitution affects feedback quality by using four different constitutions to improve patient-centered communication in medical interviews. In pairwise comparisons conducted by 215 human raters, we found that detailed constitutions led to better results regarding emotive qualities. However, none of the constitutions outperformed the baseline in learning more practically-oriented skills related to information gathering and provision. Our findings indicate that while detailed constitutions should be prioritised, there are possible limitations to the effectiveness of AI feedback as a reward signal in certain areas.
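To make the feedback mechanism described in the abstract concrete, here is a minimal sketch of a constitution-guided critique-and-revision step. It assumes a hypothetical llm() helper standing in for any chat-model call, and the constitution principles shown are illustrative, not the ones studied in the paper.

```python
# Minimal sketch of constitution-guided AI feedback (illustrative only, not the
# authors' implementation): a critic model reads a constitution, critiques a
# draft reply, and the draft is revised against that critique.

def llm(prompt: str) -> str:
    """Placeholder for a call to any chat/completion model (hypothetical)."""
    raise NotImplementedError

# Illustrative principles; the paper's actual constitutions differ.
CONSTITUTION = (
    "1. Acknowledge the patient's concerns before giving information.\n"
    "2. Use open-ended questions when gathering history.\n"
    "3. Explain findings in plain, non-technical language."
)

def improve(draft_reply: str, transcript: str) -> str:
    """One critique-and-revise step for a draft reply in a medical interview."""
    critique = llm(
        f"Constitution:\n{CONSTITUTION}\n\n"
        f"Interview so far:\n{transcript}\n\n"
        f"Draft reply:\n{draft_reply}\n\n"
        "List the ways the draft reply violates the constitution."
    )
    return llm(
        "Rewrite the draft reply so that it follows the constitution.\n"
        f"Draft reply:\n{draft_reply}\n\nCritique:\n{critique}"
    )
```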