Direct Preference Knowledge Distillation for Large Language Models
Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei
2024-07-01

Summary
This paper introduces Direct Preference Knowledge Distillation (DPKD), a new method for improving how large language models (LLMs) learn from each other. It focuses on transferring knowledge from a more capable 'teacher' model to a smaller 'student' model more effectively.
What's the problem?
Knowledge Distillation (KD) is a common technique for helping smaller models learn from larger, more capable ones. However, traditional KD methods have limitations: they can be inefficient, and the standard KL divergence objective does not fully capture how far the student's outputs are from the teacher's. This makes it hard for the student model to learn effectively from the teacher model, leading to poorer performance.
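For context, the baseline being criticized here is the standard white-box KD objective, which trains the student to match the teacher's token distribution under a forward KL divergence. Below is a minimal sketch of that loss; the tensor shapes and the temperature value are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def forward_kl_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Forward KL(p_teacher || q_student), the standard soft-label KD objective."""
    log_q = F.log_softmax(student_logits / temperature, dim=-1)  # student log-probs
    p = F.softmax(teacher_logits / temperature, dim=-1)          # teacher probs
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(log_q, p, reduction="batchmean") * temperature**2

# Toy usage: batch of 4 sequences, 16 tokens each, vocabulary of 100.
student_logits = torch.randn(4, 16, 100)
teacher_logits = torch.randn(4, 16, 100)
print(forward_kl_kd_loss(student_logits, teacher_logits))
```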
What's the solution?
To solve these issues, the authors developed DPKD, which introduces a new way to measure how well the student model is learning from the teacher. They use distribution divergence to represent a preference loss and an implicit reward function, capturing how much the student's outputs differ from the outputs the teacher prefers. DPKD works in two stages: first, it optimizes an objective consisting of the implicit reward and reverse KL divergence; then, it increases the preference probability of the teacher's outputs over the student's outputs. The authors tested DPKD on various datasets with model sizes ranging from 120M to 13B parameters and found that it significantly improved the performance of the student models compared to traditional methods.
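The sketch below illustrates, under assumptions, how the two ingredients described above could fit together: a reverse KL term between the student and teacher distributions, plus a DPO-style preference loss that raises the preference probability of teacher outputs over student outputs through an implicit reward. The names `beta` and `pi_ref` (a frozen reference model), and the way the two terms are combined, are assumptions for illustration; the exact DPKD objective is defined in the paper.

```python
import torch
import torch.nn.functional as F

def reverse_kl(student_logits, teacher_logits):
    """Reverse KL(q_student || p_teacher), averaged over positions."""
    log_q = F.log_softmax(student_logits, dim=-1)
    log_p = F.log_softmax(teacher_logits, dim=-1)
    return (log_q.exp() * (log_q - log_p)).sum(-1).mean()

def preference_loss(logp_student_y_t, logp_ref_y_t,
                    logp_student_y_s, logp_ref_y_s, beta=0.1):
    """Raise the preference probability of teacher outputs y_t over student
    outputs y_s, using an assumed DPO-style implicit reward
    r(x, y) = beta * log(pi_student(y|x) / pi_ref(y|x))."""
    reward_y_t = beta * (logp_student_y_t - logp_ref_y_t)
    reward_y_s = beta * (logp_student_y_s - logp_ref_y_s)
    return -F.logsigmoid(reward_y_t - reward_y_s).mean()

# Toy usage: random tensors stand in for real model outputs.
student_logits = torch.randn(2, 8, 50)   # (batch, sequence, vocab)
teacher_logits = torch.randn(2, 8, 50)
loss = reverse_kl(student_logits, teacher_logits) + preference_loss(
    torch.randn(2), torch.randn(2), torch.randn(2), torch.randn(2))
print(loss)
```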
Why it matters?
This research is important because it enhances the process of training smaller language models by making them learn more effectively from larger models. By improving how knowledge is transferred between models, DPKD can lead to better performance in applications where smaller models are used, such as in mobile devices or other environments where computational resources are limited. This could ultimately make AI systems more efficient and accessible.
Abstract
In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in the distillation of LLMs, including efficiency and the insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward function, which we define as a supplement to KL divergence. In this work, we propose Direct Preference Knowledge Distillation (DPKD) for LLMs. DPKD utilizes distribution divergence to represent the preference loss and implicit reward function. We re-formulate KD of LLMs into two stages: first optimizing an objective consisting of implicit reward and reverse KL divergence, and then improving the preference probability of teacher outputs over student outputs. We conducted experiments and analysis on various datasets with LLM parameters ranging from 120M to 13B and demonstrated the broad applicability and effectiveness of our DPKD approach. Meanwhile, we prove the value and effectiveness of the introduced implicit reward and output preference in KD through experiments and theoretical analysis. The DPKD method outperforms the baseline method in both output response precision and exact match percentage. Code and data are available at https://aka.ms/dpkd.
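For a concrete picture of the "preference probability of teacher outputs over student outputs" mentioned in the abstract, the following is a minimal Bradley-Terry-style sketch with an implicit reward derived from model log-likelihoods. The symbols beta, pi_theta (the student), and pi_ref (a reference model) are assumptions for illustration; the paper's exact definitions may differ.

```latex
% Hypothetical Bradley-Terry-style preference of teacher output y_t over student output y_s,
% with an implicit reward defined from a log-likelihood ratio (illustrative, not the paper's exact form).
p(y_t \succ y_s \mid x) = \sigma\!\big(r(x, y_t) - r(x, y_s)\big),
\qquad
r(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)}
```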