Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming
Vernon Toh Yan Han, Rishabh Bhardwaj, Soujanya Poria
2024-06-24

Summary
This paper introduces Ruby Teaming, a method that extends an existing technique called Rainbow Teaming with a memory component. The memory helps the method generate higher-quality prompts for automated red teaming, in which a system is tested by simulating attacks against it.
What's the problem?
In automated red teaming, it is hard to generate prompts that reliably lead to successful attacks. Existing methods such as Rainbow Teaming make little use of earlier attempts, which can limit both their success rate and the diversity of the attack strategies they find.
What's the solution?
Ruby Teaming addresses this issue by adding a memory cache as a third dimension of the prompt archive. The memory gives the mutator, the component that rewrites prompts, cues drawn from earlier attempts so that it can produce better prompts. The resulting prompt archive reached an attack success rate (ASR) of 74%, which is 20% higher than the Rainbow Teaming baseline. It also improved prompt diversity, outperforming Rainbow Teaming by 6% on Shannon's Evenness Index (SEI) and by 3% on Simpson's Diversity Index (SDI).
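To make the idea of a memory-augmented archive more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the class names (`MemoryEntry`, `Cell`, `Archive`), the two-part cell key, the scalar score, the choice to record every attempt, and the memory size are all hypothetical and are not taken from the paper.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional, Tuple


@dataclass
class MemoryEntry:
    """One remembered attempt (hypothetical structure)."""
    prompt: str
    score: float  # assumed scalar quality score; the paper's judge may work differently


@dataclass
class Cell:
    """One archive cell: the best prompt so far plus a bounded memory of attempts."""
    memory: Deque[MemoryEntry]
    best_prompt: Optional[str] = None
    best_score: float = float("-inf")


class Archive:
    """Quality-diversity archive whose cells carry a small memory (illustrative only)."""

    def __init__(self, memory_size: int = 3) -> None:
        self.memory_size = memory_size
        self.cells: Dict[Tuple[str, str], Cell] = {}

    def update(self, key: Tuple[str, str], prompt: str, score: float) -> bool:
        """Record an attempt; return True if it became the cell's best prompt."""
        cell = self.cells.get(key)
        if cell is None:
            # deque(maxlen=...) silently drops the oldest entry once full.
            cell = Cell(memory=deque(maxlen=self.memory_size))
            self.cells[key] = cell
        cell.memory.append(MemoryEntry(prompt, score))
        if score > cell.best_score:
            cell.best_prompt, cell.best_score = prompt, score
            return True
        return False

    def mutator_cues(self, key: Tuple[str, str]) -> str:
        """Format the cell's memory as text that could be shown to a mutator."""
        cell = self.cells.get(key)
        if cell is None:
            return ""
        return "\n".join(f"- score={e.score:.2f}: {e.prompt}" for e in cell.memory)
```

In this sketch, a search loop would call `update` after each evaluated candidate and pass the output of `mutator_cues` to the mutator alongside the prompt being rewritten. Bounding the memory keeps the cue text short regardless of how long the search runs.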
Why does it matter?
This research is important because it demonstrates how adding a memory component can significantly enhance the effectiveness of automated security testing methods. By improving both the success rate and diversity of attack strategies, Ruby Teaming can help organizations better prepare for potential cybersecurity threats.
Abstract
We propose Ruby Teaming, a method that improves on Rainbow Teaming by including a memory cache as its third dimension. The memory dimension provides cues to the mutator to yield better-quality prompts, both in terms of attack success rate (ASR) and quality diversity. The prompt archive generated by Ruby Teaming has an ASR of 74%, which is 20% higher than the baseline. In terms of quality diversity, Ruby Teaming outperforms Rainbow Teaming by 6% and 3% on Shannon's Evenness Index (SEI) and Simpson's Diversity Index (SDI), respectively.
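For readers unfamiliar with the two diversity measures named in the abstract, the following Python sketch computes them from category counts. It uses the common textbook definitions, SEI as Shannon entropy divided by the logarithm of the number of categories and SDI in its Gini-Simpson form (one minus the sum of squared proportions). Whether the paper uses exactly these variants, and what its categories are, is an assumption here; the function names are hypothetical.

```python
import math
from typing import Sequence


def shannon_evenness(counts: Sequence[int]) -> float:
    """Shannon's Evenness Index: H / ln(k), where k is the number of categories."""
    total = sum(counts)
    k = len(counts)
    if total == 0 or k < 2:
        return 0.0  # evenness is undefined here; 0.0 is a convention for this sketch
    probs = [c / total for c in counts]
    # Empty categories contribute nothing to the entropy.
    entropy = sum(-p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(k)


def simpson_diversity(counts: Sequence[int]) -> float:
    """Simpson's Diversity Index in Gini-Simpson form: 1 - sum(p_i ** 2)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)
```

Both measures increase as items spread more evenly across categories. With these definitions, SEI reaches 1 when every category holds the same number of items, and both measures are 0 when all items fall into a single category.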