PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
2024-09-04

Summary
This paper talks about PrivacyLens, a framework designed to evaluate how well language models understand and follow privacy norms when generating responses.
What's the problem?
As language models are increasingly used in personal communications, like emails and social media posts, it’s important to ensure they respect privacy rules. However, measuring how aware these models are of privacy norms is difficult because privacy situations can be complex and vary widely. Additionally, there aren’t many methods to realistically test these models in real-life scenarios.
What's the solution?
PrivacyLens addresses these challenges by creating a structured way to evaluate privacy awareness in language models. It does this by developing detailed examples (called vignettes) that represent different privacy situations and then tracking how the models behave in these contexts. The study found that even advanced models like GPT-4 and Llama-3-70B often leak sensitive information despite being given instructions to protect privacy. PrivacyLens also allows researchers to explore various scenarios to better understand the risks of privacy breaches.
Why it matters?
This research is important because it highlights the need for language models to be more aware of privacy issues, especially as they are used in sensitive situations. By improving our understanding of how these models handle privacy, we can develop better tools and guidelines to ensure that personal information remains secure.
Abstract
As language models (LMs) are widely utilized in personalized communication scenarios (e.g., sending emails, writing social media posts) and endowed with a certain level of agency, ensuring they act in accordance with the contextual privacy norms becomes increasingly critical. However, quantifying the privacy norm awareness of LMs and the emerging privacy risk in LM-mediated communication is challenging due to (1) the contextual and long-tailed nature of privacy-sensitive cases, and (2) the lack of evaluation approaches that capture realistic application scenarios. To address these challenges, we propose PrivacyLens, a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. Using this dataset, we reveal a discrepancy between LM performance in answering probing questions and their actual behavior when executing user instructions in an agent setup. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions. We also demonstrate the dynamic nature of PrivacyLens by extending each seed into multiple trajectories to red-team LM privacy leakage risk. Dataset and code are available at https://github.com/SALT-NLP/PrivacyLens.