Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

Haorui He, Yupeng Li, Bin Benjamin Zhu, Dacheng Wen, Reynold Cheng, Francis C. M. Lau

2025-08-12

Summary

This paper introduces Fact2Fiction, a new attack on advanced fact-checking systems that use AI agents to break complex claims into smaller parts, check each part, and then combine the results into a verdict with explanations. Fact2Fiction exploits these explanations, along with the way the system breaks claims down, to create fake evidence that tricks the system into reaching the wrong verdict.

What's the problem?

The problem is that fact-checking systems, designed to fight misinformation, can be vulnerable if attackers understand how they work. These systems break claims into smaller checks and give reasons for their verdicts, but attackers can exploit this process to inject misleading information, causing the system to mistakenly support false claims or reject true ones.
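To make the "break down, check, combine" process concrete, here is a minimal toy sketch in Python. It is not the system studied in the paper: the names `decompose`, `check`, and `fact_check` are hypothetical, and simple string matching against a small evidence set stands in for the AI agents and evidence retrieval that a real system would use.

```python
from dataclasses import dataclass


@dataclass
class SubResult:
    sub_claim: str
    supported: bool
    justification: str


def decompose(claim: str) -> list[str]:
    """Toy decomposition (assumption): split a compound claim on ' and '."""
    parts = (part.strip().rstrip(".") for part in claim.split(" and "))
    return [part for part in parts if part]


def check(sub_claim: str, evidence: set[str]) -> SubResult:
    """Toy verification: a sub-claim is supported only if it appears in the evidence."""
    supported = sub_claim.lower() in evidence
    reason = "matching evidence found" if supported else "no matching evidence"
    return SubResult(sub_claim, supported, reason)


def fact_check(claim: str, evidence: set[str]) -> tuple[str, list[SubResult]]:
    """Aggregate: the claim is 'supported' only if every sub-claim is supported."""
    results = [check(sub, evidence) for sub in decompose(claim)]
    all_supported = bool(results) and all(r.supported for r in results)
    return ("supported" if all_supported else "refuted"), results


# Hypothetical evidence store and claim, for illustration only.
evidence = {"the bridge opened in 1990", "the bridge is 2 km long"}
verdict, results = fact_check(
    "The bridge opened in 1990 and the bridge is 5 km long.", evidence
)
for r in results:
    print(f"{r.sub_claim!r}: supported={r.supported} ({r.justification})")
print("verdict:", verdict)
```

Even in this toy version, the verdict depends entirely on what is in the evidence set, and the per-part justifications show which checks decided the outcome. That dependence on evidence, together with the visible justifications, is what the summary above describes attackers exploiting.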

What's the solution?

The researchers designed Fact2Fiction to mimic the fact-checking system’s breakdown process. By studying the system’s explanations, it creates malicious evidence tailored to each small part of a claim. It then plans how to distribute this fake evidence across those parts to mislead the system as much as possible. This targeted approach achieves higher attack success rates than existing attack methods.

Why does it matter?

This matters because fact-checking systems are important tools for stopping the spread of false information online. If attackers can trick these systems easily, misinformation can spread more widely and cause harm. Understanding these attacks helps researchers improve the security of fact-checking systems, making them more reliable and trustworthy for everyone.

Abstract

Fact2Fiction is a poisoning attack framework that targets agentic fact-checking systems by exploiting their decomposition strategy and justifications, achieving higher attack success rates than existing methods.