FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
Guojun Xiong, Zhiyang Deng, Keyi Wang, Yupeng Cao, Haohang Li, Yangyang Yu, Xueqing Peng, Mingquan Lin, Kaleb E Smith, Xiao-Yang Liu, Jimin Huang, Sophia Ananiadou, Qianqian Xie
2025-02-19
Summary
This paper introduces FLAG-Trader, a new AI system that combines large language models (LLMs) with reinforcement learning to make better decisions in stock trading. It's like teaching a super-smart computer to understand financial information and learn from its trading experiences to get better over time.
What's the problem?
While LLMs are really good at understanding and reasoning about financial information, they struggle with making sequences of interdependent decisions in real-world trading, where each choice affects the next. It's like having a financial expert who knows a lot but doesn't know how to apply that knowledge to make money in the stock market consistently.
What's the solution?
The researchers created FLAG-Trader, which takes an LLM and fine-tunes only a small part of it to understand trading better, leaving the rest frozen so it keeps its general knowledge. They then combined this with reinforcement learning, a way for the AI to learn from its successes and failures in trading: profitable decisions are reinforced, losing ones discouraged. This combination lets the AI use its broad knowledge of finance while also learning specific trading strategies, and because only a fraction of the model is retrained, it doesn't require too much computing power, making it more practical to use.
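The "partially retrained LLM as a trading policy" idea can be illustrated with a toy sketch. This is not the paper's implementation: the frozen LLM backbone is stood in for by a fixed feature extractor over recent prices, the small set of trainable parameters by a linear softmax head, and the learning rule is plain REINFORCE on a simulated trading reward. All names (`frozen_encoder`, `PolicyHead`, `simulate`) and the toy market are illustrative assumptions.

```python
import math
import random

random.seed(0)

ACTIONS = ["BUY", "HOLD", "SELL"]

def frozen_encoder(prices):
    # Stand-in for the frozen LLM backbone: turns recent prices into
    # fixed features (two most recent returns plus a bias term).
    r1 = prices[-1] - prices[-2]
    r2 = prices[-2] - prices[-3]
    return [r1, r2, 1.0]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class PolicyHead:
    # Stand-in for the small trainable part of the model: a linear
    # layer mapping features to action probabilities.
    def __init__(self, n_feat=3, n_act=3):
        self.w = [[0.0] * n_feat for _ in range(n_act)]

    def probs(self, feat):
        logits = [sum(wi * fi for wi, fi in zip(row, feat)) for row in self.w]
        return softmax(logits)

    def update(self, feat, action, reward, lr=0.1):
        # REINFORCE step: move weights along grad log pi(a|s) * reward,
        # so actions that earned money become more likely.
        p = self.probs(feat)
        for a in range(len(self.w)):
            coeff = ((1.0 if a == action else 0.0) - p[a]) * reward * lr
            for i in range(len(feat)):
                self.w[a][i] += coeff * feat[i]

def simulate(head, steps=2000):
    # Toy market with a slight upward drift; the trading reward is the
    # profit or loss of holding the chosen position for one step.
    prices = [100.0, 100.5, 101.0]
    total = 0.0
    for _ in range(steps):
        feat = frozen_encoder(prices)
        action = random.choices(range(3), weights=head.probs(feat))[0]
        next_price = prices[-1] + random.gauss(0.05, 0.2)
        position = {0: 1.0, 1: 0.0, 2: -1.0}[action]  # BUY long, SELL short
        reward = position * (next_price - prices[-1])
        head.update(feat, action, reward)
        total += reward
        prices.append(next_price)
    return total

# Usage: after training on the drifting market, the head should have
# shifted probability toward BUY.
head = PolicyHead()
simulate(head)
```

The design point the sketch mirrors is that only `PolicyHead.w` ever changes, just as FLAG-Trader updates only a small, parameter-efficient slice of the LLM while the pre-trained backbone stays fixed.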
Why it matters?
This matters because it could lead to more effective AI-driven trading systems that make smarter decisions in the stock market. Such systems could outperform human traders and traditional trading algorithms, which could change how financial markets operate. It also shows a way to make LLMs better at specific tasks without losing their general knowledge, which could be useful in many fields beyond finance.
Abstract
Large language models (LLMs) fine-tuned on multimodal financial data have demonstrated impressive reasoning capabilities in various financial tasks. However, they often struggle with multi-step, goal-oriented scenarios in interactive financial markets, such as trading, where complex agentic approaches are required to improve decision-making. To address this, we propose FLAG-Trader, a unified architecture integrating linguistic processing (via LLMs) with gradient-driven reinforcement learning (RL) policy optimization, in which a partially fine-tuned LLM acts as the policy network, leveraging pre-trained knowledge while adapting to the financial domain through parameter-efficient fine-tuning. Through policy gradient optimization driven by trading rewards, our framework not only enhances LLM performance in trading but also improves results on other financial-domain tasks. We present extensive empirical evidence to validate these enhancements.
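To make "policy gradient optimization driven by trading rewards" concrete, the update the abstract describes can be written schematically in the standard policy-gradient form (the paper's exact estimator may differ, e.g. an actor-critic or clipped-surrogate variant; the notation below is generic):

```latex
\nabla_{\theta} J(\theta)
  = \mathbb{E}_{\tau \sim \pi_{\theta}}
    \left[ \sum_{t} \nabla_{\theta} \log \pi_{\theta}(a_t \mid s_t)\, R_t \right]
```

Here $\theta$ denotes only the unfrozen, parameter-efficiently fine-tuned weights of the LLM policy $\pi_{\theta}$, $s_t$ is the textual market state, $a_t$ a trading action such as buy, hold, or sell, and $R_t$ the trading reward, so gradients from realized profit and loss flow back into a small slice of the language model.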