Text to Video

Discover and compare the best AI models for text-to-video generation. Note: this is my personal, informal leaderboard. Models are ranked by their completion rate on a series of diverse prompts designed to thoroughly assess performance.

Rank  Company         Model          Score
1     ByteDance Seed  -              94
2     Alibaba         -              82
2     Kuaishou        -              82
4     OpenAI          -              79
5     Lightricks      -              75
6     Kuaishou        -              73
7     MiniMax         -              71
8     Kuaishou        -              69
8     PixVerse        -              69
10    MiniMax         -              67
11    Google          Veo 3.1        66
11    Tencent         -              66
11    Google          -              66
11    Alibaba         -              66
15    Meituan         -              65
15    ByteDance Seed  -              65
17    Runway          Runway Gen4.5  64
17    Bytedance       -              64
17    Kuaishou        Kling 2.1      64
17    Luma Labs       -              64
17    PixVerse        PixVerse V5    64
22    Kuaishou        -              59
23    PixVerse        -              56
24    Alibaba         -              54
25    OpenAI          -              52
25    KlingAI         -              52
27    Pika Art        -              51
28    Vidu            -              50
28    Tencent         -              50
30    Genmo           -              47
31    Luma Labs       -              44

Methodology

Models are ranked using a series of prompts covering a diverse range of challenging tasks. These include:

  • Prompt adherence and world understanding
  • Motion and physics
  • Character consistency
  • Scene transitions
  • Camera movement and angles
  • Lighting and shadows
  • Text and object generation
  • NSFW capabilities
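
As a rough illustration of how a completion-rate score and a tie-aware ranking could be computed, here is a minimal Python sketch. It is not the leaderboard's actual scoring code. It assumes each prompt gets a simple pass/fail judgment, that the score is the percentage of passes rounded to an integer, and that tied scores share a rank with the following rank skipped (as the repeated ranks in the table suggest). The function names and model names are hypothetical.

```python
from typing import Dict, List, Tuple


def completion_rate(results: List[bool]) -> int:
    """Return the percentage of prompts judged completed (assumed pass/fail)."""
    if not results:
        raise ValueError("need at least one prompt result")
    # Rounding to a whole number is an assumption; the real rule is not stated.
    return round(100 * sum(results) / len(results))


def rank_models(scores: Dict[str, int]) -> List[Tuple[int, str, int]]:
    """Rank models by score, highest first; tied scores share the same rank."""
    # sorted() is stable, so tied models keep their original relative order.
    ordered = sorted(scores.items(), key=lambda item: -item[1])
    ranked: List[Tuple[int, str, int]] = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == score:
            rank = ranked[-1][0]  # same score as previous row: reuse its rank
        else:
            rank = position       # otherwise the rank is the 1-based position
        ranked.append((rank, name, score))
    return ranked


# Hypothetical models; the scores mirror the top four rows of the table.
example = {"model_a": 94, "model_b": 82, "model_c": 82, "model_d": 79}
for rank, name, score in rank_models(example):
    print(rank, name, score)
```

With this scheme, two models tied for second place are followed by rank 4, not rank 3.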

To prevent manipulation, the prompts are kept confidential and are regularly updated to increase difficulty as models improve. Here is a subset of prompts for your reference:

A man riding a unicycle and juggling red balls
Will Smith eating spaghetti
A princess wearing a glittery white dress. She is running away from a massive red dragon with glowing red eyes. 3D disney pixar style
A gymnast performs a flip on a balance beam
A professor writes "hello" on the chalkboard
A swarm of zombies causing chaos in a shopping mall, shaky camera
A cat roars while looking at its reflection in the mirror but instead sees itself as a lion roaring
A neon sign with the text "Subscribe to my channel". Cyberpunk city at night.