Text to Video

Discover and compare the best AI models for text-to-video generation. Note: This is my personal, non-scientific leaderboard. Models are ranked by their completion rate on a series of diverse prompts designed to thoroughly assess performance.

Rank  Company         Model        Score
1     MiniMax                      74.21
2     Google                       73
3     Alibaba                      72.9
4     ByteDance Seed               72.14
5     Bytedance                    71.11
6     Kuaishou        Kling 2.1    71.02
7     Luma Labs                    70.8
8     PixVerse        PixVerse V5  70.76
9     Kuaishou                     65.81
10    PixVerse                     62.89
11    Alibaba                      61.11
12    OpenAI                       59.08
13    KlingAI                      58.99
14    Pika Art                     57.79
15    Vidu                         57.37
16    Tencent                      56.82
17    Genmo                        54.38
18    Luma Labs                    51.2

Methodology

Models are ranked using a series of prompts covering a diverse range of challenging tasks, including:

  • Prompt adherence and world understanding
  • Motion and physics
  • Character consistency
  • Scene transitions
  • Camera movement and angles
  • Lighting and shadows
  • Text and object generation
  • NSFW capabilities
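
As a rough illustration of how a completion-rate score could be turned into a ranking, here is a minimal Python sketch. It assumes each prompt is judged as a simple pass or fail and that the score is the percentage of prompts passed; the leaderboard does not spell out its exact scoring rule, so treat this as a guess. The function names (completion_rate, rank_models), the model names, and the outcomes are all hypothetical and invented for the example.

```python
def completion_rate(results):
    """Return the percentage of prompts marked as passed (True)."""
    if not results:
        raise ValueError("no results to score")
    return 100.0 * sum(results) / len(results)


def rank_models(outcomes):
    """Map {model: [bool, ...]} to a list of (rank, model, score), best first."""
    scores = {model: round(completion_rate(r), 2) for model, r in outcomes.items()}
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(i + 1, model, score) for i, (model, score) in enumerate(ordered)]


# Hypothetical pass/fail outcomes, invented for illustration only.
outcomes = {
    "model_a": [True, True, False, True],
    "model_b": [True, False, False, True],
    "model_c": [True, True, True, True],
}

leaderboard = rank_models(outcomes)
for rank, model, score in leaderboard:
    print(f"{rank:>2}  {model:<10} {score:.2f}")
```

A real evaluation might give partial credit or weight the categories differently; the pass/fail simplification here only shows the arithmetic behind a percentage-style score.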

To prevent manipulation, the prompts are kept confidential and are regularly updated to increase difficulty as models improve. Here is a subset of prompts for your reference:

A man riding a unicycle and juggling red balls
Will Smith eating spaghetti
A princess wearing a glittery white dress. She is running away from a massive red dragon with glowing red eyes. 3D disney pixar style
A gymnast performs a flip on a balance beam
A professor writes "hello" on the chalkboard
A swarm of zombies causing chaos in a shopping mall, shaky camera
A cat roars while looking at its reflection in the mirror but instead sees itself as a lion roaring
A neon sign with the text "Subscribe to my channel". Cyberpunk city at night.