Text to Video
Discover and compare the best AI models for text-to-video generation. Note: this is my personal, non-scientific leaderboard. Models are ranked by their completion rate on a series of diverse prompts designed to thoroughly assess performance.
| Rank | Company | Model | Score |
|---|---|---|---|
| 1 | MiniMax | | 74.21 |
| 2 | Google | | 73 |
| 3 | Alibaba | | 72.9 |
| 4 | ByteDance Seed | | 72.14 |
| 5 | Bytedance | | 71.11 |
| 6 | Kuaishou | Kling 2.1 | 71.02 |
| 7 | Luma Labs | | 70.8 |
| 8 | PixVerse | PixVerse V5 | 70.76 |
| 9 | Kuaishou | | 65.81 |
| 10 | PixVerse | | 62.89 |
| 11 | Alibaba | | 61.11 |
| 12 | OpenAI | | 59.08 |
| 13 | KlingAI | | 58.99 |
| 14 | Pika Art | | 57.79 |
| 15 | Vidu | | 57.37 |
| 16 | Tencent | | 56.82 |
| 17 | Genmo | | 54.38 |
| 18 | Luma Labs | | 51.2 |
Full tutorial & review videos
Watch the videos below for comprehensive comparisons and detailed installation guides for select video generation models.
Methodology
Models are ranked using a series of prompts covering a diverse range of challenging tasks. These include:
- Prompt adherence and world understanding
- Motion and physics
- Character consistency
- Scene transitions
- Camera movement and angles
- Lighting and shadows
- Text and object generation
- NSFW capabilities
To prevent manipulation, the prompts are kept confidential and are regularly updated to increase difficulty as models improve. Here is a subset of prompts for your reference:
- A man riding a unicycle and juggling red balls
- Will Smith eating spaghetti
- A princess wearing a glittery white dress. She is running away from a massive red dragon with glowing red eyes. 3D disney pixar style
- A gymnast performs a flip on a balance beam
- A professor writes "hello" on the chalkboard
- A swarm of zombies causing chaos in a shopping mall, shaky camera
- A cat roars while looking at its reflection in the mirror but instead sees itself as a lion roaring
- A neon sign with the text "Subscribe to my channel". Cyberpunk city at night.
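
For readers curious how a completion-rate ranking like this one might be computed, here is a minimal sketch. It is not the actual scoring code behind the leaderboard. The 0.0 to 1.0 points scale, the `PromptResult` type, the function names, and the model names `model_a` and `model_b` are all assumptions made for illustration; the page does not say how each prompt is graded.

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    model: str     # hypothetical model identifier
    prompt: str    # the prompt that was tested
    points: float  # assumed scale: 0.0 (failed) to 1.0 (fully completed)


def completion_scores(results):
    """Return each model's average completion as a percentage (0 to 100)."""
    totals = {}
    counts = {}
    for r in results:
        totals[r.model] = totals.get(r.model, 0.0) + r.points
        counts[r.model] = counts.get(r.model, 0) + 1
    return {m: round(100.0 * totals[m] / counts[m], 2) for m in totals}


def rank_models(results):
    """Sort models from highest to lowest completion score."""
    scores = completion_scores(results)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up grades for two hypothetical models on two of the sample prompts.
results = [
    PromptResult("model_a", "A gymnast performs a flip on a balance beam", 1.0),
    PromptResult("model_a", 'A professor writes "hello" on the chalkboard', 0.5),
    PromptResult("model_b", "A gymnast performs a flip on a balance beam", 0.5),
    PromptResult("model_b", 'A professor writes "hello" on the chalkboard', 0.0),
]

ranking = rank_models(results)
for rank, (model, score) in enumerate(ranking, start=1):
    print(rank, model, score)
```

Averaging per-prompt points gives every prompt equal weight. A real leaderboard might weight categories such as motion or text generation differently, which this sketch does not attempt.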








