Text to Image

Discover and compare the best AI models for text-to-image generation. Note: This is my personal, non-scientific leaderboard. Models are ranked by their completion rate on a series of diverse prompts designed to thoroughly assess performance.

Rank  Company            Model                   Score
1     ByteDance Seed                             86.35
2     Google                                     85.63
3     OpenAI             GPT-4o                  84.45
4     Alibaba                                    84.04
5     Alibaba                                    83.35
6     Google                                     83.1
7     ByteDance Seed                             82.35
8     Reve               Reve Image (Halfmoon)   76.52
9     Recraft                                    75.46
10    Ideogram                                   71.07
11    HiDream                                    70.23
12    Black Forest Labs                          68.88
13    Black Forest Labs                          62.85
14    Midjourney         Midjourney v7 Alpha     50.67
15    Stability                                  49.08

Methodology

Models are ranked using a series of prompts covering a diverse range of challenging tasks, including:

  • Prompt adherence and understanding
  • Human anatomy
  • Generating text
  • Diagrams and infographics
  • World understanding
  • Uncommon poses and expressions
  • Spatial understanding
  • NSFW capabilities
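
As a rough illustration of ranking by completion rate, the sketch below scores each model as the percentage of prompts it completed and sorts the models from highest to lowest. It is only a sketch under stated assumptions: the `PromptResult` record, the binary pass/fail judgment, the two-decimal rounding, and the placeholder model names are all hypothetical, and the leaderboard's actual scoring may differ (for example, it may award partial credit).

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    """One judged generation. All fields are hypothetical, for illustration."""
    model: str
    category: str  # e.g. "Human anatomy" or "Generating text"
    passed: bool   # assumed binary judgment: did the image satisfy the prompt?


def completion_rates(results):
    """Return {model: percentage of prompts passed}, rounded to 2 decimals."""
    totals = {}
    passes = {}
    for r in results:
        totals[r.model] = totals.get(r.model, 0) + 1
        passes[r.model] = passes.get(r.model, 0) + (1 if r.passed else 0)
    return {m: round(100 * passes[m] / totals[m], 2) for m in totals}


def leaderboard(results):
    """Return (model, score) pairs, best score first; ties broken by name."""
    rates = completion_rates(results)
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Placeholder data, not real leaderboard results.
    demo = [
        PromptResult("model-a", "Human anatomy", True),
        PromptResult("model-a", "Generating text", False),
        PromptResult("model-b", "Human anatomy", True),
        PromptResult("model-b", "Generating text", True),
    ]
    for rank, (model, score) in enumerate(leaderboard(demo), start=1):
        print(rank, model, score)
```

Keeping the `category` field on each record would also make it possible to compute a per-category breakdown with the same counting logic, though this page reports only a single overall score per model.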

To prevent manipulation, the prompts are kept confidential and are regularly updated to increase difficulty as models improve. Here is a subset of prompts for your reference:

A page of a school yearbook with a grid of student photos
A pair of spectral tarsiers on a tree. realistic photo
A ballerina in a tutu practices spins in a sunlit studio with mirrored walls and barre equipment, scattered with pointe shoes and sheet music. A rabbit watches from atop a grand piano. Outside the large window, an elephant balances on a circus ball.
A screenshot of a YouTube video search for "funniest cats"
A woman sitting and showing her palms and soles of feet
A multi-panel comic of a man explaining a simple home workout routine. In each panel, he should describe a different exercise or fitness tip
A red Ferrari Portofino M, a white Audi R8, and a blue 1994 Honda Civic in the desert