An Exploration of Default Images in Text-to-Image Generation

Hannu Simonen, Atte Kiviniemi, Hannah Johnston, Helena Barranha, Jonas Oppenlaender

ACM CHI Conference on Human Factors in Computing Systems 2026

University of Oulu • Carleton University • IST, University of Lisbon & IHA-NOVA FCSH / IN2PAST, Portugal

[Figure 1 panel labels: greawarz, kinkku, kasvi, bcd, greamylt, www.tumblr.com, Tzivani Rhekai, suola, Shelan Creswell, Soraya Lonescu]

Figure 1: Default images in Midjourney. Varied, seemingly unrelated prompts lead to visually similar outputs (the "default image"), motivating our exploration of this behavior in text-to-image generation models.

Abstract

In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown terms. In this case, the model may generate default images: images that closely resemble each other across many unrelated prompts. Studying default images is valuable for designing better solutions for prompt engineering and TTI generation.

We present the first investigation into default images on Midjourney. We describe an initial study in which we manually created input prompts triggering default images, and several ablation studies. Building on these, we conduct a computational analysis of over 750,000 images, revealing consistent default images across unrelated prompts. We also conduct an online user study investigating how default images may affect user satisfaction.

Dataset: https://huggingface.co/datasets/tti-dev/default-images

What are Default Images?

Default images are visually similar outputs resulting from dissimilar prompts. They occur when TTI models encounter ambiguous or unknown terms (e.g., words outside the training data) and "collapse" to a specific region in the latent space.

  • Consistent: Stable in terms of motif and style across different seeds.
  • Recurring: The same image can appear for "bata", "aso", or "apoy" (Tagalog words).
  • Model-Specific: Each model version (e.g., Midjourney v5 vs v6) has its own unique set of default images.
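
The definition above pairs two conditions: the images look alike, and the prompts do not. A minimal sketch of that pairwise check follows, assuming each image is already represented by an embedding vector. The function names, both thresholds, and the use of character-level string similarity as a stand-in for prompt dissimilarity are illustrative assumptions, not the measures used in the paper.

```python
import math
from difflib import SequenceMatcher


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def prompt_similarity(p, q):
    """Character-level similarity in [0, 1]; a crude stand-in for a semantic measure."""
    return SequenceMatcher(None, p.lower(), q.lower()).ratio()


def is_default_image_candidate(emb_a, emb_b, prompt_a, prompt_b,
                               image_threshold=0.9, prompt_threshold=0.5):
    """True when two images look alike although their prompts do not.

    Both thresholds are hypothetical values chosen for illustration.
    """
    images_alike = cosine_similarity(emb_a, emb_b) >= image_threshold
    prompts_differ = prompt_similarity(prompt_a, prompt_b) < prompt_threshold
    return images_alike and prompts_differ
```

A pair that passes this check is only a candidate: near-identical images from two different prompts could also come from prompts that mean the same thing, which a character-level comparison cannot detect.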

Methodology

Manual Investigation

We systematically created 130 prompts across six categories likely to trigger default images:

A1: Rare Names
A2: Corrupted Words
A3: Web Addresses
A4: Low-Resource Languages (Finnish, Tagalog)
A5: Glitch Tokens
A6: Abbreviations

Computational Analysis

We analyzed over 750,000 images collected from Midjourney Discord channels.

  • Filtered for non-square, upscaled, or varied images.
  • Used CLIP (ViT-L/14) to compute image embeddings.
  • Applied hierarchical clustering to find visual similarities.
  • Filtered clusters by lexical diversity and semantic similarity to isolate default images from merely repetitive prompts.
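
The clustering and cluster-filtering steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: embeddings are taken as precomputed vectors (the CLIP step is omitted), single linkage cut at a fixed cosine distance stands in for whichever hierarchical clustering variant was used, lexical diversity is approximated as the share of distinct prompts in a cluster, and the semantic-similarity filter is left out. All function names and threshold values are hypothetical.

```python
import math
from collections import defaultdict


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def single_linkage_clusters(embeddings, max_distance):
    """Group items connected by chains of pairwise distances <= max_distance.

    This is equivalent to cutting a single-linkage dendrogram at max_distance.
    """
    parent = list(range(len(embeddings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_distance(embeddings[i], embeddings[j]) <= max_distance:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(embeddings)):
        groups[find(i)].append(i)
    return list(groups.values())


def lexical_diversity(prompts):
    """Share of distinct prompts in a cluster (1.0 means all prompts differ)."""
    normalized = [p.strip().lower() for p in prompts]
    return len(set(normalized)) / len(normalized)


def default_image_clusters(embeddings, prompts, max_distance=0.1,
                           min_size=3, min_diversity=0.8):
    """Return clusters (sorted index lists) that are large and lexically diverse."""
    keep = []
    for members in single_linkage_clusters(embeddings, max_distance):
        cluster_prompts = [prompts[i] for i in members]
        if (len(members) >= min_size
                and lexical_diversity(cluster_prompts) >= min_diversity):
            keep.append(sorted(members))
    return keep
```

The diversity filter is what separates a default image from a popular prompt: a tight visual cluster whose members all share one prompt is simply repetition, while a tight cluster with many different prompts is the pattern of interest. The pairwise loop is quadratic, so a collection of the size reported here would need an optimized clustering library rather than this sketch.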

Key Findings

8 Postulates of Default Images

  • P1 Emergence due to unknown inputs outside training data.
  • P2 Default images are model-specific (Midjourney v5 ≠ v6).
  • P3 Paradox of Rarity & Commonality: Rare in general use, but common in "creative" failures.
  • P4 Vague or ambiguous prompts increase default image likelihood.
  • P5 More common in creative prompts that seek to deviate.
  • P6 Decreasing frequency as models advance and vocabulary grows.
  • P7 Style modifiers alter the appearance but retain the underlying motif.
  • P8 One prompt can produce multiple different default images (stochastic sampling).

Impact on User Satisfaction

Mean satisfaction scores (1-7 scale) from our user study (N=48), by condition:

Q1: Subtle Change (Raven → Greagoft) 4.9 / 7
Q3: Exact Match (Control) 4.6 / 7
Q2: Noticeable Change 2.6 / 7
Q4: Default Image (Total Mismatch) 2.4 / 7

Figure 8: Users are significantly dissatisfied when default images appear (Q4) or when the image deviates noticeably from the prompt (Q2).

Canonical Default Images

Through affinity diagramming, we identified specific recurring motifs. These images appear repeatedly for unrelated prompts.

Lady-Birdhead
Floating-Head
Psychedelic-Eye
Eagle-Circle
Growth-Face
Animal-Bush
Standing-Lady
Mirror-Lady
Headpiece-Lady
Fantasy-Castle

Figure 4: Set of default images. We assigned descriptive labels to these recurring outputs.

BibTeX

@inproceedings{defaultimages,
  title={An Exploration of Default Images in Text-to-Image Generation},
  author={Simonen, Hannu and Kiviniemi, Atte and Johnston, Hannah and Barranha, Helena and Oppenlaender, Jonas},
  year={2026},
  booktitle={ACM CHI Conference on Human Factors in Computing Systems},
  publisher={ACM},
  address={New York, NY, USA},
  doi={10.1145/3772318.3790681},
  eprint={2505.09166},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2505.09166}, 
}