An analytic theory of creativity in convolutional diffusion models

Kamb, Mason; Ganguli, Surya

Computer Science > Machine Learning

arXiv:2412.20292 (cs)

[Submitted on 28 Dec 2024 (v1), last revised 5 Jun 2025 (this version, v2)]

Abstract: We obtain an analytic, interpretable and predictive theory of creativity in convolutional diffusion models. Indeed, score-matching diffusion models can generate highly original images that lie far from their training data. However, optimal score-matching theory suggests that these models should only be able to produce memorized training examples. To reconcile this theory-experiment gap, we identify two simple inductive biases, locality and equivariance, that: (1) induce a form of combinatorial creativity by preventing optimal score-matching; (2) result in fully analytic, completely mechanistically interpretable, local score (LS) and equivariant local score (ELS) machines that, (3) after calibrating a single time-dependent hyperparameter, can quantitatively predict the outputs of trained convolution-only diffusion models (like ResNets and UNets) with high accuracy (median $r^2$ of $0.95, 0.94, 0.94, 0.96$ for our top model on CIFAR10, FashionMNIST, MNIST, and CelebA). Our model reveals a locally consistent patch mosaic mechanism of creativity, in which diffusion models create exponentially many novel images by mixing and matching different local training set patches at different scales and image locations. Our theory also partially predicts the outputs of pre-trained self-attention-enabled UNets (median $r^2 \sim 0.77$ on CIFAR10), revealing an intriguing role for attention in carving out semantic coherence from local patch mosaics.
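
The memorization claim in the abstract can be illustrated with a minimal sketch. This is not the authors' code, and it does not implement the LS or ELS machines. It assumes a variance-preserving forward process $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, under which the noised training set is a Gaussian mixture whose exact score is a softmax-weighted pull toward the scaled training points. The names `ideal_score` and `alpha_bar` are illustrative.

```python
import numpy as np

def ideal_score(x, train, alpha_bar):
    """Exact score of the noised empirical training distribution at x.

    Assumption (not taken from the abstract): variance-preserving noising,
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise,
    so the density is a mixture with one Gaussian per training example.
    Returns the score vector and the posterior weights over examples.
    """
    train = np.asarray(train, dtype=float)
    x = np.asarray(x, dtype=float)
    means = np.sqrt(alpha_bar) * train           # (N, D) component means
    var = 1.0 - alpha_bar                        # shared isotropic variance
    sq_dist = np.sum((x - means) ** 2, axis=1)   # (N,) squared distances
    logits = -sq_dist / (2.0 * var)
    logits -= logits.max()                       # stabilize the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    # Posterior-weighted pull toward each scaled training point.
    score = (weights[:, None] * (means - x)).sum(axis=0) / var
    return score, weights

# Toy training set of three 2-D "images".
train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.05, 0.02])
for alpha_bar in (0.5, 0.9, 0.999):
    _, w = ideal_score(x, train, alpha_bar)
    print(alpha_bar, np.round(w, 3))
```

As `alpha_bar` approaches 1 (low noise), the weights concentrate on the nearest training example, so a sampler that follows this exact score ends at training points. This is the behavior that, according to the abstract, locality and equivariance prevent.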

Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
ACM classes: I.2.10
Cite as: arXiv:2412.20292 [cs.LG] (or arXiv:2412.20292v2 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2412.20292

Submission history

From: Mason Kamb [view email]
[v1] Sat, 28 Dec 2024 22:33:29 UTC (10,186 KB)
[v2] Thu, 5 Jun 2025 05:09:27 UTC (12,910 KB)
