Welcome to the Learning Mechanics DeCal!

Learning Mechanics is the emerging discipline that treats deep learning the way physics treats the natural world: seeking compact mathematical principles, tight connections between theory and experiment, and simple, intuitive explanations for complex phenomena. Pieces of a scientific theory for deep learning are beginning to fit together, and in this course, we will examine what has been assembled so far, what remains contested, and where the field is heading.

Deep learning is among the most powerful technologies humans have ever built, and understanding it promises to be one of the defining intellectual challenges of the early 21st century. As of 2026, the engineering success of deep learning has dramatically outpaced our scientific understanding of it. Closing that gap may amount to founding a genuinely new field of science—one whose implications for our understanding of intelligence, data, and learning extend well beyond the neural networks that motivated it.

Readings draw heavily from the whitepaper There Will Be a Scientific Theory of Deep Learning (Simon et al., 2026) and the primary literature it synthesizes. We will work through the theoretical tools, empirical regularities, and open questions that are laying the groundwork for a physics-like understanding of deep learning.

Course Calendar

Schedule is subject to change.

Wk 1
First Week — No Class
Wk 2

Lecture 1 Introduction I: Learning Mechanics

What’s the evidence for an emerging scientific theory of deep learning?

Reading: Simon et al. (2026)
Wk 3

Lecture 2 Introduction II: Neural Networks

What exactly are neural networks? Why are they hard to study? How will we study them anyway?

Reading: Nielsen (2019), Lecture Notes
Homework: optional math review
Wk 4

Lecture 3 Analytically Solvable Settings I: Deep Linear Networks

What can we learn about deep learning from deep linear networks, a highly mathematically tractable toy model?
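A rough preview of the model: a depth-L deep linear network is just a product of weight matrices,

$$ f(x) = W_L W_{L-1} \cdots W_1 x, $$

so its input-output map is linear, yet the loss is non-convex in the weights and the gradient descent dynamics are nonlinear. That combination is what makes it a solvable stand-in for deep learning.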

Wk 5

Lecture 4 Analytically Solvable Settings II + Insightful Limits I: The Neural Tangent Kernel and Kernel Regression

How do neural networks simplify in the infinite-width limit?
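A one-equation preview (for squared loss, under the NTK parameterization, trained to convergence): in the infinite-width limit, gradient descent training reduces to kernel regression with a fixed kernel K, the neural tangent kernel, so the trained network predicts

$$ f(x) = K(x, X)\, K(X, X)^{-1} y, $$

where X and y are the training inputs and targets.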

Wk 6

Lecture 5 Analytically Solvable Settings III: Eigenlearning and the HEA

How can we develop a mathematical framework to study kernel regression? Can we predict how kernel regression will perform on real data?

Wk 7

Lecture 6 Disentangling Hyperparameters I + Insightful Limits II: The Lazy (NTK) and Rich (μP) Regimes

In the lazy (NTK) regime, neural networks don’t learn any structure. Is there a regime where they do?
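One way to make “don’t learn any structure” precise: in the lazy regime the network stays close to its linearization around initialization,

$$ f(x; \theta) \approx f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta - \theta_0), $$

so training only fits coefficients on features that are frozen at their initial values. The rich (μP) regime is the one in which those features themselves move.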

Wk 8

Lecture 7 Analytically Solvable Settings IV: Balancedness and Feature Learning

Are there toy models where we can exactly characterize a lazy/rich phase transition?
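One fact to keep in mind as a preview: under gradient flow on a deep linear network, the layer-wise quantities

$$ W_{l+1}^\top W_{l+1} - W_l W_l^\top $$

are conserved throughout training, and their values at initialization are one knob that separates lazy from rich dynamics.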

Wk 9

Lecture 8 Universality I: The Platonic Representation Hypothesis

Do deep learning models learn similar representations of data across diverse architectures?

Wk 10
Thanksgiving Break — No Class
Wk 11

Lecture 10 Universality II: Fourier Features in Learned Representations

What kind of features are learned by language models? How might we characterize where such features come from and how they’re learned?

Wk 12

Lecture 11 Empirical Laws I: The Edge of Stability

Why do neural networks routinely train successfully while hovering on the very brink of numerical divergence?
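For context: gradient descent with step size η diverges on a quadratic once the curvature exceeds 2/η. Empirically, the sharpness (the top Hessian eigenvalue) of real networks rises during training until it hovers right at that threshold,

$$ \lambda_{\max}\!\left(\nabla^2 L\right) \approx \frac{2}{\eta}, $$

which is the edge of stability phenomenon.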

Wk 13
Buffer Week
Wk 14

Lecture 13 Final Project Hypothesis Presentations

Wk 15

Lecture 14 Final Project Office Hours

Wk 16
RRR Week — No Class

Q&A

What is a DeCal?

A DeCal (Democratic Education at Cal) is a student-facilitated course at UC Berkeley, run by students and sponsored by a faculty member, typically taken for one or two pass/no-pass units.