A collection of notes from college courses, self-study and research. Domains span mathematics, physics, computer science and occasionally philosophy. I hope to rigorously work through ideas, establish connections across disciplines and build a deep understanding of how the world works.

Figure 1: Stars (Yosemite)

🛠️ A Learning Mechanic’s Toolkit

A learning mechanic studies learning mechanics—a dynamical and mechanistic perspective on traditional deep learning theory. This toolkit collects instruments for characterizing important properties and statistics of the training process, hidden representations, and final weights of neural networks.

🔧 Deep Dives

Step-by-step derivations, refined expositions

🔨 Notes

Summaries of important phenomena and models, plus some useful math

  • The lazy (NTK) and rich (muP) regimes

    • infinite limits · lazy/rich
    • Enforcing stable-training criteria on a simple 3-layer linear network determines all initialization hyperparameters up to a single degree of freedom, the richness parameter.
  • When (wide) neural networks become linear

    • infinite limits · neural tangent kernel
    • As the layer widths of a neural network become large, the network becomes approximately equal to its first-order (linear) Taylor approximation in the parameters around initialization.
  • Quadratic word embedding model (QWEM)

    • exact solutions · feature learning · word embeddings
    • The second-order approximation of the Word2Vec loss yields an equivalent supervised matrix factorization loss. This lets us study a minimal language model through matrix factorization, which is far more mathematically tractable.
  • Maximal stable learning rate derivation

    • optimization phenomena · edge of stability
    • Given a simple and well-behaved loss (constant Hessian), we analytically derive the maximal stable learning rate under gradient descent.
  • Singular values under perturbation
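
As a quick illustration of the maximal stable learning rate entry above, here is a minimal sketch, assuming the standard quadratic setting L(w) = ½ wᵀHw with a constant positive-definite Hessian H. In that setting gradient descent converges only when the learning rate is below 2/λ_max(H). The function name, the 2×2 Hessian, and the starting point are hypothetical choices for this example and are not taken from the note itself, whose derivation may be framed differently.

```python
import numpy as np

def gradient_descent_quadratic(H, w0, lr, steps):
    """Run gradient descent on L(w) = 0.5 * w^T H w; return the loss at each step."""
    w = np.asarray(w0, dtype=float)
    losses = []
    for _ in range(steps):
        losses.append(0.5 * w @ H @ w)
        w = w - lr * (H @ w)  # the gradient of the quadratic loss is H w
    return np.array(losses)

# Hypothetical positive-definite Hessian with eigenvalues 1 and 4.
H = np.diag([1.0, 4.0])
w0 = np.array([1.0, 1.0])

# Stability threshold for a quadratic loss: 2 / lambda_max.
lr_max = 2.0 / np.linalg.eigvalsh(H).max()

stable = gradient_descent_quadratic(H, w0, 0.9 * lr_max, 200)    # just below threshold
unstable = gradient_descent_quadratic(H, w0, 1.1 * lr_max, 200)  # just above threshold

print("threshold:", lr_max)
print("final loss below threshold:", stable[-1])
print("final loss above threshold:", unstable[-1])
```

Each eigendirection of H is scaled by (1 − lr·λ) per step, so the sharpest direction (largest λ) is the first to diverge once lr exceeds 2/λ_max; below the threshold every direction contracts.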

🧮 Math Proofs

Proving cool math theorems

🌱 Exploratory Notes

Raw notes, incomplete thoughts and ongoing learning

Mathematics

Computer Science


Figure 2: Sunset (Mt. Tam)

Margins

A small subset of my many thoughts

Readings that influence how I think

Random thoughts of a more philosophical flavor

Miscellaneous