Selected Writing
A curated map of my technical writing. Some pieces are paper notes, some are working notes, and some are attempts to turn a mathematical idea into intuition. For the full archive, see the blog.
Start Here
- 3D worlds from images: start with 3D Gaussian Splatting, then read pixelNeRF and DreamFusion to see how neural fields, rendering, and generative priors connect.
- Transformer internals: start with A Primer on the Inner Workings of Transformer-Based Language Models, then move to Retrieval Heads and The Geometry of Categorical and Hierarchical Concepts.
- Writing to understand: start with On Writing to Think and The Mathematics and Philosophy Behind MSE.
Collections
3D Vision & Neural Rendering
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering
- DreamFusion: Text-to-3D using 2D Diffusion
- DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
- pixelNeRF: Neural Radiance Fields from One or Few Images
- SIREN: Implicit Neural Representations with Periodic Activation Functions
- Neural Fields as Learnable Kernels for 3D Reconstruction
- One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
- OpenScene: 3D Scene Understanding with Open Vocabularies
- PointNet and PointNet++
- LION: Latent Point Diffusion Models for 3D Shape Generation
Language Models, Transformers & Alignment
- A Primer on the Inner Workings of Transformer-Based Language Models
- Retrieval Head Mechanistically Explains Long-Context Factuality
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
- Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
- The Geometry of Categorical and Hierarchical Concepts in Large Language Models
- Mixture-of-Depths: Dynamically Allocating Compute in Transformer-Based Language Models
- Universal Language Learning Paradigms — UL2
Efficient ML & Model Compression
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
- Mixture-of-Depths: Dynamically Allocating Compute in Transformer-Based Language Models
- Effects of Scale on Model Finetuning
Causal, Statistical & Foundation Methods
- The Mathematics and Philosophy Behind MSE
- Randomization Inference When N Equals One
- Meta Statistical Learning
- Targeted Cause Discovery with Data-Driven Learning
- CAASL: Amortized Active Causal Induction with Deep Reinforcement Learning
- TabPFN: Understanding and Advancing Tabular Foundation Models