Harel Lidar

AI Researcher working on representation, world models, and safety

I study how intelligent systems represent, reconstruct, and reason about the world. My writing follows questions around mechanistic interpretability, world models, natural latents, neuroscience, and AI safety.

GitHub LinkedIn Email

Questions I Want to Solve

How can we make LLM computation visible? From abstract activation spaces to single-token mechanisms to trajectories through a model's state.
What must a world model represent? Identifying dimensionality and representational requirements without constraining away the bitter lesson.
When does the bitter lesson fail? Finding natural latents worth embedding because they produce a net gain rather than brittle human bias.

Recent Posts

View all posts →

Selected Writing

3D Gaussian Splatting for Real-Time Radiance Field Rendering (3D Vision)
A Primer on the Inner Workings of Transformer-Based Language Models (Transformers)
Is DPO Superior to PPO for LLM Alignment? (Alignment)
The Mathematics and Philosophy Behind MSE (Foundations)

View all selected writing →