Harel Lidar

AI Researcher working on representation, world models, and safety

I study how intelligent systems represent, reconstruct, and reason about the world. My writing explores questions in mechanistic interpretability, world models, natural latents, neuroscience, and AI safety.

Questions I Want to Solve

  • How can we make LLM computation visible? From abstract activation spaces to single-token mechanisms to trajectories through a model's state.
  • What must a world model represent? Identifying dimensionality and representational requirements without constraining away the bitter lesson.
  • When does the bitter lesson fail? Finding natural latents worth embedding because they produce a net gain rather than brittle human bias.
