I’m a researcher at Anthropic, working on making LLMs better at agentic coding. I’ve worked on Claude models since Opus 4, notably Opus 4.5, 4.6, 4.7, 4.8, and Mythos.

Before this, I worked on reinforcement learning at Google DeepMind. I co-led AlphaProof, where we got an LLM to teach itself enough math to get a silver medal at the International Mathematical Olympiad, almost cracking the IMO grand challenge. This work was published in Nature. I also worked on post-training Gemini.

Before DeepMind, I worked on improving Google’s search ranking algorithm with deep learning.

I live in San Francisco. You can find me on X (Twitter) or email me at rishicomplex@gmail.com.