about
I’m a researcher at Anthropic, working on making LLMs better at agentic coding. I’ve worked on Claude models since Opus 4, notably Opus 4.5, 4.6, 4.7, 4.8, Mythos Preview, and Fable 5.
Before this, I worked on reinforcement learning at Google DeepMind. I co-led AlphaProof, where we got an LLM to teach itself enough math to earn a silver medal at the International Mathematical Olympiad, coming close to cracking the IMO grand challenge. This work was published in Nature. I also worked on post-training Gemini.
Before DeepMind, I worked on improving Google’s search ranking algorithm with deep learning.
I live in San Francisco. You can find me on X (Twitter) or email me at rishicomplex@gmail.com.