
Extended Thinking in LLMs: A Mental Model for Developers
Extended thinking isn’t just “model thinks longer”; it’s a fundamentally different interaction model. If you’re prompting extended thinking models (Claude Opus, o1) the same way you prompt standard models, you’re leaving most of the value on the table. This post is a developer’s mental model for working with these systems: when to use them, how to prompt them, and what trade-offs to expect.

How Extended Thinking Actually Works

Standard LLMs generate tokens one at a time, each token conditioned on everything before it. The model “thinks” only as fast as it speaks. Ask it to solve a complex problem, and it often commits to an approach in the first few tokens, then rationalizes that approach even if it’s wrong. ...
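To make the token-by-token picture concrete, here is a minimal sketch of greedy autoregressive decoding. It is a toy, not how any real model is implemented: `TOY_MODEL`, its tokens, and its scores are all made up for illustration, and a lookup table stands in for the neural network. What it shows is the early-commitment effect: once the first token is chosen, every later step is conditioned on it and nothing goes back to revise it.

```python
# Toy illustration of standard (non-thinking) autoregressive decoding.
# TOY_MODEL is hypothetical: it maps the full prefix generated so far
# (a tuple of tokens) to candidate next tokens with made-up scores.
TOY_MODEL = {
    (): {"Use": 0.6, "Compare": 0.4},
    ("Use",): {"recursion": 0.9, "iteration": 0.1},
    ("Use", "recursion"): {"<eos>": 1.0},
    ("Compare",): {"both": 1.0},
    ("Compare", "both"): {"<eos>": 1.0},
}


def greedy_decode(model, max_tokens=10):
    """Generate tokens one at a time, each conditioned on the whole prefix."""
    prefix = ()
    while len(prefix) < max_tokens:
        candidates = model[prefix]
        # Pick the single highest-scoring next token for this prefix.
        token = max(candidates, key=candidates.get)
        if token == "<eos>":
            break
        # The choice is appended and never revisited: later steps can only
        # build on it, even if "Compare" would have led somewhere better.
        prefix = prefix + (token,)
    return list(prefix)


print(greedy_decode(TOY_MODEL))  # ['Use', 'recursion']
```

In this toy, the first step picks "Use" over "Compare" by a small margin, and from then on the "Compare" branch is unreachable. That is the commit-then-rationalize pattern in miniature.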
