LLMs and User Experience

LLMs Can't Save Bad UX

I wrote about AI features users actually want a while back. The TL;DR was: stop building chatbots, start building smart defaults. That post got shared, people agreed, and then most of them went back to building chatbots. This is the sequel. The one about what happens when AI features ship and nobody uses them. When the LLM is working correctly and the product is still failing. When the problem was never the model. ...

March 18, 2026 · 9 min · Muhammad Hassan Raza

Scoring Fraud in Legal Intake Calls

I worked on RISQ, a fraud detection system for legal intake calls. Mass tort law firms receive thousands of calls from potential claimants, and a significant percentage are fraudulent: coached callers reading from scripts, people who never actually used the drug or product in question, or repeat callers under different names. RISQ listens to these calls, scores them for authenticity, and recommends whether to transfer the caller to a closer, flag them for review, or quarantine the call. ...

February 20, 2026 · 5 min · Muhammad Hassan Raza

Extended Thinking in LLMs: A Mental Model for Developers

Extended thinking isn’t just “model thinks longer”—it’s a fundamentally different interaction model. If you’re prompting extended thinking models (Claude Opus, o1) the same way you prompt standard models, you’re leaving most of the value on the table. This post is a developer’s mental model for working with these systems: when to use them, how to prompt them, and what trade-offs to expect. How extended thinking actually works: standard LLMs generate tokens one at a time, each token conditioned on everything before it. The model “thinks” only as fast as it speaks. Ask it to solve a complex problem, and it often commits to an approach in the first few tokens, then rationalizes that approach even if it’s wrong. ...

September 25, 2025 · 7 min · Muhammad Hassan Raza

LangChain in Production: What the Tutorials Don't Tell You

Every LangChain tutorial ends right where the real work begins. You see a neat 50-line script that queries a PDF, and you think, “Cool, I’ll ship this by Friday.” Three weeks later, you’re debugging memory leaks, wondering why your chain silently returns empty strings, and questioning every decision that led you here. I’ve shipped LangChain-based features to production at multiple companies. Here’s what I wish someone had told me before I started. ...

June 20, 2025 · 5 min · Muhammad Hassan Raza

Claude Opus 4.5: When an AI Finally Gets It

I’ve been skeptical of every “game-changing AI release” for the past two years. Every few months, a new model drops and Twitter explodes with claims that AGI is here. Spoiler: it never is. But when Anthropic released Opus 4.5, something actually shifted in how I work. Not because it’s AGI—it’s decidedly not—but because it’s the first model that consistently delivers on complex, multi-step reasoning without falling apart halfway through. This isn’t a hype piece. This is a practitioner’s field notes from someone who uses these tools daily to ship product at Entropy Labs. ...

May 15, 2025 · 5 min · Muhammad Hassan Raza