Blog

Technical deep-dives into AI Engineering, LLMs in production, and lessons learned building real systems.

AI Engineering • LLM • Evals • Prompt Engineering

Eval-Driven LLM Systems: How I Improved Accuracy by 34% Without Changing the Model

53% to 87% accuracy, without touching the model. This is the story of building an eval-driven LLM pipeline with prompt versioning, a golden dataset, and Braintrust, and of why the cheaper model tied the expensive one.

March 2026 • 10 min read

AI Engineering • LLM • Production • LegalTech

LLMs Aren't Magic: Lessons From Taking a Legal AI System From Chaos to Production

Five months in production. Over 3,000 court notifications processed. 99.22% success rate. This is the story of taking a Legal AI system from chaos to production.

November 2025 • 15 min read

Product Management • Scoping • Sprint Planning • Prototype

From Ambiguity to Sprint: How I Turn a Scattered Founder Conversation Into Something Buildable

A founder with an Excel spreadsheet and a fuzzy vision. One full-stack engineer. Two weeks. This is the story of how I turned a scattered intake conversation into an executable Sprint 1 — and why what I didn't build matters as much as what I did.

March 2026 • 8 min read