Hidden Technical Debt in AI

Jul 17, 2025 / AI

That little black box in the middle is machine learning code.

[Image: Screenshot 2025-07-17 at 8.59.31 AM.png]

I remember reading Google’s 2015 paper, Hidden Technical Debt in Machine Learning Systems, & thinking how little of a machine learning application was actual machine learning.

The vast majority was infrastructure, data management, & operational complexity.

With the dawn of AI, it seemed large language models would subsume these boxes. The promise was simplicity : drop in an LLM & watch it handle everything from customer service to code generation. No more complex pipelines or brittle integrations.

But in building internal applications, we’ve observed a similar dynamic with AI.

[Image: Screenshot 2025-07-17 at 8.56.49 AM.png]

Agents need lots of context, like a human : how is the CRM structured, what do we enter into each field - but input is expensive for the Hungry, Hungry AI model.

Reducing cost means writing deterministic software to replace the reasoning of AI.

For example, automating email management means writing tools to create Asana tasks & update the CRM.
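A minimal sketch of that idea, assuming fixed rules run before any model call. The names here (`Email`, `Actions`, `handle_email`, `REQUEST_PATTERN`) are hypothetical, & the two tool functions only record what they would do rather than call a real Asana or CRM API.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    subject: str
    body: str


@dataclass
class Actions:
    # Recorded side effects. A real system would call the task tracker & CRM here.
    tasks: list = field(default_factory=list)
    crm_updates: list = field(default_factory=list)


# Hypothetical rule: phrases that suggest the sender is asking for something.
REQUEST_PATTERN = re.compile(r"\b(can you|could you|please send)\b", re.I)


def create_task(actions, email):
    actions.tasks.append(f"Reply to {email.sender}: {email.subject}")


def update_crm(actions, email):
    actions.crm_updates.append({"contact": email.sender, "last_subject": email.subject})


def handle_email(email, known_contacts, actions):
    """Apply fixed rules first. Return True if the email still needs the model."""
    handled = False
    if email.sender in known_contacts:
        update_crm(actions, email)
        handled = True
    if REQUEST_PATTERN.search(email.body):
        create_task(actions, email)
        handled = True
    return not handled
```

Only the emails that fall through every rule reach the model, so the expensive reasoning is reserved for the cases the rules cannot handle.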

Once the number of tools grows beyond ten or fifteen, tool calling becomes unreliable. Time to spin up a classical machine learning model to select tools.
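One way such a selector might look : a small multinomial naive Bayes classifier over bag-of-words, written with the standard library only. This is a sketch under assumptions, not the model I run. The tool names & training phrases are hypothetical, & a production version would likely use a library such as scikit-learn with far more labeled examples.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class ToolSelector:
    """Multinomial naive Bayes over bag-of-words, with add-one smoothing."""

    def fit(self, examples):
        # examples: list of (request text, tool name) pairs
        self.word_counts = defaultdict(Counter)
        self.tool_counts = Counter()
        self.vocab = set()
        for text, tool in examples:
            tokens = tokenize(text)
            self.word_counts[tool].update(tokens)
            self.tool_counts[tool] += 1
            self.vocab.update(tokens)
        return self

    def scores(self, text):
        total = sum(self.tool_counts.values())
        result = {}
        for tool, n in self.tool_counts.items():
            counts = self.word_counts[tool]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior for this tool
            for token in tokenize(text):
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            result[tool] = score
        return result

    def top_k(self, text, k=3):
        ranked = sorted(self.scores(text).items(), key=lambda kv: kv[1], reverse=True)
        return [tool for tool, _ in ranked[:k]]


# Hypothetical labeled requests, two per tool.
TRAINING = [
    ("create a task to follow up with the founder", "create_task"),
    ("add a todo for the board deck", "create_task"),
    ("update the crm record for this company", "update_crm"),
    ("log this meeting in the crm", "update_crm"),
    ("find emails from this sender", "search_email"),
    ("search my inbox for the term sheet", "search_email"),
]

selector = ToolSelector().fit(TRAINING)
```

The agent would then receive only the schemas for `selector.top_k(request)` instead of every tool, which keeps the prompt short & the choice narrow.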

Then there’s watching the system with observability, evaluating whether it’s performant, & routing to the right model. In addition, there’s a whole category of software around making sure the AI does what it’s supposed to.

Guardrails prevent inappropriate responses. Rate limiting stops costs from spiraling out of control when a system goes haywire.
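For the rate limiting half, a token bucket denominated in dollars is one plausible shape. The sketch below is an assumption about how this could be built, not a description of any particular product. `SpendLimiter` & its parameters are hypothetical names, & the cost passed to `try_spend` would have to come from your own estimate of a request's price.

```python
import time


class SpendLimiter:
    """Token bucket measured in dollars: refills at a fixed rate up to a cap."""

    def __init__(self, max_dollars, refill_per_second, clock=time.monotonic):
        self.capacity = max_dollars
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.available = max_dollars
        self.last_refill = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last_refill
        self.available = min(self.capacity, self.available + elapsed * self.refill_per_second)
        self.last_refill = now

    def try_spend(self, estimated_cost):
        """Return True & deduct the cost if budget remains, else return False."""
        self._refill()
        if estimated_cost > self.available:
            return False
        self.available -= estimated_cost
        return True
```

A caller checks `try_spend` before each model request & skips or queues the request when it returns False, so a looping agent exhausts its budget instead of the credit card.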

Information retrieval (RAG - retrieval augmented generation) is essential for any production system. In my email app, I use a LanceDB vector database to find all emails from a particular sender & match their tone.
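The retrieval step can be sketched without a database. The example below is a stand-in that assumes emails are already embedded : it filters by sender, then ranks by cosine similarity in plain Python. It does not use LanceDB's API. The `INDEX` rows, the toy three-dimensional vectors, & the `retrieve` function are all hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(index, query_vector, sender=None, k=2):
    """Return the k most similar emails, optionally restricted to one sender."""
    candidates = [row for row in index if sender is None or row["sender"] == sender]
    ranked = sorted(
        candidates,
        key=lambda row: cosine(row["vector"], query_vector),
        reverse=True,
    )
    return [row["text"] for row in ranked[:k]]


# Toy 3-dimensional vectors stand in for real embeddings.
INDEX = [
    {"sender": "ana@example.com", "text": "Thanks so much, talk soon!", "vector": [0.9, 0.1, 0.0]},
    {"sender": "ana@example.com", "text": "Attached is the Q2 model.", "vector": [0.1, 0.9, 0.1]},
    {"sender": "bo@example.com", "text": "Cheers, see you Tuesday.", "vector": [0.8, 0.2, 0.1]},
]
```

The retrieved emails go into the prompt as examples of the sender's tone. A vector database does the same filter & rank, but with an index that scales past what a list scan can handle.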

Other knowledge management techniques include graph RAG & specialized vector databases.

More recently, memory has become much more important. The command line interfaces for AI tools save conversation history as markdown files.

When I publish charts, I want the Theory Ventures caption at the bottom right, a particular font, colors, & styles. Those are now all saved within .gemini or .claude files in a series of cascading directories.
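The cascading lookup can be illustrated with a short sketch. It assumes a hypothetical `style.json` file inside each `.claude` directory, which is not the format the actual tools use, & merges every file found between the filesystem root & the working directory so that the nearest one wins.

```python
import json
from pathlib import Path


def load_cascading_settings(start_dir, config_dir=".claude", filename="style.json"):
    """Merge settings from every ancestor directory; nearer directories win."""
    start = Path(start_dir).resolve()
    merged = {}
    # Visit the outermost ancestor first so closer files override it.
    for directory in reversed([start, *start.parents]):
        candidate = directory / config_dir / filename
        if candidate.is_file():
            merged.update(json.loads(candidate.read_text()))
    return merged
```

A caption set once near the top of the tree applies everywhere, while a single project can override just the font without repeating the rest.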

The original simplicity of large language models has been subsumed by enterprise-grade production complexity.

This isn’t identical to the previous generation of machine learning systems, but it follows a clear parallel. What appeared to be a simple “AI magic box” turns out to be an iceberg, with most of the engineering work hidden beneath the surface.