Open Source · Importance: High

Token waste is the new cloud waste for AI costs

r/ycombinator · Jun 10, 2026

Using more AI 'tokens' (the units of text AI reads and writes) than needed is turning into a serious cost problem — much like companies once wasted money on idle cloud servers. As AI usage scales up, the waste compounds fast.

In the early 2010s, companies routinely left cloud servers running when they didn't need to, burning money without realizing it. The same pattern is now emerging with AI: developers send overly long instructions, stuff prompts with irrelevant context, or call AI repeatedly for the same task, all of which burn tokens unnecessarily.

AI agents (programs that automatically carry out multi-step tasks) make this especially bad: they re-read the entire conversation history at every step, so total token use grows much faster than the number of steps. The post argues that techniques like trimming prompts, removing unnecessary context, and caching (reusing previously processed content) can cut costs dramatically, and that tracking token usage should become a standard operational practice, just like monitoring cloud spend.
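To make the agent cost pattern concrete, here is a rough sketch in Python. The function names and the numbers (10 steps, 500 new tokens per step) are illustrative assumptions, not figures from the post, and the sketch assumes no trimming or caching is applied.

```python
def cumulative_input_tokens(steps: int, tokens_per_step: int, system_tokens: int = 0) -> int:
    """Total input tokens when every step re-sends the full history so far."""
    total = 0
    history = system_tokens  # hypothetical fixed instructions sent with every call
    for _ in range(steps):
        history += tokens_per_step  # this step adds new content to the history
        total += history            # the whole history is read again as input
    return total


def read_once_input_tokens(steps: int, tokens_per_step: int, system_tokens: int = 0) -> int:
    """Baseline for comparison: each piece of content is read exactly once."""
    return system_tokens + steps * tokens_per_step


if __name__ == "__main__":
    # Assumed workload: 10 agent steps, each adding 500 tokens of new content.
    print(cumulative_input_tokens(10, 500))
    print(read_once_input_tokens(10, 500))
```

Under these assumptions, the 10-step run reads 27,500 input tokens, versus 5,000 if each step's new content were read only once. Doubling the number of steps roughly quadruples the first figure, which is why long agent runs get expensive quickly.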

Key points

Tokens are the units AI providers bill by: all text sent in and generated out is counted in tokens
AI agents re-read full conversation history each step, causing costs to grow fast
Shorter, focused prompts directly reduce token use and cost
Caching lets you reuse processed content instead of paying to process it again
Monitoring token usage is becoming as important as tracking cloud server costs
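The three remedies in the key points (trimming, caching, and monitoring) can be sketched together in a few lines of Python. This is a minimal illustration, not the post's method: `rough_tokens`, `trim_history`, `MeteredCache`, and `fake_model` are hypothetical names, the whitespace word count is only a stand-in for a real tokenizer, and the cache here is a simple application-side response cache, which differs from any caching an AI provider may offer.

```python
from typing import Callable, Dict, List


def rough_tokens(text: str) -> int:
    # Assumption: whitespace word count as a crude stand-in for a real tokenizer.
    return len(text.split())


def trim_history(messages: List[str], budget: int) -> List[str]:
    """Keep the most recent messages whose combined rough token count fits the budget."""
    kept: List[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = rough_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order


class MeteredCache:
    """Caches responses by exact prompt and tallies the tokens actually sent."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.tokens_sent = 0
        self.cache_hits = 0

    def ask(self, prompt: str) -> str:
        if prompt in self._cache:
            self.cache_hits += 1  # repeated request: nothing is sent again
            return self._cache[prompt]
        self.tokens_sent += rough_tokens(prompt)
        response = self._call_model(prompt)
        self._cache[prompt] = response
        return response


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real AI call, so the sketch runs offline.
    return f"echo: {prompt}"


if __name__ == "__main__":
    history = ["a b c", "d e", "f g h i"]
    print(trim_history(history, budget=6))

    client = MeteredCache(fake_model)
    client.ask("summarize this report")
    client.ask("summarize this report")  # served from the cache
    print(client.tokens_sent, client.cache_hits)
```

The `tokens_sent` and `cache_hits` counters are the monitoring piece: logging them per task gives a usage figure that can be tracked the same way cloud spend is.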

Quick term guide

cloud servers: Powerful computers owned by large companies that run programs over the internet.
server: A computer that runs programs or stores data for other computers to use over a network.
developers: People who build software, apps, or websites.
prompts: Instructions you give to an AI tool.
context: The information an AI uses to understand your request, such as files, notes, and past messages.
AI agents: AI programs that carry out multiple steps toward a goal, inspecting information and deciding what to do next, rather than answering once.
caching: Saving content an AI has already processed, such as a response, so you can reuse it later without paying to process the same request again.
