Letting code automatically build the info you send to an AI model

Instead of hand-picking what background information to include in an AI prompt, you write code that assembles it automatically based on rules. Selecting only what's relevant keeps token costs down and helps AI agents stay on track across multiple steps.

Every time you ask an AI model something, you also feed it background information (documents, memory, API results) so it can give a useful answer. That bundle is called the context. Today, developers often hard-code which pieces go in, which is tedious and wasteful. The idea here is to make context assembly programmable: you define rules or conditions, and the code decides at runtime which sources to pull, how to format them, and how much to include. This matters most for AI agents that run many steps in a row, because each step needs the right slice of context and the total must not grow with every step. Smaller, well-chosen context means lower API costs and fewer confused or off-track responses.
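As a rough illustration of rule-driven assembly, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Source` class, the `assemble_context` function, the three sample sources, and the keyword rules are hypothetical, not part of any particular library or of the original article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    """A hypothetical context source: a rule plus a way to fetch text."""
    name: str
    applies: Callable[[str], bool]  # rule: should this source be used for the query?
    fetch: Callable[[str], str]     # returns the text to include


def assemble_context(query: str, sources: list[Source]) -> str:
    """Evaluate every rule at call time and keep only the matching sources."""
    parts = []
    for src in sources:
        if src.applies(query):
            # Label each piece so the model can tell the sources apart.
            parts.append(f"[{src.name}]\n{src.fetch(query)}")
    return "\n\n".join(parts)


# Made-up sources with simple keyword rules (assumptions for illustration).
SOURCES = [
    Source("billing_docs",
           lambda q: "invoice" in q.lower(),
           lambda q: "Invoices are issued on the 1st of each month."),
    Source("order_api",
           lambda q: "order" in q.lower(),
           lambda q: "Order 1042: shipped."),
    Source("user_memory",
           lambda q: True,  # always included
           lambda q: "User prefers short answers."),
]

context = assemble_context("Where is my order?", SOURCES)
print(context)
```

For this query, only the order lookup and the always-on memory source are included; the billing documents are left out because their rule does not match. In practice the rules could be anything the code can evaluate at runtime, such as the current agent step or a relevance score.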

Key points

  • Context is the background information you send alongside your question to an AI model.
  • Hand-crafting context is slow and often stuffs in more text than needed, raising costs.
  • Programmable assembly lets code pick only the relevant pieces at the moment they're needed.
  • Trimming unnecessary context directly cuts token usage and therefore API spend.
  • This approach fits naturally into multi-step AI agents and RAG pipelines.
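The point about trimming can be sketched the same way. The snippet below keeps the highest-scoring pieces that fit inside a token budget. The function names, the relevance scores, and the sample texts are hypothetical, and the "about 4 characters per token" estimate is a crude assumption; a real system would count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return max(1, len(text) // 4)


def trim_to_budget(pieces: list[tuple[float, str]], budget: int) -> list[str]:
    """Keep the highest-scoring pieces whose combined estimate fits the budget."""
    kept = []
    used = 0
    # Visit pieces from most to least relevant.
    for score, text in sorted(pieces, key=lambda p: p[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(text)
            used += cost
    return kept


# (relevance score, text) pairs; the scores are invented for the example.
pieces = [
    (0.9, "Order 1042 shipped on Monday."),
    (0.2, "Company history: founded in 1998, " * 10),
    (0.7, "User prefers short answers."),
]

selected = trim_to_budget(pieces, budget=20)
print(selected)
```

Here the long, low-relevance company history is dropped because it would exceed the budget, while the two short, relevant pieces are kept. A greedy pass like this is only one possible policy; summarizing or truncating oversized pieces are alternatives.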

Quick term guide

token costs
The fees (or usage quota) charged for the text an AI model reads as input and writes as output.
AI agents
AI agents are AI tools that can carry out steps toward a goal, not just answer once.
developers
Developers are people who build software, apps, or websites.
API costs
Fees paid when software calls an online service programmatically.
responses
The answers an AI model produces in reply to a prompt and its context.
RAG pipeline
A retrieval-augmented generation pipeline: the full process of splitting documents into chunks, converting them to embeddings, storing them, and searching them at query time so the best matches can be added to the model's context.
pipeline
An automated sequence of steps that processes or moves data without manual intervention.