Open Source · Importance: Medium

Match the AI model to the task type to cut costs without losing quality

r/MachineLearning · Jun 11, 2026

A small experiment shows that routing tasks to different AI models based on whether the answer can be verified leads to better cost efficiency. The test covered 120 tasks across 3 models, inspired by a framework from AI researcher Andrej Karpathy.

The framework centers on a concept called verifiability. Some tasks have a clear right or wrong answer, such as math problems or code tests, while others are open-ended, such as writing an essay or giving strategic advice. For verifiable tasks, you can use a cheaper model and simply retry when its answer fails the check. For non-verifiable tasks, you need a stronger model from the start because there is no easy way to catch a bad answer.
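The retry idea for verifiable tasks can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the experiment: `solve_with_retries`, `flaky_model`, and `check_sum` are hypothetical names, and the "model" is a toy stand-in that answers wrongly on its first attempt.

```python
from typing import Callable, Optional


def solve_with_retries(
    task: str,
    cheap_model: Callable[[str], str],
    verify: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call the cheap model until an answer passes the check, or give up."""
    for _ in range(max_attempts):
        answer = cheap_model(task)
        if verify(task, answer):
            return answer
    return None  # the caller may escalate to a stronger model


# Toy stand-in (hypothetical): wrong on the first try, right on the second.
_canned_answers = iter(["5", "4"])


def flaky_model(task: str) -> str:
    return next(_canned_answers)


def check_sum(task: str, answer: str) -> bool:
    # The task is an addition string such as "2+2", so the check is exact.
    left, right = task.split("+")
    return int(left) + int(right) == int(answer)


result = solve_with_retries("2+2", flaky_model, check_sum)
print(result)  # prints 4
```

The loop only works because `verify` is cheap and trustworthy. For an essay or a strategy memo there is no such check, which is why those tasks go to the stronger model directly.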

The experiment first classified each task into one of these two categories, then automatically sent it to the appropriate model. Across 120 sample tasks and 3 models, this routing approach was more cost-effective than always using the most powerful model. The practical takeaway for AI agent builders: assign each step of your pipeline to the cheapest model that can reliably handle it, and reserve expensive models only where judgment truly matters.
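A classify-then-route step along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the prices are arbitrary units, the task types are hand-labelled rather than classified by a model, and retry costs are ignored. It does not reproduce the experiment's tasks or numbers.

```python
from dataclasses import dataclass

# Hypothetical per-call prices in arbitrary units; not from the experiment.
PRICES = {"cheap": 1.0, "strong": 10.0}

# Assumed set of task types whose answers can be checked automatically.
VERIFIABLE_TYPES = {"math", "code", "classification"}


@dataclass
class Task:
    kind: str
    prompt: str


def is_verifiable(task: Task) -> bool:
    return task.kind in VERIFIABLE_TYPES


def route(task: Task) -> str:
    """Send verifiable tasks to the cheap model, everything else to the strong one."""
    return "cheap" if is_verifiable(task) else "strong"


tasks = [
    Task("math", "What is 17 * 23?"),
    Task("code", "Write a function that reverses a string."),
    Task("writing", "Draft an essay on remote work."),
    Task("strategy", "Should we enter a new market?"),
]

routed_cost = sum(PRICES[route(t)] for t in tasks)
always_strong_cost = PRICES["strong"] * len(tasks)
print(routed_cost, always_strong_cost)  # prints 22.0 40.0
```

In a real pipeline the savings depend on how often the cheap model needs a retry, so the routed cost here is a lower bound under the stated prices.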

Key points

Decide which AI model to use based on whether the task has a checkable right answer
Easy-to-verify tasks (math, code, classification) → use a cheaper model; retry if wrong
Open-ended tasks (writing, strategy) → use a stronger model from the start
A small-scale test (120 tasks, 3 models) found cost savings with this approach
Useful for multi-step AI agents: assign each step to the most cost-efficient model that fits
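For multi-step agents, "the most cost-efficient model that fits" can be read as the cheapest model whose measured pass rate on that step clears a chosen bar. The sketch below assumes you already have such pass-rate estimates. The model names, costs, rates, and the 0.95 threshold are all made up for illustration.

```python
from typing import Optional

# Hypothetical models: (name, cost per call, estimated pass rate per step type).
MODELS = [
    ("small", 1.0, {"extract": 0.97, "plan": 0.60}),
    ("medium", 4.0, {"extract": 0.99, "plan": 0.85}),
    ("large", 10.0, {"extract": 0.99, "plan": 0.96}),
]


def cheapest_reliable(step: str, min_pass_rate: float = 0.95) -> Optional[str]:
    """Return the lowest-cost model whose estimated pass rate meets the bar."""
    candidates = [
        (cost, name)
        for name, cost, rates in MODELS
        if rates.get(step, 0.0) >= min_pass_rate
    ]
    return min(candidates)[1] if candidates else None


assignment = {step: cheapest_reliable(step) for step in ["extract", "plan"]}
print(assignment)  # prints {'extract': 'small', 'plan': 'large'}
```

A step with no model above the bar returns `None`, which is a signal to add a verification step or a human review rather than to pick a model anyway.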

Quick term guide

AI model: A program that can understand prompts and produce text, code, or answers; it is the underlying engine that powers an AI tool.
framework: Here, a structured way of thinking about a problem, not a software toolkit.
verifiability: How easily you can check whether an AI's answer is correct or not.
AI agent: An AI program that can carry out steps toward a goal, not just answer once.
pipeline: An automated sequence of steps that processes or moves data without manual intervention.
