
Rayline routes Claude Code side tasks to cheaper models
Rayline was shared on Hacker News as an LLM gateway for Claude Code. It intercepts Claude Code subagent calls and sends them to different models. According to the post, a user could keep the main agent on Opus while sending smaller tasks to cheaper cloud models or on-device models.
Key points
- Rayline is described as a Claude Code compatible LLM gateway.
- It can route Claude Code subagent calls to different models.
- The main agent can stay on Opus while smaller tasks use cheaper or on-device models.
- The builder says routing can be set by rules instead of making the agent decide each time.
- It is meant to fit into the normal Claude Code workflow without requiring a separate setup.
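To make the idea of rule-based routing concrete, here is a minimal sketch in Python. It is not Rayline's actual configuration format or API, which the post does not describe. The model names, the `Rule` fields, and the `pick_model` function are all hypothetical and chosen only for illustration. The sketch assumes each request carries a flag saying whether it came from a subagent, and it picks a model from a fixed rule list so the agent never has to decide.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model identifiers; none of these come from Rayline.
MAIN_MODEL = "opus"
CHEAP_CLOUD_MODEL = "cheap-cloud-model"
LOCAL_MODEL = "on-device-model"


@dataclass(frozen=True)
class Rule:
    """Send a request to `model` when every condition that is set matches."""
    model: str
    is_subagent: Optional[bool] = None       # None means "don't care"
    max_prompt_chars: Optional[int] = None   # None means "no size limit"

    def matches(self, is_subagent: bool, prompt: str) -> bool:
        if self.is_subagent is not None and self.is_subagent != is_subagent:
            return False
        if self.max_prompt_chars is not None and len(prompt) > self.max_prompt_chars:
            return False
        return True


# Rules are checked in order; the first match wins.
RULES = [
    # Very small subagent tasks go to the on-device model.
    Rule(model=LOCAL_MODEL, is_subagent=True, max_prompt_chars=200),
    # All other subagent tasks go to the cheaper cloud model.
    Rule(model=CHEAP_CLOUD_MODEL, is_subagent=True),
]


def pick_model(is_subagent: bool, prompt: str) -> str:
    """Return the model for a request; the main agent falls through to Opus."""
    for rule in RULES:
        if rule.matches(is_subagent, prompt):
            return rule.model
    return MAIN_MODEL
```

In this sketch the routing decision lives entirely in the `RULES` list, so changing where side tasks go means editing the list rather than changing how the agent is prompted.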
Quick term guide
- LLM gateway
- A middleware layer that sits between your app and multiple AI model providers, so you only need to connect to one place.
- subagent
- A separate agent instance that the main agent hands one specific task to. Several subagents can run at the same time, which allows parallel work.
- cloud models
- AI models that run on a company’s remote servers instead of your own machine.
- on-device
- Running an AI model directly on your phone or computer instead of sending data to a remote server.
- sessions
- Separate work threads or task runs inside a tool.
- local model
- An AI model you run directly on your own computer, with no internet connection or external service needed.
- compatible
- Able to work with another tool or system without changes or mismatches.