A question about splitting LLM work across personal devices
The writer says they have several devices, including a MacBook, PC, and iPhone, and want to distribute LLM inference across them. They mention two routers, vllmAthena and llmproxy, and ask which one might fit their zero trust infrastructure. They also ask for an open source option like Tailscale that does not require sending all keys to the vendor.
Key points
- The writer wants to split LLM inference across a MacBook, PC, and iPhone.
- They are comparing vllmAthena and llmproxy as possible routers.
- They describe vllmAthena as using small models to choose routing paths.
- They want the setup to fit a zero trust infrastructure.
- They also want an open source Tailscale-like option that does not require trusting a vendor with all keys.
Quick term guide
- LLM inference
- The process where an already trained AI model reads input and generates an answer.
- zero trust infrastructure
- A security model that verifies every access request, every time, instead of trusting a device or user just because it is already inside the network.
- infrastructure
- The technical systems that keep a website or app running.
- open source
- Software whose code is available for people to view and often modify.
- Tailscale
- A mesh VPN built on WireGuard that lets your devices reach each other securely over the internet, as if they were on the same local network.
- architecture
- The overall structure and organization of a software project.
- AI agents
- AI tools that can carry out a series of steps toward a goal, not just answer once.
- model routing
- The practice of sending each task to a different LLM based on the task's complexity and the model's cost.
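
To make the idea of model routing concrete, here is a minimal Python sketch. It is not how vllmAthena or llmproxy work: the writer describes vllmAthena as using small models to pick a route, while this sketch swaps in a crude keyword and length heuristic. The device names, complexity thresholds, and cost numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str            # hypothetical label for a device and its model
    max_complexity: int  # highest complexity score this device should take
    cost: float          # relative cost per request (made-up units)


# Hypothetical backends loosely matching the devices in the question.
BACKENDS = [
    Backend("iphone-small-model", max_complexity=20, cost=1.0),
    Backend("macbook-medium-model", max_complexity=60, cost=3.0),
    Backend("pc-large-model", max_complexity=1000, cost=10.0),
]

# Words that, in this toy heuristic, hint at a harder task.
HARD_KEYWORDS = {"prove", "refactor", "debug", "derive"}


def estimate_complexity(prompt: str) -> int:
    """Crude stand-in for a real classifier: word count plus a keyword bonus."""
    words = prompt.split()
    score = len(words)
    if any(w.lower().strip(".,?!") in HARD_KEYWORDS for w in words):
        score += 50
    return score


def route(prompt: str) -> Backend:
    """Pick the cheapest backend whose threshold covers the estimated score."""
    score = estimate_complexity(prompt)
    candidates = [b for b in BACKENDS if score <= b.max_complexity]
    if not candidates:
        # Nothing is rated for this score, so fall back to the most capable.
        return max(BACKENDS, key=lambda b: b.max_complexity)
    return min(candidates, key=lambda b: b.cost)


if __name__ == "__main__":
    for prompt in ["What time is it?", "Please debug this function for me"]:
        print(prompt, "->", route(prompt).name)
```

Under these assumptions, a short question stays on the cheapest device, and anything the heuristic scores as harder moves to a more capable one. A real router would also need to handle authentication between devices, which is where the zero trust requirement comes in.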