LeanContext grows from a VS Code plugin into an MCP server, cutting 4k+ tokens per prompt

LeanContext, a tool that trims down the code context sent to AI assistants, has expanded beyond VS Code into a full MCP server. It now works with any AI tool that supports MCP, and its author reports savings of over 4,000 tokens per prompt, which lowers both cost and response time.

When you use an AI coding assistant like Claude or Cursor, it reads your code files to understand what you're working on. The more files it reads, the more tokens it uses, and tokens cost money and slow down responses. LeanContext tackles this by compressing the context: it sends the AI only what it needs for the current task rather than dumping entire files.
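To make the idea concrete, here is a minimal sketch of relevance-based context trimming. It is not LeanContext's actual algorithm, which the source does not describe. The function names, the keyword-overlap scoring, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
import re

def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)

def words(text: str) -> set[str]:
    # Lowercased identifiers and words found in the text.
    return set(re.findall(r"[a-z_]+", text.lower()))

def select_context(task: str, snippets: dict[str, str], budget: int) -> list[str]:
    """Return the names of snippets that share words with the task,
    most relevant first, without exceeding the token budget."""
    task_words = words(task)

    def overlap(name: str) -> int:
        return len(task_words & words(snippets[name]))

    chosen, used = [], 0
    for name in sorted(snippets, key=overlap, reverse=True):
        cost = estimate_tokens(snippets[name])
        # Skip snippets that are irrelevant or would blow the budget.
        if overlap(name) == 0 or used + cost > budget:
            continue
        chosen.append(name)
        used += cost
    return chosen

# Hypothetical project snippets, keyed by a made-up module.function name.
snippets = {
    "auth.login": "def login(user, password): return check_password(user, password)",
    "billing.invoice": "def make_invoice(order): return render_pdf(order)",
    "auth.check_password": "def check_password(user, password): return hash(password) == user.hash",
}
selected = select_context("fix the login password check", snippets, budget=50)
print(selected)
```

In this toy run the billing snippet shares no words with the task, so it is left out of the prompt. A real tool would likely use something stronger than keyword overlap, such as symbol graphs or embeddings.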

Previously, this only worked inside VS Code. LeanContext now runs as an MCP server, so Claude Desktop, Cursor, Windsurf, and any other tool that supports MCP can plug into it. The author claims savings of 4,000+ tokens per prompt, which adds up fast if you're calling the AI dozens of times a day. For solo developers paying for API access, this could be a practical way to cut monthly bills without changing your workflow.
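A quick back-of-the-envelope calculation shows how the claimed savings could add up. Only the 4,000-token figure comes from the author. The prompt count, the number of working days, and the price per million input tokens are assumptions, so substitute your own usage and your provider's current pricing.

```python
TOKENS_SAVED_PER_PROMPT = 4_000         # figure claimed by the author
PROMPTS_PER_DAY = 40                    # assumption: "dozens of times a day"
WORKDAYS_PER_MONTH = 22                 # assumption
PRICE_PER_MILLION_INPUT_TOKENS = 3.00   # assumption, in USD

def monthly_savings(tokens_per_prompt, prompts_per_day, days, price_per_million):
    """Return (tokens saved per month, estimated dollars saved per month)."""
    tokens = tokens_per_prompt * prompts_per_day * days
    dollars = tokens / 1_000_000 * price_per_million
    return tokens, dollars

tokens, dollars = monthly_savings(
    TOKENS_SAVED_PER_PROMPT,
    PROMPTS_PER_DAY,
    WORKDAYS_PER_MONTH,
    PRICE_PER_MILLION_INPUT_TOKENS,
)
print(f"{tokens:,} tokens saved per month, about ${dollars:.2f}")
```

Under these assumptions the savings come to roughly 3.5 million input tokens a month. The dollar amount scales linearly with both call volume and token price, so heavy users and pricier models see the largest effect.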

Key points

  • LeanContext expanded from a VS Code-only plugin to an MCP server that works with many AI tools
  • Claimed savings of 4,000+ tokens per prompt, which would reduce both API cost and response latency
  • Compatible with Claude Desktop, Cursor, Windsurf, and any MCP-supporting tool
  • Works by sending the AI only the relevant context instead of full file contents
  • High-frequency API users will feel the most benefit in monthly cost savings

Quick term guide

AI assistant
A software tool that uses artificial intelligence to answer questions or help with tasks.
MCP server
A server that implements the Model Context Protocol (MCP), a standard way for AI tools to connect to outside tools and data sources.
responses
The answers an AI assistant sends back after receiving a prompt.
Claude Desktop
Anthropic's computer app for using Claude outside the browser.
developers
People who build software, apps, or websites.
workflow
A repeatable set of steps for getting a task done.
AI tools
Software that can help create text, code, images, or other work.
latency
The total time you wait from sending a request to getting a complete response.