
Why AI coding tool claims need better measurements
The article says Google, Anthropic, OpenAI, and Cursor often promote how much of their code is written by AI. The writer argues that these numbers do not show whether teams shipped better products, moved faster, or had fewer problems. It cites several studies and says the evidence on whether AI coding tools improve productivity is mixed and still evolving. The article concludes that developers should use AI tools but measure results rather than code volume.
Key points
- The article says AI-written code percentage is a volume number, not a direct measure of value.
- It contrasts today’s claims with earlier GitHub Copilot research that measured task completion speed.
- It says research on AI coding productivity is mixed and keeps changing.
- The writer supports daily AI use, but says teams should measure outcomes with measures such as DORA metrics.
- The article suggests asking whether an AI claim measures an outcome or just volume.
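To make the volume-versus-outcome distinction concrete, here is a minimal Python sketch of the four measures usually grouped under DORA metrics: deployment frequency, lead time for changes, change failure rate, and time to restore service. The `Deployment` record, the `dora_summary` function, and the sample data are hypothetical and do not come from the article; a real team would pull these timestamps from its own deployment and incident records.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deployments, period_days):
    """Summarize outcome measures for one reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Lead time for changes: commit to production, in hours.
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    # Time to restore: failed deployment to recovery, in hours.
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        "median_restore_hours": median(restore_hours) if restore_hours else None,
    }


# Invented sample data: four deployments in one week, one of which failed.
sample = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 13)),
    Deployment(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 15),
               failed=True, restored_at=datetime(2024, 1, 2, 17)),
    Deployment(datetime(2024, 1, 3, 10), datetime(2024, 1, 3, 12)),
    Deployment(datetime(2024, 1, 5, 8), datetime(2024, 1, 5, 16)),
]
summary = dora_summary(sample, period_days=7)
print(summary)
```

None of these numbers depends on how many lines of code were written or who wrote them. Comparing the same summary before and after a team adopts an AI tool is one way to ask whether outcomes changed, although such a comparison would not by itself show that the tool caused the change.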
Quick term guide
- AI coding tools: Software such as Claude, Cursor, or ChatGPT that uses AI to help write, edit, or explain code, often from a plain-language description of what you want.
- Developers: People who build software, apps, or websites.
- AI tools: Software that can help create text, code, images, or other work.
- Solo developer: An individual who handles every part of creating a project or product alone.
- AI-written code: Program code produced by an AI tool such as ChatGPT, Claude, Gemini, or Cursor.
- GitHub Copilot: A popular tool that uses artificial intelligence to help programmers write code.
- DORA metrics: A common set of measures for how fast and safely software teams deliver changes.