Report says only 11% of production AI agents passed security tests
The AIRQ 2026 Q2 report says it assessed 100 production AI agents. According to the Reddit post about it, only 11% passed the report's security thresholds, and 98% combined private data access, untrusted content input, and outbound actions. The poster says they are the author and will answer questions in the comments.
Key points
- The report says it tested 100 production AI agents.
- The post says only 11% passed the report's security thresholds.
- The post says 98% combined private data access, untrusted content input, and outbound actions.
- Coding agents ranked high in capability but low in defense, according to the post.
- The post says 83% of claimed defenses had no independent verification.
Quick term guide
- production: The live version of a service that real users use.
- AI agents: AI tools that can carry out steps toward a goal, not just answer once.
- outbound actions: Actions an AI system can take outside itself, such as sending data or using another service.
- automation: A way to make repeated work happen without doing every step by hand.
- benchmark: A test used to compare speed, quality, or cost.
- independent verification: A check done by someone other than the maker or seller.
- coding agents: AI tools that write, edit, or fix code from a person's instructions, sometimes working autonomously.