Open-source tool released for testing and tracing AI agents
An engineer at Future AGI posted on Reddit that the company has open-sourced a platform for evaluating and observing AI agent apps. The post says the platform uses the Apache-2.0 license and is self-hostable. It also says the platform can trace runs in frameworks such as LangChain and LlamaIndex, and can check for issues such as factual inaccuracy, exposed private data, toxic content, jailbreaks, and prompt-injection attacks.
Key points
- Future AGI says it released an AI agent evaluation and tracing platform on GitHub.
- The post says the project uses the Apache-2.0 license and is self-hostable.
- It says tracing is built on OpenTelemetry and can auto-instrument LangChain and LlamaIndex.
- According to the post, the evaluation SDK includes checks for accuracy, groundedness, toxicity, PII (personally identifiable information), jailbreaks, and prompt-injection attacks.
- Some deterministic checks can run locally with no network calls, according to the post.
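To make the last point concrete, here is a minimal sketch of what a deterministic, local-only check could look like. It is not the Future AGI SDK: the function names `find_pii` and `passes_pii_check` and the two regex patterns are hypothetical, chosen only for illustration, and real PII detection needs much broader coverage.

```python
import re

# Hypothetical patterns for illustration only; not taken from any real SDK.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def find_pii(text: str) -> dict[str, list[str]]:
    """Return the matches found per category.

    Pure function: no network calls and no randomness, so the same
    input always produces the same output.
    """
    return {
        "email": EMAIL_RE.findall(text),
        "phone": US_PHONE_RE.findall(text),
    }


def passes_pii_check(text: str) -> bool:
    """Return True when no category has any match."""
    return not any(find_pii(text).values())
```

Because the check is plain pattern matching, it can run on a laptop or in a test suite without sending model output to any outside service.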
Quick term guide
- open-sourced
- The code has been made public so others can inspect, use, or contribute to it.
- open-source
- Software whose code is shared publicly so others can inspect, use, or change it.
- self-hostable
- Software you can install and run on your own server instead of using someone else's cloud service.
- prompt-injection
- An attack where text read by the AI includes hidden instructions meant to control it.
- visibility
- How easily people can discover and notice a product.
- OpenTelemetry
- A widely used open standard and toolkit for collecting traces, metrics, and logs from apps and servers.
- instrument
- Set up a system so it records useful data about how it is running.
- deterministic
- Giving the same result every time when the input is the same.
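As a rough illustration of the term "instrument" from the guide above, the sketch below wraps a function so that every call records its name, duration, and outcome. It uses only the Python standard library and is not OpenTelemetry's API or the platform's code; `traced`, `SPANS`, and `lookup_tool` are hypothetical names invented for this example.

```python
import functools
import time

# Collected call records; a real tracer would export these to a backend.
SPANS: list[dict] = []


def traced(func):
    """Wrap func so each call records its name, duration, and outcome."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs whether the call succeeded or raised.
            SPANS.append({
                "name": func.__name__,
                "duration_s": time.perf_counter() - start,
                "status": status,
            })
    return wrapper


@traced
def lookup_tool(query: str) -> str:
    # Hypothetical agent step standing in for a real tool call.
    return query.upper()
```

Auto-instrumentation, as described in the post, would apply this kind of wrapping to a framework's functions automatically, so the developer does not have to add a decorator by hand.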