Open-source tool released for testing and tracing AI agents

An engineer at Future AGI posted on Reddit that the company has open-sourced a platform for evaluating and observing AI agent applications. The post says the platform is released under the Apache-2.0 license and is self-hostable. It also says the platform can trace runs in frameworks such as LangChain and LlamaIndex, and can check for issues such as factual accuracy, private data, toxic content, jailbreaks, and prompt-injection attacks.

Key points

  • Future AGI says it released an AI agent evaluation and tracing platform on GitHub.
  • The post says the project uses the Apache-2.0 license and is self-hostable.
  • It says tracing is built on OpenTelemetry and can auto-instrument LangChain and LlamaIndex.
  • The evaluation SDK includes checks for accuracy, groundedness, toxicity, PII, jailbreaks, and prompt-injection attacks.
  • Some deterministic checks can run locally with no network calls, according to the post.
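To make the last point concrete, here is a minimal Python sketch of what a deterministic, local check could look like. It is not the Future AGI SDK: the function names and regular expressions are hypothetical, chosen only to illustrate a check that uses pattern matching alone, so it needs no network calls and returns the same result for the same input.

```python
import re

# Hypothetical patterns for illustration only; not taken from the Future AGI SDK.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_pii(text: str) -> dict[str, list[str]]:
    """Return PII-like matches grouped by kind, using regexes only (no network calls)."""
    return {
        "email": EMAIL_RE.findall(text),
        "us_ssn": US_SSN_RE.findall(text),
    }


def passes_pii_check(text: str) -> bool:
    """Return True when no pattern matches; the same input always gives the same result."""
    return not any(find_pii(text).values())
```

A check like this would typically run on each agent output before it is logged or returned. Real PII detection covers many more formats than the two patterns assumed here.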

Quick term guide

open-sourced
The code has been made public so others can inspect, use, or contribute to it.
open-source
Software whose code is shared publicly so others can inspect, use, or change it.
self-hostable
Software you can install and run on your own server instead of using someone else's cloud service.
prompt-injection
An attack where text read by the AI includes hidden instructions meant to control it.
visibility
How easily people can discover and notice a product.
OpenTelemetry
An open-source standard and toolkit for collecting traces, metrics, and logs from apps and servers.
instrument
Set up a system so it records useful data about how it is running.
deterministic
Giving the same result every time when the input is the same.