
Agenta
Open-source LLMOps platform for prompts, evals, and observability

Agenta is an open-source LLMOps platform for building and operating production-grade LLM applications. It centralizes prompt work, evaluation workflows, and runtime traces so teams can iterate safely and measure quality over time.
Key Features
- Interactive prompt playground to compare prompts and models side-by-side on real test cases
- Prompt and configuration versioning with environments/branching to control changes
- Testset management (including CSV import and capturing production cases) for repeatable experiments
- Automated and human evaluation workflows, including LLM-as-judge and custom evaluators
- Production observability with tracing, latency/usage/cost tracking, and debugging via detailed traces
- Open standards support for tracing via OpenTelemetry-compatible instrumentation
- UI and API parity, so domain experts can work in the UI while engineers drive the same workflows through the API
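
To make the testset and custom-evaluator features above more concrete, here is a minimal offline sketch of the general idea: score two prompt variants against the same CSV testset and compare the averages. It does not use Agenta's SDK or API. The column names, `fake_model`, `exact_match`, and `evaluate` are all hypothetical names introduced for illustration, and the model call is a canned stand-in rather than a real LLM.

```python
import csv
import io

# Hypothetical testset in CSV form; the column names are illustrative only.
TESTSET_CSV = """question,expected
What is 2 + 2?,4
What is the capital of France?,Paris
"""


def load_testset(text):
    """Parse CSV text into a list of {'question', 'expected'} rows."""
    return list(csv.DictReader(io.StringIO(text)))


def fake_model(prompt):
    """Stand-in for an LLM call so the sketch runs without network access."""
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "paris",
    }
    for question, answer in canned.items():
        if question in prompt:
            return answer
    return ""  # the prompt did not contain a known question


def exact_match(output, expected):
    """Custom evaluator: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(template, testset, model, evaluator):
    """Run every test case through one prompt variant and average the scores."""
    scores = [
        evaluator(model(template.format(**row)), row["expected"])
        for row in testset
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    testset = load_testset(TESTSET_CSV)
    variants = {
        "with_question": "Answer briefly: {question}",
        "missing_placeholder": "Answer briefly.",  # deliberately broken variant
    }
    for name, template in variants.items():
        print(name, evaluate(template, testset, fake_model, exact_match))
```

Because every variant is scored on the same fixed testset, a broken prompt change (here, a template that drops the question placeholder) shows up as a lower average score rather than going unnoticed.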
Use Cases
- Prompt engineering and regression testing before shipping changes to production
- Evaluating agents and RAG pipelines with automated metrics plus expert review
- Debugging and monitoring production LLM apps to detect failures and performance regressions
Agenta fits teams that need a single source of truth for prompts, evaluations, and traces, combining experimentation and operational monitoring in one platform. It helps reduce trial-and-error iterations by making changes measurable and auditable.
Tech Stack: OpenTelemetry, Docker, TypeScript, Python
Similar Services

Langfuse
Open-source platform for LLM observability, evals, and prompt management
Langfuse is an open-source LLM engineering platform for tracing, metrics, evaluations, datasets, and prompt management to debug and improve AI applications.

Opik
LLM observability and evaluation platform for traces, tests, and dashboards
Opik is an open-source platform to trace, evaluate, and monitor LLM apps, RAG pipelines, and agent workflows with automated evaluations and production dashboards.

BirdNET-Analyzer
Machine-learning tool for analyzing bird vocalizations in audio
Open-source BirdNET toolkit to batch-process audio recordings and identify bird species from their vocalizations using machine learning models.