MLflow Integration
Export agent traces with human annotations from aimux to MLflow for evaluation, regression detection, and judge calibration.
Prerequisites
- aimux installed
- MLflow 3.6+ (pip install "mlflow>=3.6"; the quotes stop the shell from treating >= as a redirect)
- Python 3.10+
Quick Start
1. Start MLflow

    mlflow server --host 127.0.0.1 --port 5000

2. Configure aimux
Add the MLflow endpoint to ~/.aimux/config.yaml:

    export:
      endpoint: "localhost:5000"
      insecure: true

3. Annotate and Export
In aimux, open a trace and annotate turns (g for good, b for bad, w for waste). Add notes with n. Then export with :export-otel.
4. View in MLflow
Open http://localhost:5000. Navigate to your experiment’s Traces tab to see session spans with turn-level attributes including token counts, cost, model, and your annotations.
Span Attribute Reference
| Attribute | Type | Description |
|---|---|---|
| aimux.session_id | string | Session identifier |
| aimux.provider | string | Provider name (claude, codex, gemini) |
| aimux.turn.number | int | Turn number within the session |
| aimux.feedback.value | string | Human annotation: good, bad, wasteful |
| aimux.feedback.rationale | string | Free-text note explaining the label |
| gen_ai.request.model | string | Model used for this turn |
| gen_ai.usage.input_tokens | int64 | Input token count |
| gen_ai.usage.output_tokens | int64 | Output token count |
| gen_ai.usage.cost | float64 | Estimated cost in USD |
| tool.name | string | Tool/function name (on tool call spans) |
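
To illustrate how these attributes might be consumed downstream, here is a minimal Python sketch that tallies annotation labels and sums cost per model. It assumes each turn span has already been loaded as a plain dict keyed by the attribute names in the table above. The sample spans, the model names, and the summarize helper are hypothetical and not part of aimux or MLflow.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: one dict of attributes per turn span,
# keyed by the attribute names from the reference table.
spans = [
    {
        "aimux.session_id": "s1",
        "aimux.turn.number": 1,
        "aimux.feedback.value": "good",
        "gen_ai.request.model": "model-a",
        "gen_ai.usage.cost": 0.25,
    },
    {
        "aimux.session_id": "s1",
        "aimux.turn.number": 2,
        "aimux.feedback.value": "bad",
        "aimux.feedback.rationale": "Ignored the failing test",
        "gen_ai.request.model": "model-a",
        "gen_ai.usage.cost": 0.5,
    },
    {
        # A turn that was never annotated has no feedback attributes.
        "aimux.session_id": "s1",
        "aimux.turn.number": 3,
        "gen_ai.request.model": "model-b",
        "gen_ai.usage.cost": 0.125,
    },
]


def summarize(spans):
    """Count annotation labels and total estimated cost per model."""
    labels = Counter()
    cost_by_model = defaultdict(float)
    for attrs in spans:
        label = attrs.get("aimux.feedback.value")
        if label is not None:
            labels[label] += 1
        model = attrs.get("gen_ai.request.model", "unknown")
        cost_by_model[model] += attrs.get("gen_ai.usage.cost", 0.0)
    return labels, dict(cost_by_model)


labels, cost_by_model = summarize(spans)
print(labels)
print(cost_by_model)
```

Turns without an annotation are skipped in the label count but still contribute to cost, which keeps the cost totals complete even when only part of a session has been labeled.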
Other OTEL Backends
The same export works with any OTLP/HTTP-compatible backend, such as Jaeger, Grafana Tempo, or Datadog via an OpenTelemetry Collector. To switch backends, point export.endpoint in ~/.aimux/config.yaml at the new target.
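
When pointing the exporter at a different backend, it can help to know which URL the traces are likely to be sent to. The Python sketch below derives one from the two config values. It rests on two assumptions that this page does not confirm: that insecure: true selects plain HTTP, and that traces go to /v1/traces, the default OTLP/HTTP path. The otlp_traces_url helper and the example hostname are hypothetical, and aimux may build the URL differently.

```python
def otlp_traces_url(endpoint: str, insecure: bool) -> str:
    """Build a likely OTLP/HTTP traces URL from a host:port endpoint.

    Assumptions (not confirmed by the aimux docs): plain HTTP is used
    when insecure is true, and the path is the OTLP/HTTP default.
    """
    scheme = "http" if insecure else "https"
    return f"{scheme}://{endpoint.rstrip('/')}/v1/traces"


# Values taken from the Quick Start config
print(otlp_traces_url("localhost:5000", insecure=True))

# Hypothetical TLS-enabled collector
print(otlp_traces_url("collector.example.com:4318", insecure=False))
```

Requesting the resulting URL with a tool such as curl is a quick way to check that the backend is reachable before exporting.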