
MLflow Integration

Export agent traces with human annotations from aimux to MLflow for evaluation, regression detection, and judge calibration.

Prerequisites:

  • aimux installed
  • MLflow 3.6+ (pip install "mlflow>=3.6" — quote the spec so the shell doesn't treat > as a redirect)
  • Python 3.10+
Start a local MLflow tracking server:

```shell
mlflow server --host 127.0.0.1 --port 5000
```
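Before exporting, it can help to confirm the server is actually reachable. A minimal stdlib sketch, assuming the MLflow tracking server's `/health` endpoint (which returns 200 when the server is up); the `server_ready` helper is illustrative, not part of aimux or MLflow:

```python
import urllib.error
import urllib.request


def server_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if an MLflow tracking server answers its /health endpoint.

    Hypothetical helper: probes base_url + "/health" and treats any
    connection failure or non-200 response as "not ready".
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


# No listener on the discard port, so the probe reports False:
print(server_ready("http://127.0.0.1:9"))  # False
```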

Add the MLflow endpoint to ~/.aimux/config.yaml:

```yaml
export:
  endpoint: "localhost:5000"
  insecure: true
```
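How aimux resolves these two keys internally isn't shown here, but OTLP/HTTP exporters conventionally need a full URL ending in `/v1/traces` (the standard OTLP/HTTP trace path), with the scheme chosen by the TLS setting. A sketch of that resolution, with a hypothetical `otlp_url` helper:

```python
def otlp_url(endpoint: str, insecure: bool) -> str:
    """Build a full OTLP/HTTP traces URL from a host:port and a TLS flag.

    Hypothetical helper; aimux's actual resolution logic may differ.
    "/v1/traces" is the standard OTLP/HTTP path for trace export.
    """
    scheme = "http" if insecure else "https"
    return f"{scheme}://{endpoint}/v1/traces"


print(otlp_url("localhost:5000", insecure=True))  # http://localhost:5000/v1/traces
```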

In aimux, open a trace and annotate turns: press g to mark a turn good, b to mark it bad, and w to mark it wasteful. Add a free-text note with n, then export with :export-otel.
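To make concrete what one annotated turn carries, here is a sketch that assembles the span attributes for a single annotation. The `turn_attributes` helper is hypothetical; the attribute names and feedback values follow the exported-attribute table below:

```python
def turn_attributes(session_id: str, provider: str, turn: int,
                    label: str, note: str) -> dict:
    """Assemble the aimux.* span attributes for one annotated turn.

    Hypothetical illustration built from aimux's documented export schema.
    """
    if label not in ("good", "bad", "wasteful"):
        raise ValueError(f"unknown feedback label: {label}")
    return {
        "aimux.session_id": session_id,
        "aimux.provider": provider,
        "aimux.turn.number": turn,
        "aimux.feedback.value": label,
        "aimux.feedback.rationale": note,
    }


attrs = turn_attributes("s-123", "claude", 4, "wasteful",
                        "re-read the same file three times")
print(attrs["aimux.feedback.value"])  # wasteful
```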

Open http://localhost:5000. Navigate to your experiment’s Traces tab to see session spans with turn-level attributes including token counts, cost, model, and your annotations.

| Attribute | Type | Description |
| --- | --- | --- |
| `aimux.session_id` | string | Session identifier |
| `aimux.provider` | string | Provider name (claude, codex, gemini) |
| `aimux.turn.number` | int | Turn number within the session |
| `aimux.feedback.value` | string | Human annotation: good, bad, wasteful |
| `aimux.feedback.rationale` | string | Free-text note explaining the label |
| `gen_ai.request.model` | string | Model used for this turn |
| `gen_ai.usage.input_tokens` | int64 | Input token count |
| `gen_ai.usage.output_tokens` | int64 | Output token count |
| `gen_ai.usage.cost` | float64 | Estimated cost in USD |
| `tool.name` | string | Tool/function name (on tool call spans) |
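Once exported, these attributes are straightforward to aggregate downstream, for example tallying feedback labels and summing per-turn cost. A minimal sketch over hand-made attribute dicts (the span data below is fabricated for illustration):

```python
from collections import Counter

# Fabricated span attributes for illustration, using the schema above.
spans = [
    {"aimux.feedback.value": "good", "gen_ai.usage.cost": 0.012},
    {"aimux.feedback.value": "bad", "gen_ai.usage.cost": 0.034},
    {"aimux.feedback.value": "good", "gen_ai.usage.cost": 0.008},
]

labels = Counter(s["aimux.feedback.value"] for s in spans)
total_cost = sum(s["gen_ai.usage.cost"] for s in spans)

print(labels["good"], round(total_cost, 3))  # 2 0.054
```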

This works with any OTLP/HTTP-compatible backend (Jaeger, Grafana Tempo, or Datadog via an OTel collector); just change export.endpoint in your config.
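For example, to point the same export at a local Jaeger instance (4318 is Jaeger's default OTLP/HTTP port), the config might look like this, assuming aimux interprets the endpoint the same way for any backend:

```yaml
export:
  endpoint: "localhost:4318"
  insecure: true
```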