Release v1.3.9 — Model evaluation SDK & CLI · roboflow/roboflow-python

Model evaluations SDK & CLI

This release wraps the public /{workspace}/model-evals REST surface so users can read evaluation results (mAP, confidence sweep, per-class performance, confusion matrix, vector clusters, per-image stats, recommendations) from Python and from the CLI without calling the API directly.

SDK

Workspace.evals(project=None, version=None, model=None, status=None, limit=None) — list evals as ModelEval instances pre-populated with metadata from the list response.
Workspace.eval(eval_id) — fetch a single eval (returns a ModelEval with .summary populated when status is done).
ModelEval.refresh() — re-fetch the eval header.
ModelEval.map_results(), .confidence_sweep(), .performance_by_class(split=None), .confusion_matrix(split=None, confidence=None), .vector_analysis(confidence=None), .image_predictions(split=None, confidence=None, limit=None, offset=None), .recommendations() — one method per panel; each returns the raw JSON dict.
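
As a rough illustration of how the SDK methods above fit together, the sketch below fetches one eval and gathers several panels into a single dict. The helper name collect_panels is hypothetical and not part of the SDK; it only calls methods named in these notes. The accepted values for split and the scale of confidence are not stated here, so both are passed through untouched.

```python
from typing import Any, Dict, Optional


def collect_panels(
    workspace: Any,
    eval_id: str,
    split: Optional[str] = None,
    confidence: Optional[float] = None,
) -> Dict[str, Any]:
    """Fetch one eval and gather several panels into one dict.

    `workspace` is expected to behave like a roboflow Workspace, i.e. to
    expose eval(eval_id) returning an object with the panel methods listed
    in the release notes. `collect_panels` itself is a hypothetical helper.
    """
    model_eval = workspace.eval(eval_id)
    return {
        # Each panel method returns the raw JSON dict for that panel.
        "map_results": model_eval.map_results(),
        "confidence_sweep": model_eval.confidence_sweep(),
        "performance_by_class": model_eval.performance_by_class(split=split),
        "confusion_matrix": model_eval.confusion_matrix(
            split=split, confidence=confidence
        ),
        "recommendations": model_eval.recommendations(),
    }
```

With the package installed, a workspace object would typically come from roboflow.Roboflow(api_key=...).workspace(), and the call would look like collect_panels(workspace, "<eval_id>"). Taking the workspace as a parameter keeps the helper easy to exercise with a stub.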

CLI

roboflow eval list [--project P] [--version V] [--model M] [--status S] [--limit N]
roboflow eval get <eval_id>
roboflow eval map-results <eval_id>
roboflow eval confidence-sweep <eval_id>
roboflow eval performance-by-class <eval_id> [--split S]
roboflow eval confusion-matrix <eval_id> [--split S] [--confidence N]
roboflow eval vector-analysis <eval_id> [--confidence N]
roboflow eval image-predictions <eval_id> [--split S] [--confidence N] [--limit N] [--offset N]
roboflow eval recommendations <eval_id>

Exit codes are stable per error class so shell scripts and AI agents can react without parsing message strings: 3 for model_eval_not_found (404), 4 for model_eval_not_done (409), 5 for invalid_split / invalid_confidence (400). Every command supports --json for structured output.
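
One way a script might branch on these exit codes is sketched below. The helper names (classify_exit_code, run_eval_command) and the labels "ok" and "unclassified" are illustrative choices, not part of the CLI. The sketch assumes exit code 0 means success and makes no assumption about the shape of the --json output beyond it being valid JSON.

```python
import json
import subprocess
from typing import Any, List, Optional, Tuple

# Exit codes listed in the release notes, keyed to their error class.
EVAL_EXIT_CODES = {
    3: "model_eval_not_found",         # HTTP 404
    4: "model_eval_not_done",          # HTTP 409
    5: "invalid_split_or_confidence",  # HTTP 400
}


def classify_exit_code(code: int) -> str:
    """Map a CLI exit code to a short label a script can branch on."""
    if code == 0:
        return "ok"  # assumption: 0 is the usual success code
    # Codes other than 3, 4 and 5 are not described in the notes.
    return EVAL_EXIT_CODES.get(code, "unclassified")


def run_eval_command(argv: List[str]) -> Tuple[str, Optional[Any]]:
    """Run a complete command line and return (label, parsed JSON or None)."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    label = classify_exit_code(completed.returncode)
    payload = None
    if label == "ok":
        try:
            payload = json.loads(completed.stdout)
        except json.JSONDecodeError:
            payload = None  # stdout was not JSON, e.g. --json was omitted
    return label, payload
```

A call might look like run_eval_command(["roboflow", "eval", "get", "<eval_id>", "--json"]). Placing --json after the subcommand arguments is an assumption, so the caller supplies the full argument list and can adjust it.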

Low-level (`roboflow.adapters.rfapi`)

list_model_evals, get_model_eval, get_model_eval_map_results, get_model_eval_confidence_sweep, get_model_eval_performance_by_class, get_model_eval_confusion_matrix, get_model_eval_vector_analysis, get_model_eval_image_predictions, get_model_eval_recommendations.
New typed exceptions ModelEvalNotFoundError, ModelEvalNotDoneError, InvalidSplitError, InvalidConfidenceError (all subclasses of RoboflowError) so callers can distinguish "eval doesn't exist" from "eval still running" from "bad argument" without parsing strings.
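
The value of separate exception types shows up in retry logic: "still running" is worth waiting on, while "not found" or "bad argument" is not. The sketch below is a generic polling helper under that assumption. wait_for_panel is a hypothetical name, and the exception class is passed in because these notes do not state the import path of ModelEvalNotDoneError (roboflow.adapters.rfapi is a plausible guess).

```python
import time
from typing import Any, Callable, Type


def wait_for_panel(
    fetch: Callable[[], Any],
    not_done_error: Type[Exception],
    attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until it stops raising not_done_error.

    Only not_done_error triggers a retry. Any other exception (for example
    a not-found or invalid-argument error) propagates on the first call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except not_done_error:
            if attempt == attempts:
                raise  # still not done after the final attempt
            sleep(delay_seconds)
```

Assuming the SDK methods propagate the typed exceptions, usage might look like wait_for_panel(lambda: model_eval.map_results(), ModelEvalNotDoneError). The sleep parameter exists only so the delay can be replaced in tests.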

The endpoints require the model-eval:read scope.

Fixed

rf-detr model upload: accept checkpoints whose args is a plain dict (e.g. EMA checkpoints) when extracting class names, instead of raising TypeError from vars().
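
The failure mode is easy to reproduce: vars() requires an object with a __dict__, so calling it on a plain dict raises TypeError. The sketch below shows one way to handle both shapes. It is not the library's actual code, and the function names and the "class_names" key are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


def checkpoint_args_as_dict(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return checkpoint["args"] as a dict, whatever its original shape."""
    args = checkpoint.get("args")
    if args is None:
        return {}
    if isinstance(args, dict):
        # vars() on a plain dict raises TypeError, so copy it directly.
        return dict(args)
    # Namespace-like objects expose their fields through __dict__.
    return dict(vars(args))


def extract_class_names(
    checkpoint: Dict[str, Any], key: str = "class_names"
) -> Optional[List[str]]:
    """Look up class names under `key` (a hypothetical key name)."""
    names = checkpoint_args_as_dict(checkpoint).get(key)
    return list(names) if names is not None else None
```

Both {"args": argparse.Namespace(class_names=[...])} and {"args": {"class_names": [...]}} yield the same list here, which is the behavior the fix describes.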

Changed

Pin typer<0.26 and declare click explicitly: typer 0.26 vendors its own click and drops the external dependency, which broke the CLI and its type checks.

Full Changelog: v1.3.8...v1.3.9
