Closed-Loop Learning

OpenSRE captures accuracy feedback after every investigation. When you mark a result as partial or inaccurate, the feedback is classified into a triage taxonomy and recorded as a miss. The opensre misses commands let you review trends, track recurrence, and convert top misses into reproducible benchmark scenarios, closing the loop from production usage back into the eval suite.

Quick reference

Command	What it does
`opensre misses list`	Show recent misses with alert, taxonomy, rating, and root cause.
`opensre misses stats`	Taxonomy breakdown plus recurring `(alert, taxonomy)` pairs.
`opensre misses export --out PATH`	Write per-case `alert.json` files the benchmark runner can consume.
`opensre misses convert MISS_ID`	Convert a single miss into a scenario payload (stdout or `--out FILE`).

How a miss is captured

After every investigation the CLI shows the accuracy prompt. If you pick partial or inaccurate you’ll be asked for a short note and a taxonomy bucket:

Retrieval gap — the agent did not fetch the evidence it needed.
Reasoning gap — it had the evidence but drew the wrong conclusion.
Tool failure — a tool errored, timed out, or returned bad data.
Routing/prompt failure — the wrong tools or plan were selected.
Unknown — choose this only when none of the above clearly fit.

The miss is written to ~/.opensre/misses.jsonl and an investigation_miss_classified event is emitted to PostHog with the run provenance, taxonomy, and (when available) user_id / org_id. The original feedback record in ~/.opensre/feedback.jsonl is untouched.
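To make the capture step concrete, here is a minimal sketch of what building and appending a miss record could look like. It is not OpenSRE's implementation. The field names are borrowed from the identifiers listed under Privacy, while `captured_at`, the helper names, and every taxonomy slug other than `retrieval_gap` are assumptions made for illustration.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Assumed slugs: only "retrieval_gap" appears in the CLI examples in this page.
TAXONOMIES = {
    "retrieval_gap",
    "reasoning_gap",
    "tool_failure",
    "routing_prompt_failure",
    "unknown",
}


def build_miss_record(run_id, feedback_id, alert_name, rating, taxonomy,
                      taxonomy_detail="", root_cause=""):
    """Return a dict for one miss (hypothetical schema, for illustration only)."""
    if rating not in {"partial", "inaccurate"}:
        raise ValueError("only partial or inaccurate results are recorded as misses")
    if taxonomy not in TAXONOMIES:
        raise ValueError(f"unknown taxonomy bucket: {taxonomy}")
    return {
        "miss_id": str(uuid.uuid4()),
        "feedback_id": feedback_id,
        "run_id": run_id,
        "alert_name": alert_name,
        "rating": rating,
        "taxonomy": taxonomy,
        "taxonomy_detail": taxonomy_detail,  # free-text note, stays local
        "root_cause": root_cause,            # captured string, stays local
        "captured_at": datetime.now(timezone.utc).isoformat(),  # assumed field
    }


def append_miss(record, path):
    """Append one JSON object per line, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Because the store is append-only JSONL, each miss is one self-contained line, which keeps the file easy to inspect with standard tools and easy to delete.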

Reviewing trends

# Everything captured in the last week
opensre misses stats --since 7d

# Drill into just the retrieval gaps
opensre misses list --since 14d --taxonomy retrieval_gap

# Machine-readable output for dashboards or pipelines
opensre misses stats --since 30d --json

stats reports the count per taxonomy and lists every (alert_name, taxonomy) pair that has been seen more than once. Recurring pairs are the strongest signal that a regression scenario is overdue.
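The aggregation that stats performs can be approximated in a few lines. The sketch below assumes each line of the JSONL store is an object with `alert_name` and `taxonomy` keys, which is an assumption about the on-disk schema rather than a documented contract. The helper names and the sample alert names are made up.

```python
import json
from collections import Counter
from pathlib import Path


def load_misses(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    path = Path(path)
    if not path.exists():
        return []
    records = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(records):
    """Return (count per taxonomy, recurring (alert_name, taxonomy) pairs)."""
    by_taxonomy = Counter(r["taxonomy"] for r in records)
    pair_counts = Counter((r["alert_name"], r["taxonomy"]) for r in records)
    # A pair is "recurring" when it has been seen more than once.
    recurring = [(pair, n) for pair, n in pair_counts.most_common() if n > 1]
    return by_taxonomy, recurring


# Made-up sample data, not real output from OpenSRE.
sample = [
    {"alert_name": "HighErrorRate", "taxonomy": "retrieval_gap"},
    {"alert_name": "HighErrorRate", "taxonomy": "retrieval_gap"},
    {"alert_name": "DiskFull", "taxonomy": "tool_failure"},
]
by_taxonomy, recurring = summarize(sample)
print(dict(by_taxonomy))  # {'retrieval_gap': 2, 'tool_failure': 1}
print(recurring)          # [(('HighErrorRate', 'retrieval_gap'), 2)]
```

To run the same summary against the local store, pass `load_misses(Path.home() / ".opensre" / "misses.jsonl")` to `summarize`.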

Converting misses to regressions

opensre misses export writes one scenario per recurring (alert, taxonomy) pair, ordered by how often it has recurred. The output mirrors the existing tests/benchmarks/openrca_scenarios/*/alert.json shape, so the benchmark runner consumes it without any adapter changes:

opensre misses export \
  --since 7d --top 10 \
  --out tests/benchmarks/production_misses/

Each case directory contains an alert.json whose commonAnnotations.scoring_points dict (expected_root_cause, expected_category, miss_notes) carries the grading rubric. This is the same location that opensre investigate --evaluate already reads from, and the same one that strip_scoring_points_from_alert removes before the agent sees the alert. The _meta block carries non-rubric provenance (miss_id, original_run_id, taxonomy). Commit the directory under tests/benchmarks/ and the next benchmark run will include the new regressions.
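The layout described above can be sketched as follows. The `scoring_points` and `_meta` key names come from this page. Everything else is an assumption: which miss field feeds which rubric key, the `commonLabels.alertname` placement, and the `strip_scoring_points` helper, which is only a stand-in for the behaviour attributed to strip_scoring_points_from_alert and not its real implementation.

```python
import copy


def miss_to_alert(miss):
    """Build an alert.json-shaped dict from one miss record (assumed mapping)."""
    return {
        # Assumption: the alert name lives in an Alertmanager-style label.
        "commonLabels": {"alertname": miss["alert_name"]},
        "commonAnnotations": {
            # Rubric used for grading; hidden from the agent.
            "scoring_points": {
                "expected_root_cause": miss.get("root_cause", ""),
                "expected_category": miss.get("root_cause_category", ""),
                "miss_notes": miss.get("taxonomy_detail", ""),
            },
        },
        # Non-rubric provenance.
        "_meta": {
            "miss_id": miss["miss_id"],
            "original_run_id": miss["run_id"],
            "taxonomy": miss["taxonomy"],
        },
    }


def strip_scoring_points(alert):
    """Return a copy of the alert without the rubric (illustrative stand-in)."""
    stripped = copy.deepcopy(alert)
    stripped.get("commonAnnotations", {}).pop("scoring_points", None)
    return stripped
```

Keeping the rubric under a single key means the grader and the stripping step only need to agree on one location, and the copy returned by the stripping step leaves the on-disk scenario untouched.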

Weekly triage workflow

Step	Owner	SLA
Run `opensre misses stats --since 7d` and review top recurring pairs	On-call engineer	Monday morning
Run `opensre misses export --since 7d --top 10 --out tests/benchmarks/production_misses/`	On-call engineer	Monday
Open a PR adding the new scenarios with a `benchmark` label	On-call engineer	Tuesday
Run the benchmark workflow against the PR branch	Reviewer	Wednesday
Track fix-rate week-over-week using PostHog `investigation_miss_classified` trends	Eng lead	Ongoing

PostHog dashboards built on investigation_miss_classified (grouped by taxonomy and alert_name) provide the week-over-week trend view used in the last step of the workflow.

Privacy

Miss records live entirely on the engineer’s machine in ~/.opensre/misses.jsonl. To delete everything captured locally, remove the file. The investigation_miss_classified PostHog event carries identifiers and structured metadata only:

miss_id, feedback_id, run_id
taxonomy, rating, has_detail (boolean — whether a note was provided, never the note itself)
alert_name, pipeline_name, root_cause_category
Optional user_id, org_id when running on a hosted/JWT path

The free-text note (taxonomy_detail) and the captured root_cause string are never sent to PostHog — they only exist in the local JSONL store, so removing ~/.opensre/misses.jsonl removes them entirely.
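One way to enforce this split is an allowlist: copy only the named identifiers into the event and derive `has_detail` from the note without including it. The sketch below illustrates that idea under the assumption that a miss record is a plain dict. The function name is hypothetical and the code does not call PostHog.

```python
EVENT_NAME = "investigation_miss_classified"

# Keys copied verbatim into the event; anything not listed stays local.
ALLOWED_KEYS = (
    "miss_id",
    "feedback_id",
    "run_id",
    "taxonomy",
    "rating",
    "alert_name",
    "pipeline_name",
    "root_cause_category",
)


def build_event_properties(miss, user_id=None, org_id=None):
    """Return the properties for the event, excluding all free text."""
    props = {key: miss.get(key) for key in ALLOWED_KEYS}
    # Report only whether a note exists, never the note itself.
    props["has_detail"] = bool((miss.get("taxonomy_detail") or "").strip())
    # Identity is attached only when the caller supplies it.
    if user_id is not None:
        props["user_id"] = user_id
    if org_id is not None:
        props["org_id"] = org_id
    return props
```

An allowlist fails closed: a new free-text field added to the local record later is not sent unless someone adds it to `ALLOWED_KEYS` on purpose.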
