Canonical set

Five reference flows

These are five reusable autonomy patterns, not one-off demos. PR review and refund approval serve as the flagship examples because together they cover both the engineer-legible and the business-legible ends of the spectrum.

Stacked semantic review

PR review agent

Best for: merge safety
Worker: LLM + tools
Evaluator: A stacked gate that combines deterministic checks with an LLM judge.
Target: output
Risk: medium / technically reversible, operationally costly
Trajectory: human-in-the-loop (HITL) -> human-on-the-loop (HOTL)

Read the flow
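As a rough illustration of the stacked gate on the PR review card, here is a minimal sketch in which cheap deterministic checks run first and the LLM judge is consulted only if they all pass. Every name here (`GateResult`, `stacked_gate`, `no_secrets`, `stub_judge`) and the 0.8 threshold are assumptions made for the example; the judge is a stub standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GateResult:
    passed: bool
    reason: str


def stacked_gate(
    diff: str,
    checks: List[Callable[[str], GateResult]],
    judge: Callable[[str], float],
    threshold: float = 0.8,  # assumed cutoff, not taken from the source
) -> GateResult:
    # Deterministic checks run first; any failure blocks without calling the judge.
    for check in checks:
        result = check(diff)
        if not result.passed:
            return result
    # Only diffs that survive the deterministic layer reach the judge.
    score = judge(diff)
    if score < threshold:
        return GateResult(False, f"judge score {score:.2f} below {threshold:.2f}")
    return GateResult(True, "all checks passed")


def no_secrets(diff: str) -> GateResult:
    # Hypothetical deterministic check: block diffs that appear to contain a secret.
    if "AWS_SECRET" in diff:
        return GateResult(False, "possible secret in diff")
    return GateResult(True, "no secrets found")


def stub_judge(diff: str) -> float:
    # Stand-in for an LLM judge; a real one would call a model and parse a score.
    return 0.9 if "test" in diff else 0.5
```

Ordering the layers this way keeps the expensive, less predictable judge off the path for anything the deterministic checks can already reject.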

Selective autonomy classifier

Refund approval agent

Best for: selective autonomy
Worker: Rules, classifier, or LLM depending on the implementation.
Evaluator: A separate escalation classifier decides whether the proposed decision is safe to auto-resolve.
Target: output
Risk: medium / partly reversible
Trajectory: HITL -> autonomous for easy cases only

Read the flow
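The separation between the worker and the escalation classifier on the refund card can be sketched as follows. This is an assumption-laden toy: `RefundProposal`, `escalation_score`, `route`, the risk weights, and the 0.25 cutoff are all hypothetical, and a hand-written heuristic stands in for a trained classifier.

```python
from dataclasses import dataclass


@dataclass
class RefundProposal:
    amount: float
    approve: bool
    customer_tenure_days: int


def escalation_score(p: RefundProposal) -> float:
    # Hypothetical heuristic standing in for a trained escalation classifier.
    # Returns a risk score in [0, 1]; higher means less safe to auto-resolve.
    risk = 0.0
    if p.amount > 100:
        risk += 0.5
    if p.customer_tenure_days < 30:
        risk += 0.3
    if not p.approve:
        risk += 0.3  # assumed: denials are more contentious than approvals
    return min(risk, 1.0)


def route(p: RefundProposal, max_risk: float = 0.25) -> str:
    # The worker has already produced the proposal; this step only decides
    # whether that proposal may be applied without a human.
    return "auto_resolve" if escalation_score(p) <= max_risk else "human_review"
```

For example, `route(RefundProposal(20.0, True, 400))` auto-resolves under these assumed weights, while a large refund or a denial goes to a human.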

Non-generative classification pipeline

Document routing

Best for: low-risk classifier autonomy
Worker: A routing classifier with no generative model in the loop.
Evaluator: Inline checks and monitoring gate each routing decision.
Target: input and output
Risk: low / reversible
Trajectory: Full autonomy for the in-distribution bulk, with an out-of-distribution escalation path always open.

Read the flow
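One way to picture the "input and output" target on the document routing card is a check on each side of the classifier. The sketch below assumes hypothetical names (`route_document`, `stub_classifier`) and thresholds, and it treats low classifier confidence as a proxy for out-of-distribution input, which is a simplification rather than a robust detector.

```python
from typing import Callable, Dict


def route_document(
    text: str,
    classify: Callable[[str], Dict[str, float]],
    min_confidence: float = 0.85,  # assumed threshold
    max_chars: int = 50_000,       # assumed input limit
) -> str:
    # Input check: empty or oversized documents never reach the classifier.
    if not text.strip() or len(text) > max_chars:
        return "escalate"
    probs = classify(text)
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    # Output check: low confidence is used here as a rough proxy for
    # out-of-distribution input, so the escalation path stays open.
    if confidence < min_confidence:
        return "escalate"
    return label


def stub_classifier(text: str) -> Dict[str, float]:
    # Stand-in for a real non-generative routing classifier.
    if "invoice" in text.lower():
        return {"billing": 0.95, "legal": 0.05}
    return {"billing": 0.5, "legal": 0.5}
```

In this sketch, in-distribution documents are routed without review and anything that fails either check falls through to `"escalate"`.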

Irreversible action gate

Transaction / trade execution gate

Best for: irreversible action control
Worker: Upstream automation proposes the transaction or trade.
Evaluator: Deterministic invariants do most of the gating work.
Target: output and outcome
Risk: catastrophic / irreversible
Trajectory: Deterministic blocking gate always, with human-in-the-loop control for anything that matters.

Read the flow
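A deterministic blocking gate with human approval layered on top might look like the sketch below. The invariants chosen (positive quantity, allow-listed symbol, notional cap, sufficient balance) and all names (`Trade`, `violated_invariants`, `gate`) are illustrative assumptions, not invariants stated in the source.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Set


@dataclass(frozen=True)
class Trade:
    symbol: str
    quantity: int
    price: Decimal
    account_balance: Decimal


def violated_invariants(
    trade: Trade, allowed_symbols: Set[str], max_notional: Decimal
) -> List[str]:
    # Each invariant is a plain deterministic rule; no model is involved.
    violations = []
    notional = trade.price * trade.quantity
    if trade.quantity <= 0:
        violations.append("quantity must be positive")
    if trade.symbol not in allowed_symbols:
        violations.append("symbol not allow-listed")
    if notional > max_notional:
        violations.append("notional exceeds cap")
    if notional > trade.account_balance:
        violations.append("insufficient balance")
    return violations


def gate(
    trade: Trade,
    allowed_symbols: Set[str],
    max_notional: Decimal,
    human_approved: bool,
) -> str:
    # The deterministic gate always runs and always blocks on a violation;
    # human approval can never override it in this sketch.
    if violated_invariants(trade, allowed_symbols, max_notional):
        return "blocked"
    return "execute" if human_approved else "await_human"
```

Because the action is irreversible, the sketch never lets human approval bypass a failed invariant; approval only matters once the deterministic layer has passed.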

Evaluator maturity pipeline

Support response agent

Best for: evaluator maturity
Worker: The LLM that drafts the response.
Evaluator: The evaluator matures from an LLM judge to a distilled classifier.
Target: output
Risk: low-medium / reversible
Trajectory: HITL -> HOTL with sampling

Read the flow
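The maturation from LLM judge to distilled classifier, together with sampled human oversight, can be sketched under some assumptions: here the distilled classifier takes over only once its labels agree with the judge's on a held-out set, and a fixed fraction of responses is sampled for human review. The function names, the 0.95 agreement bar, and the 5% sample rate are all hypothetical.

```python
import random
from typing import Sequence


def agreement(judge_labels: Sequence[int], classifier_labels: Sequence[int]) -> float:
    # Fraction of held-out items where the distilled classifier matches the judge.
    if not judge_labels or len(judge_labels) != len(classifier_labels):
        raise ValueError("label sequences must be non-empty and equal length")
    matches = sum(j == c for j, c in zip(judge_labels, classifier_labels))
    return matches / len(judge_labels)


def pick_evaluator(
    judge_labels: Sequence[int],
    classifier_labels: Sequence[int],
    min_agreement: float = 0.95,  # assumed promotion bar
) -> str:
    # Promote the cheaper classifier only when it tracks the judge closely enough.
    if agreement(judge_labels, classifier_labels) >= min_agreement:
        return "distilled_classifier"
    return "llm_judge"


def needs_human_sample(rng: random.Random, sample_rate: float = 0.05) -> bool:
    # Human-on-the-loop sampling: review a random fraction of sent responses.
    return rng.random() < sample_rate
```

Passing a seeded `random.Random` instance keeps the sampling decision reproducible in tests.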