Canonical flow

Refund approval agent

Selective autonomy for clear-cut refund decisions.

Abstract pattern

Selective autonomy classifier

The business-legible flagship example: a separate escalation classifier earns autonomy only for the easy cases, while hard refunds stay human-gated.

Assisted: Human-led with automation support.
HITL (human-in-the-loop): Human approves each action.
HOTL (human-on-the-loop): Human samples or monitors.
Autonomous: Automation acts within guardrails.

Worker: Rules, classifier, or LLM depending on the implementation.
Boundary: Input is the refund request, including order details, reason, amount, and customer history. Output is approve, deny, or escalate with rationale.
Evidence log: Request features, worker decision, evaluator verdict, human decision when escalated, and downstream reversal, chargeback, or complaint signals.
Evaluator: A separate escalation classifier decides whether the proposed decision is safe to auto-resolve.
Promotion rule: Escalation-classifier recall on should-escalate cases stays above threshold. High recall means hard cases are reliably routed to a human, so clear cases can be auto-decided without trusting the system on the hard ones.
Demotion rule: Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears, which widens escalation and pushes the flow back toward human review.
Fallback: Ambiguous and high-value refunds stay gated indefinitely.
Lives: HITL -> autonomous for easy cases only
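
The roles above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind this flow: the worker is a toy rule, `escalation_score` is a hand-written stand-in for the learned escalation classifier, and the names `HIGH_VALUE_LIMIT` and `ESCALATION_SCORE_CUTOFF` and their values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass
class RefundRequest:
    order_id: str
    reason: str
    amount: float
    prior_refunds: int  # simplified stand-in for customer history


@dataclass
class Outcome:
    decision: Decision
    rationale: str


# Hypothetical values; a real deployment would derive these from its own data.
HIGH_VALUE_LIMIT = 200.0
ESCALATION_SCORE_CUTOFF = 0.2


def worker(request: RefundRequest) -> Outcome:
    """Toy rule-based worker: proposes approve or deny."""
    if request.prior_refunds >= 3:
        return Outcome(Decision.DENY, "repeated refunds in customer history")
    return Outcome(Decision.APPROVE, "within normal refund pattern")


def escalation_score(request: RefundRequest, proposed: Outcome) -> float:
    """Toy stand-in for the learned escalation classifier (0 = clearly safe)."""
    score = 0.1
    if request.reason == "other":
        score += 0.5
    if proposed.decision is Decision.DENY:
        score += 0.2
    return min(score, 1.0)


def gate(request: RefundRequest) -> Outcome:
    """Evaluator: auto-resolve only when the proposal looks safe."""
    proposed = worker(request)
    # Fallback: high-value refunds stay gated regardless of the score.
    if request.amount >= HIGH_VALUE_LIMIT:
        return Outcome(Decision.ESCALATE, "high-value refund: " + proposed.rationale)
    if escalation_score(request, proposed) >= ESCALATION_SCORE_CUTOFF:
        return Outcome(Decision.ESCALATE, "evaluator not confident: " + proposed.rationale)
    return proposed
```

The worker never decides whether its own output ships. That call belongs to `gate`, which keeps the worker and evaluator roles separate even when both are simple.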

Evaluator detail

What the gate actually checks

Target: Output
Technique: Learned escalation classifier trained on accumulated human decisions and tuned for escalation recall.
Oracle: Human approvals and denials provide the labels for training and validation.
Position: HITL
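
As a rough sketch of what "tuned for escalation recall" could mean in practice, the snippet below measures recall on should-escalate cases and picks the highest score cutoff that still meets a recall target. The function names and the sample data are illustrative assumptions. How human approvals and denials map to a should-escalate label (for example, "the human overrode the worker") is also an assumption, not something this page specifies.

```python
def escalation_recall(scores, should_escalate, cutoff):
    """Fraction of should-escalate cases flagged (score >= cutoff)."""
    positives = sum(1 for y in should_escalate if y)
    if positives == 0:
        raise ValueError("no should-escalate cases in the validation set")
    caught = sum(1 for s, y in zip(scores, should_escalate) if y and s >= cutoff)
    return caught / positives


def pick_cutoff(scores, should_escalate, target_recall):
    """Highest cutoff whose recall meets the target.

    A higher cutoff escalates fewer cases, so this keeps as many
    refunds auto-resolved as the recall target allows.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    for cutoff in sorted(set(scores), reverse=True):
        if escalation_recall(scores, should_escalate, cutoff) >= target_recall:
            return cutoff
    raise ValueError("no cutoff meets the target recall")


# Illustrative validation data: classifier scores and human-derived labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
cutoff = pick_cutoff(scores, labels, target_recall=0.95)
```

With this sample data, the should-escalate case scored 0.4 forces the cutoff down to 0.4. Protecting recall costs some auto-resolve volume, because the benign case scored 0.6 now goes to a human as well.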

Teaching point

What this flow proves

Worker and evaluator are different roles, and selective autonomy means the same task can be autonomous for one slice and gated for another.

Six questions

How this flow governs autonomy

Without PAA: Every refund goes through a person (slow and expensive) or through rules that cannot handle edge cases (brittle and risky); there is no principled way to know which decisions are safe to automate.
What gets gated: The final approve-or-deny decision. Only clear cases auto-resolve, and only when the escalation classifier is highly confident the decision is safe; ambiguous and high-value cases stay gated indefinitely.
What is logged: Request features, worker decision, evaluator verdict, human decision when escalated, and downstream signals including reversals, chargebacks, and complaints.
Earns promotion: Escalation-classifier recall on should-escalate cases stays above threshold, proving the system reliably identifies hard cases without letting them slip through.
Triggers demotion: Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears in the incoming request distribution.
Never full-auto: High-value and ambiguous refunds. The cost of a wrong decision on these is too high for any learned system to own without a permanent gate.
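
The promotion and demotion answers above can be read as a periodic check over a monitoring window. The sketch below is one possible shape under stated assumptions: `WindowStats`, the threshold constants, and the position names are hypothetical, and treating a recall drop as a demotion trigger is an inference from "stays above threshold" rather than something the page states.

```python
from dataclasses import dataclass

HITL = "hitl"
AUTONOMOUS_EASY = "autonomous_easy_cases"

# Hypothetical thresholds for illustration only.
RECALL_FLOOR = 0.98
CHARGEBACK_CEILING = 0.01
COMPLAINT_CEILING = 0.01


@dataclass
class WindowStats:
    escalation_recall: float  # recall on should-escalate cases
    auto_approved: int        # refunds auto-approved in the window
    chargebacks: int          # chargebacks on those refunds
    complaints: int           # complaints on those refunds
    drift_detected: bool      # feature drift in incoming requests


def next_position(current: str, stats: WindowStats) -> str:
    """Return the autonomy position for the next window."""
    n = stats.auto_approved
    chargeback_rate = stats.chargebacks / n if n else 0.0
    complaint_rate = stats.complaints / n if n else 0.0

    # Demotion: downstream harm or drift pushes the flow back to human review.
    if (stats.drift_detected
            or chargeback_rate > CHARGEBACK_CEILING
            or complaint_rate > COMPLAINT_CEILING):
        return HITL
    # Assumption: losing the promotion condition also demotes.
    if stats.escalation_recall < RECALL_FLOOR:
        return HITL
    # Promotion: recall held above the floor, so easy cases may auto-resolve.
    return AUTONOMOUS_EASY if current == HITL else current
```

Even in the promoted position, only easy cases auto-resolve. High-value and ambiguous refunds are gated by the evaluator itself, not by this check.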

This page is linked from the canonical card set on the flows index.