Canonical flow
Refund approval agent
Selective autonomy for clear-cut refund decisions.
Abstract pattern
Selective autonomy classifier
The business-legible flagship flow: it shows how a separate escalation classifier earns autonomy for the easy cases only, while hard refunds stay human-gated.
- Assisted: Human-led with automation support.
- HITL (human in the loop): Human approves each action.
- HOTL (human on the loop): Human samples or monitors.
- Autonomous: Automation acts within guardrails.
Task contract
Refund approval agent
- Worker: Rules, a classifier, or an LLM, depending on the implementation.
- Boundary: Input is the refund request, including order details, reason, amount, and customer history. Output is approve, deny, or escalate, with a rationale.
- Evidence log: Request features, worker decision, evaluator verdict, human decision when escalated, and downstream reversal, chargeback, or complaint signals.
- Evaluator: A separate escalation classifier decides whether the proposed decision is safe to auto-resolve.
- Promotion rule: Escalation-classifier recall on should-escalate cases stays above a set threshold. High recall means hard cases are reliably routed to a human, so clear cases can be auto-decided without trusting the system on the hard ones.
- Demotion rule: Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears. Either signal widens escalation and pushes the flow back toward human review.
- Fallback: Ambiguous and high-value refunds stay gated indefinitely.
- Lives: HITL -> autonomous, for easy cases only.
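The task contract above can be sketched as a small routing function. This is a minimal illustration, not the flow's actual implementation: the names `RefundRequest`, `Outcome`, `worker`, `escalation_score`, and `decide`, the toy scoring rules, and both thresholds (`HIGH_VALUE_LIMIT`, `MAX_ESCALATION_SCORE`) are assumptions made for the example. The point it shows is that the worker proposes a decision, a separate evaluator decides whether that proposal may auto-resolve, and high-value refunds are gated regardless of the score.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Decision(Enum):
    APPROVE = "approve"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass
class RefundRequest:
    order_id: str
    reason: str
    amount: float
    prior_refunds: int  # simplified stand-in for "customer history"


@dataclass
class Outcome:
    decision: Decision
    rationale: str
    auto_resolved: bool


# Hypothetical values; the source does not state any thresholds.
HIGH_VALUE_LIMIT = 200.0
MAX_ESCALATION_SCORE = 0.10


def worker(req: RefundRequest) -> Tuple[Decision, str]:
    """Toy rule-based worker: proposes a decision with a rationale."""
    if req.prior_refunds > 3:
        return Decision.DENY, "refund history exceeds policy limit"
    return Decision.APPROVE, "within policy"


def escalation_score(req: RefundRequest, proposed: Decision) -> float:
    """Stand-in for the learned escalation classifier.

    Returns a score in [0, 1]; higher means "more likely to need a human".
    """
    score = 0.05
    if proposed is Decision.DENY:
        score += 0.30
    if "damaged" in req.reason.lower():
        score += 0.20
    return min(score, 1.0)


def decide(req: RefundRequest) -> Outcome:
    proposed, rationale = worker(req)
    # Fallback: high-value refunds stay gated whatever the classifier says.
    if req.amount >= HIGH_VALUE_LIMIT:
        return Outcome(Decision.ESCALATE, "high-value refund: permanent human gate", False)
    score = escalation_score(req, proposed)
    if score > MAX_ESCALATION_SCORE:
        reason = f"escalation score {score:.2f}; worker proposed {proposed.value}"
        return Outcome(Decision.ESCALATE, reason, False)
    # Clear case: the worker's proposal auto-resolves.
    return Outcome(proposed, rationale, True)
```

In this sketch a low-value "changed mind" request auto-resolves, while the same request at a high amount, or one the worker wants to deny, is escalated. Every `Outcome` would also be written to the evidence log alongside the request features.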
Evaluator detail
What the gate actually checks
- Target: Output
- Technique: Learned escalation classifier trained on accumulated human decisions and tuned for escalation recall.
- Oracle: Human approvals and denials provide the labels for training and validation.
- Position: hitl
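One way to read "tuned for escalation recall" is as a threshold choice on the classifier's score. The sketch below assumes the classifier emits a score per request, that a request is escalated when its score is at or above the threshold, and that human decisions supply a `should_escalate` label for each validation example. The function names and the idea of a single scalar threshold are assumptions for illustration; the source does not say how the tuning is done.

```python
import math


def escalation_recall(scores, should_escalate, threshold):
    """Fraction of should-escalate cases that the threshold actually escalates."""
    total = sum(1 for y in should_escalate if y)
    if total == 0:
        raise ValueError("no should-escalate examples to validate against")
    caught = sum(1 for s, y in zip(scores, should_escalate) if y and s >= threshold)
    return caught / total


def pick_threshold(scores, should_escalate, target_recall):
    """Highest threshold whose escalation recall still meets target_recall.

    A higher threshold escalates fewer requests, so this keeps as many
    cases auto-resolved as the recall target allows.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sorted(
        (s for s, y in zip(scores, should_escalate) if y), reverse=True
    )
    if not positives:
        raise ValueError("no should-escalate examples to validate against")
    # At least k of the should-escalate cases must be caught.
    k = max(1, math.ceil(target_recall * len(positives)))
    # The k-th highest positive score is the largest threshold catching k of them.
    return positives[k - 1]
```

The design choice worth noting is the asymmetry: the threshold is set from the should-escalate cases alone, so a missed hard case is treated as the costly error, and extra escalations of easy cases are accepted as the price of meeting the recall target.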
Teaching point
What this flow proves
Worker and evaluator are different roles, and selective autonomy means the same task can be autonomous for one slice and gated for another.
Six questions
How this flow governs autonomy
- Without PAA: Every refund goes through a person (slow and expensive) or through rules that cannot handle edge cases (brittle and risky). There is no principled way to know which decisions are safe to automate.
- What gets gated: The final approve-or-deny decision. Only clear cases, where the escalation classifier is confident that no escalation is needed, auto-resolve; ambiguous and high-value cases stay gated indefinitely.
- What is logged: Request features, worker decision, evaluator verdict, human decision when escalated, and downstream signals including reversals, chargebacks, and complaints.
- Earns promotion: Escalation-classifier recall on should-escalate cases stays above threshold, showing that the system reliably identifies hard cases rather than letting them slip through.
- Triggers demotion: Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears in the incoming request distribution.
- Never full-auto: High-value and ambiguous refunds. The cost of a wrong decision on these is too high for any learned system to own without a permanent gate.
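The promotion and demotion answers above could be checked per review window roughly as follows. This is a sketch under stated assumptions: `WindowStats`, `Policy`, `next_mode`, the mode labels, and every numeric limit are hypothetical, "rates rise" is approximated by comparing against fixed ceilings, and the drift signal is taken as a precomputed boolean. It also assumes that recall falling below the threshold returns the flow to HITL, which the source implies ("stays above threshold") but does not state as a demotion trigger.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class WindowStats:
    escalation_recall: float  # recall on should-escalate cases in this window
    chargeback_rate: float    # chargebacks / auto-approved refunds
    complaint_rate: float     # complaints / auto-approved refunds
    drift_detected: bool      # result of some upstream feature-drift check


@dataclass
class Policy:
    # Hypothetical limits; the source gives no numbers.
    min_recall: float = 0.98
    max_chargeback_rate: float = 0.01
    max_complaint_rate: float = 0.02


def next_mode(stats: WindowStats, policy: Policy) -> Tuple[str, str]:
    """Return (mode, reason) for the easy-case slice of the flow.

    High-value and ambiguous refunds are out of scope here: they stay
    gated in every mode.
    """
    # Demotion signals take priority over the promotion check.
    if stats.chargeback_rate > policy.max_chargeback_rate:
        return "hitl", "chargeback rate above ceiling"
    if stats.complaint_rate > policy.max_complaint_rate:
        return "hitl", "complaint rate above ceiling"
    if stats.drift_detected:
        return "hitl", "feature drift in incoming requests"
    if stats.escalation_recall >= policy.min_recall:
        return "autonomous_easy_cases", "escalation recall meets threshold"
    # Assumption: promotion must be continuously earned.
    return "hitl", "escalation recall below threshold"
```

Checking the demotion signals first means a window with excellent recall but rising chargebacks still falls back to human review, which matches the flow's bias toward gating when evidence conflicts.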
This page is linked from the canonical card set on the flows index.