Canonical flow

Refund approval agent

Selective autonomy for clear-cut refund decisions.

Abstract pattern

Selective autonomy classifier

The business-legible flagship flow: it shows how a separate escalation classifier earns autonomy only for the easy cases while hard refunds stay human-gated.

  1. Assisted: Human-led with automation support.
  2. HITL: Human approves each action.
  3. HOTL: Human samples or monitors.
  4. Autonomous: Automation acts within guardrails.

Task contract

Worker
Rules, classifier, or LLM depending on the implementation.
Boundary
Input is the refund request, including order details, reason, amount, and customer history. Output is approve, deny, or escalate with rationale.
Evidence log
Request features, worker decision, evaluator verdict, human decision when escalated, and downstream reversal, chargeback, or complaint signals.
Evaluator
A separate escalation classifier decides whether the proposed decision is safe to auto-resolve.
Promotion rule
Escalation-classifier recall on should-escalate cases stays above threshold, which is enough to auto-decide clear cases without trusting the hard ones.
Demotion rule
Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears, which widens escalation and pushes the flow back toward human review.
Fallback
Ambiguous and high-value refunds stay gated indefinitely.
Lives
HITL -> autonomous for easy cases only
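
The contract above can be sketched as a small routing function. This is a minimal illustration, not the flow's actual implementation: the names RefundRequest, Decision, gate, HIGH_VALUE_LIMIT, and ESCALATE_CUTOFF are hypothetical, the numeric values are placeholders (the page gives no figures), and the toy worker and evaluator stand in for whatever rules, classifier, or LLM a real deployment uses.

```python
from dataclasses import dataclass
from typing import Callable

HIGH_VALUE_LIMIT = 200.00  # hypothetical cutoff; the page states no amount
ESCALATE_CUTOFF = 0.10     # hypothetical; escalate when the score is at or above this


@dataclass
class RefundRequest:
    order_id: str
    reason: str
    amount: float
    prior_refunds: int  # simplified stand-in for "customer history"


@dataclass
class Decision:
    action: str     # "approve", "deny", or "escalate"
    rationale: str


def gate(request: RefundRequest,
         worker: Callable[[RefundRequest], Decision],
         escalation_score: Callable[[RefundRequest, Decision], float]) -> Decision:
    """Return the worker's proposal only when the evaluator clears it."""
    # Fallback: high-value refunds never reach the auto-resolve path.
    if request.amount >= HIGH_VALUE_LIMIT:
        return Decision("escalate", "high-value refund stays human-gated")

    proposed = worker(request)
    if proposed.action == "escalate":
        return proposed

    # Evaluator: a separate role judges whether the proposal is safe to auto-resolve.
    score = escalation_score(request, proposed)
    if score >= ESCALATE_CUTOFF:
        return Decision("escalate", f"evaluator score {score:.2f} at or above cutoff")
    return proposed


# Toy stand-ins so the sketch runs end to end.
def toy_worker(req: RefundRequest) -> Decision:
    if req.reason == "damaged item":
        return Decision("approve", "reason is covered by policy")
    return Decision("deny", "reason is not covered by policy")


def toy_score(req: RefundRequest, proposed: Decision) -> float:
    return min(1.0, 0.05 + 0.2 * req.prior_refunds)


easy = gate(RefundRequest("A1", "damaged item", 25.0, 0), toy_worker, toy_score)
large = gate(RefundRequest("A2", "damaged item", 500.0, 0), toy_worker, toy_score)
repeat = gate(RefundRequest("A3", "damaged item", 25.0, 3), toy_worker, toy_score)
```

Keeping the worker and the evaluator as separate callables mirrors the contract: the worker proposes, the evaluator only decides whether the proposal may bypass a human. In this sketch the high-value check runs before the worker, so no learned component can override it.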

Evaluator detail

What the gate actually checks

Target
Output
Technique
Learned escalation classifier trained on accumulated human decisions and tuned for escalation recall.
Oracle
Human approvals and denials provide the labels for training and validation.
Position
HITL
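
"Tuned for escalation recall" can be made concrete with a small threshold search. The sketch below is an assumption about how such tuning might look, not the documented method: recall_at and pick_cutoff are hypothetical helpers, the scores are invented, and the 0/1 should-escalate labels are assumed to have been derived from logged human decisions in some way this page does not specify.

```python
def recall_at(scores, should_escalate, cutoff):
    """Share of should-escalate cases whose score is at or above the cutoff."""
    positives = [s for s, y in zip(scores, should_escalate) if y]
    if not positives:
        raise ValueError("no should-escalate cases to measure recall on")
    return sum(s >= cutoff for s in positives) / len(positives)


def pick_cutoff(scores, should_escalate, target_recall):
    """Highest cutoff that still meets the recall target.

    A higher cutoff escalates fewer cases, so this maximizes the
    auto-resolved share subject to the recall floor.
    """
    for cutoff in sorted(set(scores), reverse=True):
        if recall_at(scores, should_escalate, cutoff) >= target_recall:
            return cutoff
    raise ValueError("target recall cannot be met; it must be at most 1.0")


# Invented validation data: evaluator scores and should-escalate labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

strict = pick_cutoff(scores, labels, target_recall=1.0)
loose = pick_cutoff(scores, labels, target_recall=0.66)
auto_share = sum(s < strict for s in scores) / len(scores)
```

With these invented numbers, demanding full recall pushes the cutoff down to the lowest-scoring should-escalate case, so fewer requests auto-resolve. That is the trade the flow accepts: a missed hard case costs more than an unnecessary human review.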

Teaching point

What this flow proves

Worker and evaluator are different roles, and selective autonomy means the same task can be autonomous for one slice and gated for another.

Six questions

How this flow governs autonomy

Without PAA
Every refund goes through a person (slow and expensive) or through rules that cannot handle edge cases (brittle and risky); there is no principled way to know which decisions are safe to automate.
What gets gated
The final approve-or-deny decision. Only clear cases, where escalation-classifier confidence is high, auto-resolve; ambiguous and high-value cases stay gated indefinitely.
What is logged
Request features, worker decision, evaluator verdict, human decision when escalated, and downstream signals including reversals, chargebacks, and complaints.
Earns promotion
Escalation-classifier recall on should-escalate cases stays above threshold, proving the system reliably identifies hard cases without letting them slip through.
Triggers demotion
Chargeback or complaint rates on auto-approved refunds rise, or feature drift appears in the incoming request distribution.
Never full-auto
High-value and ambiguous refunds: the cost of a wrong decision on these is too high for any learned system to own without a permanent gate.
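
The promotion and demotion answers above can be read as a periodic check over a window of logged evidence. The following is a hedged sketch only: WindowStats, review_autonomy, and all three constants are hypothetical, the page does not say how "rise" or "feature drift" are measured, and the "hold" outcome for recall below the floor is an assumption.

```python
from dataclasses import dataclass

RECALL_FLOOR = 0.98        # hypothetical promotion threshold
BASELINE_BAD_RATE = 0.01   # hypothetical historical chargeback/complaint rate
MAX_BAD_RATE_RATIO = 1.5   # hypothetical: "rise" means more than 1.5x baseline


@dataclass
class WindowStats:
    escalation_recall: float  # recall on should-escalate cases in the window
    auto_approved: int        # refunds approved without human review
    bad_outcomes: int         # chargebacks or complaints among those refunds
    drift_detected: bool      # result of a drift check that is not specified here


def review_autonomy(stats: WindowStats) -> str:
    """Return "demote", "promote", or "hold" for the easy-case slice."""
    bad_rate = stats.bad_outcomes / stats.auto_approved if stats.auto_approved else 0.0

    # Demotion signals are checked first so they override a good recall figure.
    if stats.drift_detected or bad_rate > BASELINE_BAD_RATE * MAX_BAD_RATE_RATIO:
        return "demote"
    if stats.escalation_recall >= RECALL_FLOOR:
        return "promote"
    return "hold"


healthy = review_autonomy(WindowStats(0.99, 1000, 10, False))
rising = review_autonomy(WindowStats(0.99, 1000, 30, False))
drifting = review_autonomy(WindowStats(0.99, 1000, 10, True))
weak = review_autonomy(WindowStats(0.95, 1000, 10, False))
```

A "demote" result would widen escalation and push the flow back toward human review. Neither outcome touches the high-value and ambiguous slice, which stays gated regardless of the metrics.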

This page is linked from the canonical card set on the flows index.