Written for: CISO, CTO, Head of Security


Securing decisioning copilots in finance

Decisioning copilots in credit, fraud, and disputes need per-domain autonomy ceilings, not just output filters. Here is the framework I use.

By Giovanni Salvador · 6 min read

A decisioning copilot that can act autonomously on a fraud alert, a credit application, or a dispute is a different risk category from one that only drafts. Most firms have not drawn that line deliberately.

The conversation about AI in financial services has moved fast from proof of concept to production. Firms are now running copilots that draft credit rationales, triage fraud alerts, score disputes, and assist with Know Your Customer checks. The question that has not kept pace is: how much autonomy is each of these allowed, and what happens at the boundary?

I have spent a lot of time in the last two years working through this with security teams and CTOs across regulated firms. The pattern I keep seeing is the same: the autonomy question was answered by engineering at build time, implicitly, based on what was easiest to wire, rather than by risk and compliance as a deliberate programme choice. That default tends to produce copilots that sit higher on the autonomy spectrum than the business would choose if it thought the question through from first principles.

The stake

The domains that matter most are credit and limit decisions, fraud operations, disputes and chargebacks, and Know Your Customer onboarding. Each of these involves consequential decisions about customers. Each carries regulatory expectations around human oversight, either through GDPR Article 22 protections against decisions based solely on automated processing that produce legal or similarly significant effects, or through EU AI Act Article 14 human-oversight obligations where the system meets the high-risk classification.

The regulatory floor is real. But in my experience, firms often set their autonomy ceiling lower than the floor would require on its own, because the business risk of getting it wrong is more immediate than the regulatory one. A fraud copilot that autonomously blocks a genuine customer creates a complaints problem that lands on a human desk the next morning. A credit copilot that issues an adverse decision without a human in the loop creates a fair-lending exposure that can take months to surface.

The ceiling is therefore the binding constraint in practice. Set it from your own internal risk tiering, calibrated to the blast radius of each domain and the reversibility of each action class. Do not inherit it from the vendor’s default configuration.

The autonomy spectrum, applied per domain

I use a four-level spectrum for this work: suggest, draft for approval, act with guardrails, act autonomously. The useful insight is that a single copilot often legitimately straddles two levels, depending on which action it is reaching for. A fraud copilot that places a soft hold on a card, a broadly reversible protective action, can sit at “act with guardrails.” The same copilot deciding to deny service or release held funds belongs at “draft for approval,” with a human making that call.
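One way to make the straddle concrete is to record the ceiling per action rather than per copilot, and clamp whatever level the copilot asks for. The following is a minimal sketch, not a reference implementation: the names `Autonomy`, `FRAUD_CEILINGS`, and `effective_level` are hypothetical, and the action names are taken from the example above.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """The four-level spectrum, ordered from least to most autonomous."""
    SUGGEST = 1
    DRAFT_FOR_APPROVAL = 2
    ACT_WITH_GUARDRAILS = 3
    ACT_AUTONOMOUSLY = 4


# Hypothetical per-action ceilings for a single fraud copilot.
FRAUD_CEILINGS = {
    "soft_hold": Autonomy.ACT_WITH_GUARDRAILS,
    "deny_service": Autonomy.DRAFT_FOR_APPROVAL,
    "release_held_funds": Autonomy.DRAFT_FOR_APPROVAL,
}


def effective_level(action, requested, ceilings=FRAUD_CEILINGS):
    """Clamp the requested autonomy to the ceiling recorded for this action."""
    # Assumption: an action with no recorded ceiling fails closed to SUGGEST.
    ceiling = ceilings.get(action, Autonomy.SUGGEST)
    return min(requested, ceiling)
```

With this shape, the same copilot requesting `ACT_AUTONOMOUSLY` would be held to `ACT_WITH_GUARDRAILS` for a soft hold and to `DRAFT_FOR_APPROVAL` for releasing held funds.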

Here is how I typically map the six main decisioning domains.

Credit and limit decisions. The default ceiling is draft for approval for anything involving an adverse outcome or a significant limit change. The copilot drafts the rationale and the decision recommendation. A human owns the decision. For low-value, fully reversible limit changes with very strong evidence, act with guardrails is defensible, but the default is a human decision on any outcome that significantly affects the customer.

Fraud operations. Per action, not per domain. Protective and broadly reversible actions such as a step-up challenge, a soft block, or a temporary hold can act with guardrails inside agreed parameters. Anything that denies service, releases held funds, or moves money requires a human. The asymmetry matters: a wrong block harms a genuine customer; a missed fraud loses money. Both errors have a cost, and the autonomy design should reflect that.

Disputes and chargebacks. Act with guardrails for clear, low-value, well-evidenced cases where the evidence points unambiguously one way. Draft for approval as value, ambiguity, or fraud signal rises. A human decides any contested case, any case above a value threshold, and any decision against the customer. A dispute system with too much autonomy is directly monetisable by attackers: industrialised first-party fraud against an autonomous decisioner is a known and growing attack vector.

Know Your Customer and onboarding. Draft for approval for standard cases. Suggest only for elevated-risk, politically exposed person, or adverse-media cases, with a human deciding onboarding for all of those. The gate at onboarding protects the entire downstream relationship. Getting it wrong has a remediation cost that far exceeds the cost of the human review.

AML transaction monitoring and sanctions alerting. These are the lowest-ceiling domains. The copilot triages, enriches, and prioritises. It does not close, suppress, or clear alerts autonomously. A human decides every alert disposition. This is a deliberate operating model choice, not just a regulatory one: the consequence of a false negative in this domain is severe, and the firm should not delegate that disposition judgement to a model.

SAR and STR reporting. Draft for approval, never autonomous submission. The copilot assembles and structures the draft. The nominated officer decides whether to file. There is no configuration of this workflow in which an autonomous submission is appropriate.
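To show how one of these mappings might translate into a routing rule, here is a hedged sketch of the disputes case. The thresholds, field names, and the `route_dispute` function are illustrative assumptions; real values would come from your own risk tiering.

```python
from dataclasses import dataclass

# Hypothetical thresholds; calibrate from internal risk tiering.
VALUE_THRESHOLD = 100.0
FRAUD_SIGNAL_LIMIT = 0.2


@dataclass(frozen=True)
class DisputeCase:
    value: float             # disputed amount
    contested: bool          # a party contests the evidence
    fraud_signal: float      # 0.0 to 1.0, from an assumed upstream scorer
    against_customer: bool   # the proposed outcome goes against the customer


def route_dispute(case):
    """Return (autonomy level, reasons a human must decide)."""
    reasons = []
    if case.contested:
        reasons.append("contested")
    if case.value > VALUE_THRESHOLD:
        reasons.append("above value threshold")
    if case.against_customer:
        reasons.append("decision against customer")
    if case.fraud_signal >= FRAUD_SIGNAL_LIMIT:
        reasons.append("fraud signal")
    # Any single reason is enough to pull the case back to a human decision.
    level = "draft_for_approval" if reasons else "act_with_guardrails"
    return level, reasons
```

Returning the reasons alongside the level gives the reviewer, and later the auditor, a record of why the case was pulled back.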

Where the gate sits, and why the gateway is not the gate

The most common misconception I see is treating the AI gateway as the gate. The gateway is a chokepoint for model egress: it enforces logging, model allow-listing, and output guardrails across every copilot. It is a necessary part of the architecture. But it is not where the human decision sits.

The human-in-the-loop gate sits at the action, inside the workflow, not at the perimeter. A gateway that intercepts every call to a fraud action tool cannot substitute for a human approver reviewing the specific decision about a specific customer’s account. The gateway sees whether a tool was called. The gate is where someone decides whether it should have been.
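A gate at the action can be sketched as a check inside the tool executor itself: consequential actions refuse to run unless an approval exists for that exact action on that exact case. This is an illustration under assumptions; `execute_action`, `Approval`, and the `CONSEQUENTIAL` set are hypothetical names, and a real system would also verify who the approver is and persist the record.

```python
from dataclasses import dataclass

# Action classes that the fraud mapping above reserves for a human.
CONSEQUENTIAL = {"deny_service", "release_held_funds", "move_money"}


@dataclass(frozen=True)
class Approval:
    approver_id: str
    case_id: str
    action: str


class ApprovalRequired(Exception):
    """Raised when a consequential action lacks a matching human approval."""


def execute_action(action, case_id, approval=None):
    """Run an action only if the gate for this specific case is satisfied."""
    if action in CONSEQUENTIAL:
        matches = (
            approval is not None
            and approval.case_id == case_id
            and approval.action == action
        )
        if not matches:
            raise ApprovalRequired(f"{action} on {case_id} needs a matching approval")
    # The returned record is the evidence that the gate fired (or was not needed).
    return {
        "action": action,
        "case_id": case_id,
        "approved_by": approval.approver_id if approval else None,
    }
```

Binding the approval to the case and the action is the point: an approval issued for one customer's account cannot be replayed against another.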

Segregation of duties matters for the same reason. The agent that detects a fraud signal and the party that authorises releasing held funds should be distinct identities, with no single actor able to both raise and close the same consequential action. This holds for human-human splits and for agent-human splits. The caveat, worth stating clearly: a second LLM agent checking the first LLM agent’s output does not give you independent oversight if both agents are reasoning over the same context. An LLM checker on shared context is vulnerable to the same injection or error that affected the maker. An independent check either uses separately sourced inputs, uses a deterministic policy rule rather than a second LLM judgement, or, for the highest-consequence actions, is a human.
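The independence test described above can be written down as a small predicate. The sketch below assumes each actor carries an identity, a kind, and an identifier for the inputs it reasoned over; all of these field names and the `is_independent_check` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    identity: str
    kind: str        # "human", "llm_agent", or "policy_rule"
    context_id: str  # identifies the inputs this actor reasoned over


def is_independent_check(maker, checker, highest_consequence=False):
    """Return True if the checker gives oversight independent of the maker."""
    if maker.identity == checker.identity:
        return False  # no single actor may both raise and close the action
    if highest_consequence:
        return checker.kind == "human"
    if checker.kind == "llm_agent":
        # A second LLM only counts if its inputs were sourced separately.
        return checker.context_id != maker.context_id
    return checker.kind in {"human", "policy_rule"}
```

Note that distinct identities alone do not pass: a second agent over the same `context_id` is rejected even though it is a different actor.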

The failure modes worth pre-mortem analysis

Before a decisioning copilot goes live in any domain, I run the failure modes as a pre-mortem: given that this goes wrong in the domain-specific way, does a named control catch it?

For fraud: detection-model evasion, where an attacker learns the signal thresholds and stays just under them, and alert-flooding, where a high volume of low-signal events buries genuine positives. Both require runtime monitoring over sequences of events, not only per-action rules.
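As an illustration of monitoring over a sequence rather than a single event, the sketch below counts how often an account scores just under the alerting threshold within a sliding time window. The class name, threshold, band, and window values are all assumptions chosen for the example, not recommended settings.

```python
from collections import deque


class NearThresholdMonitor:
    """Flags accounts that repeatedly score just under the alert threshold."""

    def __init__(self, threshold=0.8, band=0.1, window_seconds=3600, max_hits=3):
        self.threshold = threshold
        self.band = band
        self.window_seconds = window_seconds
        self.max_hits = max_hits
        self._hits = {}  # account -> deque of timestamps of near-miss events

    def observe(self, account, timestamp, score):
        """Record one event; return True if the sequence looks like evasion."""
        hits = self._hits.setdefault(account, deque())
        # Drop near-misses that have aged out of the window.
        while hits and timestamp - hits[0] > self.window_seconds:
            hits.popleft()
        # A near-miss sits inside the band directly below the threshold.
        if self.threshold - self.band <= score < self.threshold:
            hits.append(timestamp)
        return len(hits) >= self.max_hits
```

No individual event here would trip a per-action rule, since each score is below the threshold. The signal is the pattern across events.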

For disputes: fabricated evidence. AI-generated receipts and proof of delivery documents are now accessible to anyone who wants to game an automated dispute system. A system that relies on evidence quality without any human review of contested cases is exposed to this at volume.

For AML monitoring: alert suppression, where the copilot systematically down-ranks alerts that should escalate, and detection-model poisoning, where the model’s training or grounding data has been corrupted to create blind spots. Poisoning is particularly dangerous here because its effect is durable and covert: the model behaves normally on ordinary inputs and fails on the specific pattern the attacker shaped it to miss.

For KYC: synthetic identity at onboarding. The gate at the elevated-risk and anomalous-case level is specifically designed to catch this. Anything the copilot flags as unusual goes to a human. That requires the copilot to actually flag it, which requires testing that the flag fires on the patterns you care about.
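Testing that the flag fires can be as simple as a regression set of cases that must always reach a human. In the sketch below, `toy_flag` is a stand-in for whatever flagging logic the copilot exposes, and the case fields are invented for illustration only.

```python
def missed_patterns(flag_fn, must_flag_cases):
    """Return the names of must-flag cases that the flag function let through."""
    return [name for name, case in must_flag_cases.items() if not flag_fn(case)]


def toy_flag(case):
    """Stand-in flag: a thin credit file, or a recently issued identity document."""
    return case["credit_file_age_months"] < 6 or case["id_document_age_days"] < 30


# Hypothetical regression set of patterns the firm cares about.
MUST_FLAG = {
    "thin_file": {
        "credit_file_age_months": 2, "id_document_age_days": 400, "shared_device_count": 0,
    },
    "fresh_document": {
        "credit_file_age_months": 48, "id_document_age_days": 5, "shared_device_count": 0,
    },
    "shared_device_ring": {
        "credit_file_age_months": 48, "id_document_age_days": 400, "shared_device_count": 12,
    },
}
```

Running `missed_patterns(toy_flag, MUST_FLAG)` against this set surfaces the shared-device pattern as a gap, which is the kind of finding the pre-mortem is meant to produce before go-live.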

What to do this week

  1. List every decisioning copilot in production or active development. For each, write down its current autonomy level using the four-level spectrum. If you do not know where it sits, that is itself a finding.

  2. Set an explicit ceiling for each domain it touches. Use the framework above as a starting point, then calibrate down if your internal risk tiering requires it. The ceiling is a business decision, recorded and owned by a named person.

  3. Locate the human-in-the-loop gate. For each copilot, confirm that the gate sits at the action level for irreversible or consequential actions, not only at the perimeter. If the gate is only at the gateway, move it into the workflow.

  4. Run the pre-mortem on the domain-specific failure modes. For each failure mode above that applies to your domain, confirm a named control addresses it. Any unaddressed failure mode is a risk-register entry before it is an incident.

  5. Check your segregation of duties split. For the highest-consequence actions, confirm the checker is independent of the maker in a way that actually produces independent failure modes, not just independent identities over the same context.
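The first three steps lend themselves to a simple inventory check. The sketch below assumes the inventory is a list of dictionaries with hypothetical keys (`current_level`, `ceiling`, `ceiling_owner`, `gate_location`); it is one possible shape, not a prescribed schema.

```python
LEVELS = ["suggest", "draft_for_approval", "act_with_guardrails", "act_autonomously"]


def inventory_findings(copilots):
    """Return (copilot name, finding) pairs for gaps in the inventory."""
    findings = []
    for c in copilots:
        name = c["name"]
        level = c.get("current_level")
        ceiling = c.get("ceiling")
        if level not in LEVELS:
            findings.append((name, "autonomy level unknown"))
        if ceiling not in LEVELS:
            findings.append((name, "no explicit ceiling recorded"))
        if level in LEVELS and ceiling in LEVELS:
            if LEVELS.index(level) > LEVELS.index(ceiling):
                findings.append((name, "operating above ceiling"))
        if not c.get("ceiling_owner"):
            findings.append((name, "ceiling has no named owner"))
        if c.get("gate_location") != "action":
            findings.append((name, "gate not at action level"))
    return findings
```

An empty result does not prove the controls work. It only shows that the questions in steps 1 to 3 have recorded answers.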

The teams that get this right are not the ones with the most capable models. They are the ones that decided, deliberately, how much autonomy each domain gets, placed the gate where it matters, and can point to evidence that the gate fired.
