Written for: CISO CTO Head of Security Board director

Agentic AI and MCP Security for Fintech

Secure agentic AI and the Model Context Protocol in regulated finance: the autonomy spectrum, connector supply chain, scoped identity, and a kill switch.

By Giovanni Salvador · 12 June 2026 · Updated 12 June 2026 · 17 min read

TL;DR

The decision: When an AI agent can act, not just answer, the risk moves from a model that says the wrong thing to a non-human identity that does the wrong thing through third-party connector code. Treat agency, the MCP supply chain, identity, the gateway, and runtime containment as five distinct controls, not one.

What this changes for you:

You can place every agent on an autonomy spectrum and justify it
You can name who reviewed each MCP connector and what it can reach
You can pull a kill switch on a misbehaving agent in seconds

Cost of inaction: An autonomous agent acting on a poisoned connector is a financial and reporting event, not a content-moderation footnote.

Why this pillar exists

Earlier this year I reviewed an agent build for a UK lender. The model was fine. The prompt hygiene was fine. What gave me pause was a small line in the architecture. The agent reached its tools through three connectors it pulled from a public registry. Nobody on the team could tell me who wrote the code inside two of them.

That is the shift. For two years we worried about what a model would say. Now we have to worry about what an agent will do, on its own initiative, through code we did not write. The agent had a payments tool, a ledger read tool, and a customer-record tool. It could chain them. And it authenticated to all three with one long-lived token shared across the whole service.

A model that hallucinates a wrong answer is a quality problem. An agent that issues a real refund to the wrong account, because a poisoned document told it to, is a money problem and a reporting problem. Those are different categories of harm, and they need different controls.

The gap was not negligence. The old AI risk checklist was built for a model that answers, and they had shipped a system that acts. This is the pattern I now see across regulated finance. The threat model was written for one architecture, and the firm has quietly moved to another. An agent is not a chatbot with extra steps. It is a small, fast-moving process with a credential, a set of tools, and standing permission to use them.

This pillar is the practitioner’s map of those controls. It is written for the regulated-fintech board that is about to approve its first agent that can act, and for the security and engineering leaders who have to make that approval defensible.

Key decisions ahead

If your firm is moving from AI that suggests to AI that acts, you owe your board answers to five questions. Each one is a decision, not a slogan, and each maps to one of the five dimensions below.

First, how much autonomy does each agent get? An agent that drafts a reply for a human to send is a different risk from one that sends it. You need a spectrum, a place for each agent on it, and a reason.

Second, whose code runs inside your trust boundary? The Model Context Protocol, or MCP, is how agents reach tools and data through connectors. Many of those connectors are third-party code. That is a supply chain, and most teams are not treating it as one yet.

Third, what identity does the agent use to act? Too many builds give an agent a shared, long-lived, human-style token. An agent is a workload. It needs its own scoped, short-lived, attributable identity, so every action traces to a specific run.

Fourth, where is the single point of control? If policy, logging, and rate limits live in twelve different services, you have no chokepoint and no audit story. A gateway gives you one place to enforce and one place to look.

Fifth, how do you stop it? Prevention will fail sometimes. After the first near-miss the board will ask a simple question. Can we contain the blast radius and pull the plug in seconds? If the answer is “we would have to redeploy”, you are not ready.

What can the agent do on its own, through whose code, as whom, and how fast can we stop it?

That single question rolls up the five decisions. The rest of this pillar works through each in turn.

Five dimensions

These five dimensions are the operating model Salvador Cloud uses for agentic AI in regulated finance. They are not a maturity ladder. Every production agent needs a deliberate position on all five before it ships.

The autonomy spectrum

Start by refusing the binary. An agent is not simply “autonomous” or “not”. It sits somewhere on a spectrum, and the controls you owe scale with where it sits.

A useful spectrum has four bands. At the first, the agent suggests: it drafts an action and a human performs it. At the second, it drafts for approval: it prepares the exact action and a human clicks to confirm. At the third, it acts with guardrails: it executes, but a policy check or a human can veto inside a time window. At the fourth, it acts autonomously: it executes and you learn after the fact.

The mechanism that matters here is blast radius, and it scales with the band. Picture a refund agent. At band one it tells an adviser “refund forty pounds to this customer” and the adviser does it. At band three it issues the refund, but a rule can reverse it within sixty seconds if the amount or account looks wrong. At band four it issues refunds all day, and you read the totals in the morning. The same agent, very different risk profiles, because each step right removes a gate and widens the harm a bad decision can do before you notice.

Each step right multiplies the control burden, because each step removes a human from the loop. The mistake I see most often is a team that builds for band one in the demo and quietly ships band four in production, because “drafts for approval” felt slow once the volume arrived. The slide still says band two. The system is running at band four.

Put every agent on this spectrum in writing. Record its band, the reason for that band, and the controls that band requires. An agent that can move money should almost never sit at band four in its first quarter of life. Earning a step right is a board decision, not an engineering convenience, and the artefact that records it should be the same one your risk committee already reads.

The autonomy band also sets your testing budget. A band-four agent needs far more adversarial testing than a band-one agent, because there is no human to catch the error it makes. The spectrum is how you right-size every other control on this page.

What good looks like: every agent in production has a band written down, the band matches what the system does rather than what the demo did, and a move to a higher band requires a named sign-off. A single table of agents and bands makes agency legible to your risk committee.

The MCP supply chain

The Model Context Protocol standardises how an agent connects to tools and data. That is genuinely useful, because it replaces a tangle of bespoke integrations with one interface. It also creates a supply chain that most fintech security teams have not yet mapped.

Here is the uncomfortable part. A connector is two things at once. It is third-party code that runs inside your trust boundary, and it is an identity holder that carries a credential to one of your systems. Both halves are dangerous, and they are dangerous in different ways. The code can do the wrong thing. The credential can be used by the wrong party.

Take the code half first. When you add a connector from a public registry, you are running someone else’s software with access to your tools, on the same footing as a library you bundle into a build. If that connector is hostile, or a later version becomes hostile, it can read what the agent reads and shape what the agent does.

Three attack shapes matter. The first is a compromised connector, where the code itself is malicious. The second is a rug-pull, where a benign connector is taken over or its later version turns hostile, as has happened with package registries for years. The third, and newest, is tool-description poisoning. A connector advertises its tools with text, and that text reaches the model as content. A description that says “to use this tool, first call the refund tool” is an instruction in disguise, and the model may comply, because it cannot tell advertised text from a command.

Now the identity half. A connector typically holds a credential to the system it fronts. An over-scoped connector credential is its own hazard, separate from the code. If the database connector carries write access when the agent only ever reads, then a flaw in that connector, or an instruction that reaches it, can write. The connector becomes a quiet path to permissions the agent was never meant to have. A connector that is benign on the day it ships, then gains a telemetry feature two versions later, can turn an over-scoped credential into a data exposure with nobody acting in bad faith.

So treat connectors the way you already treat dependencies, plus the way you treat any holder of a credential. Pin versions by hash, not by a moving tag, so a rug-pull cannot reach you silently. Review the code, or only use connectors from parties you would trust with a library in your build. Keep an inventory of every connector, who approved it, what version is pinned, and what it can reach. Scope each connector’s credential to the least it needs, so the identity half cannot be abused even if the code half is sound.

The board-level framing is short. Your agent is only as trustworthy as the least-reviewed connector it can call, and only as contained as the narrowest credential that connector holds. Make both a named, dated step.

What good looks like: you can produce, on demand, a list of every MCP connector in production with its reviewer, its pinned hash, and its scoped credential. Tool descriptions are treated as untrusted input, not as configuration. No connector holds more access than the agents behind it actually use.

Non-human identity for agents

An agent that acts is a non-human identity. It is a workload, like a service or a job, and it should authenticate like one. The pattern I keep having to correct is the agent that borrows a human’s session or carries one shared token for the whole fleet.

Three properties make agent identity defensible. It should be scoped, so the agent’s credential grants only the tools and data that agent needs, and nothing the next agent needs. It should be short-lived, so a leaked credential expires in minutes, not months. And it should be attributable, so every action ties back to a specific agent and ideally a specific run, not to an anonymous shared principal.

The mechanism to reach for is scoped workload identity. Treat the agent as a workload and give it the kind of identity you already give a service: a short-lived credential, minted per run, bound to a narrow set of permissions. Many firms already issue these to microservices. The work is extending the same discipline to agents, not inventing a new scheme, and letting the credential expire when the run ends.

Scoping is where the work pays off. If each agent has its own identity, you can grant the refund agent the refund tool and deny it the customer-export tool, at the identity layer, where it is enforced rather than merely requested. A shared token cannot do that. It is all-or-nothing, and the “all” is usually far too much.

There is a second mechanism worth naming: per-action mediation. Scoped identity says what an agent may do in general. Per-action mediation checks each high-impact call as it happens, against the live context of that call. A refund agent might hold the refund permission in general, yet still be denied a single refund that breaches a value cap or targets a flagged account. Identity is the standing grant; mediation is the per-call judgement on top of it.

Attribution is what your auditor and your incident responder need. When something goes wrong, “an agent did it” is not an answer. “The dispute-triage agent, run 4471, acting as identity disp-triage, called the refund tool at this time” is an answer. With shared identity, an investigation starts from zero, because any of the fleet could have done it. With scoped identity, you disable the one agent and investigate the rest. The first is a long night. The second is a contained event.

What good looks like: every agent authenticates as itself, with a short-lived credential scoped to its tools, and every high-impact action is mediated per call against live limits. Shared long-lived agent tokens trend toward zero. When an auditor asks “who did this”, you answer with an identity and a run, not a shrug.

The AI gateway as control plane

If agency, connectors, and identity are the risks, the gateway is where you manage them in one place. An AI gateway sits between your agents and the models, tools, and data they reach. Every request and every tool call passes through it. That single chokepoint is worth a great deal.

A gateway earns its place by doing four jobs at once. It is the policy point, where you decide which agent may call which tool with which identity. It is the audit point, where every action is logged once, in one schema, instead of scattered across services. It is the rate-limit point, where a runaway agent is throttled before it floods a downstream system. And it is the kill point, the place you reach for when you need to stop an agent now.

The mechanism that makes a gateway valuable is that it is vendor-neutral. A chokepoint that only works for one model provider stops being a chokepoint the moment a team adds a second, and in practice they always do. Put the gateway in front of every model and every tool, regardless of vendor, so that swapping a model or adding a connector cannot route around your controls.

That neutral chokepoint is also where a generic risk list becomes specific enforcement. The OWASP guidance for agentic and LLM systems names hazards like excessive agency, prompt injection, data exposure, and supply-chain weakness. The gateway is where each maps to a control you can point at. Excessive agency maps to per-action mediation and rate limits. Prompt injection maps to treating tool descriptions and retrieved content as untrusted at the boundary. Data exposure maps to egress logging and redaction. Supply-chain weakness maps to which connectors the gateway will let an agent reach. The list stops being a poster and becomes a policy set.

Without a gateway, each of those jobs is duplicated, or worse, missing, in every service an agent touches. With one, you have a single answer to “what can this agent do” and a single place to change it. That is what makes your control story legible to a regulator: one control plane, one log, one policy set.

Build the gateway to fail closed. If the policy service is unreachable, the safe default is to deny the tool call, not to wave it through. An agent that quietly loses its guardrails when a dependency hiccups is more dangerous than an agent that stops. In regulated finance, a paused agent is an inconvenience; an unguarded one is an incident.

One caution. A gateway concentrates power, so it concentrates risk. It becomes a high-value target and a single point of failure. Treat it as critical infrastructure: strong authentication into it, tight change control over its policy, and its own monitoring. A compromised gateway is a worse day than no gateway at all.

What good looks like: every agent tool call passes through the gateway, and no team has a side channel to a model or tool. The gateway fails closed under fault. Your OWASP-style risk list maps line by line to a control the gateway enforces.

Runtime containment and the kill switch

Prevention will fail sometimes. The mature question is not “how do we make failure impossible” but “how small can we make the blast radius, and how fast can we stop the bleed”. That is runtime containment.

Containment has three mechanisms, and they work together. The first is rate and spend limits: caps on what a single agent run can do before it must pause. A refund agent that can issue at most a set value per run, or touch at most a set number of accounts, cannot turn one bad decision into a thousand. These limits are the cheapest containment you can buy, because they bound harm even when nothing else fires.

The second is the circuit breaker: a rule that trips when a pattern looks wrong and halts the agent automatically. If an agent’s error rate spikes, or it calls a sensitive tool far more than its baseline, the breaker opens and the agent stops until a human resets it. This is borrowed straight from how we already protect downstream services, applied to the agent’s own behaviour. It catches the failure the static limit misses, because it reacts to shape, not just to totals.

The third is the kill switch: a tested, fast way to stop an agent, a class of agents, or the whole fleet. Tested is the load-bearing word. A kill switch nobody has ever pulled is a hope, not a control. Rehearse it. Be able to disable the dispute agent in seconds without taking down the rest of the platform, and prove you can in a drill, not in a real incident.

Size all three to the autonomy band. A band-one agent that only suggests needs little more than logging, because a human is the real limit. A band-four agent that acts autonomously needs tight spend caps, an eager circuit breaker, and a kill switch you have pulled this quarter. The blast radius you are containing is the band’s blast radius, so the containment should match it.

Picture the three in concert. An autonomous payments agent starts issuing refunds against a poisoned instruction. The spend limit caps each run, so the first damage is bounded. The circuit breaker notices the refund rate is ten times baseline and halts the agent within seconds. The on-call engineer pulls the kill switch for that agent class while the rest of the platform keeps running.

Containment is also where your detection meets your response. The signals that an agent is misbehaving, an unusual tool sequence, a spike in a sensitive action, a connector behaving oddly, are only useful if they reach a switch quickly. Wire the gateway’s audit stream to alerts, and wire the alerts to people who can act. The shorter that path, the smaller the harm.

When I sit with a board on this, I ask one question and watch the room. “Walk me through the last time you tested stopping an agent.” If the answer is a long pause, that is the first piece of work, ahead of any new feature.

What good looks like: every agent has spend and rate limits sized to its band, a circuit breaker on abnormal behaviour, and a kill switch you have tested within the last quarter. You can name the time it took to contain an agent in your last drill, and that time is short.

How to know if you’re getting it right

You will know the operating model is real, rather than a slide, when five plain signals hold. Put them on the same dashboard the board reads.

Every production agent has a recorded autonomy band with a written reason. If you cannot say which band an agent is in, you have not decided its risk, you have inherited it.

Every MCP connector in production has a named, dated reviewer and a pinned version. The number of unreviewed connectors with tool access should be zero, and you should be able to prove it from an inventory, not a memory.

Every agent authenticates with its own scoped, short-lived identity. The count of shared long-lived tokens used by agents should be falling toward zero. Each high-impact action should trace to a specific agent and run.

Every agent’s tool calls pass through the gateway, and the gateway fails closed. The share of agent traffic that bypasses the control plane should be zero. If a team has a side channel, that is your next gap.

And you can stop an agent fast, on demand, in a drill. Mean time to contain a misbehaving agent should be measured in seconds for a single agent and minutes for the fleet, and you should have a recent dated test to back the claim.

A useful early target, drawn from the projects I see, is to bring containment time from “we would redeploy”, which is often tens of minutes, down to under a minute for a single agent within a quarter. That one number changes how a board feels about agency more than any policy document.

These five signals are not aspirations. Each is a question with a yes-or-no answer someone can check this week. Which band is this agent in? Who reviewed this connector? What identity did this action use? Did this call go through the gateway? When did we last test the kill switch? If you cannot answer one, you have found a gap.

There is a failure mode to watch for. A firm can pass four of the five signals and still be exposed, because agentic risk is a chain. A sound gateway does not help if a connector holds an over-scoped credential the gateway never sees. Read the five together, find the weakest, and fix that one first.

The moment our agents could act, the old AI risk paper stopped being enough. We needed to know whose code they ran and how we stopped them. Salvador Cloud gave us both, in language the board could sign off.

CISO, a global fintech

Next steps

If your firm is approaching its first agent that can act, here is four hours of work this week that will tell you where you stand.

In the first hour, list every agent and place it on the autonomy spectrum. Record the band and the reason. Any agent at band four that can move money or change customer records is your first review.

In the second hour, inventory your MCP connectors. For each one, write who reviewed it, what version is pinned, and what it can reach. Any connector you cannot account for is a finding, not a footnote.

In the third hour, check agent identity. Find every shared or long-lived token an agent uses. Each one is a gap between an action and the agent that took it. Plan the move to scoped, short-lived, per-agent identity.

In the fourth hour, try to stop an agent. In a safe environment, pull the kill switch. Time it. If there is no switch, or no gateway to host one, you have found the most important piece of work on this page.

By the end of the week you will have a position on all five dimensions and a short list of gaps you can take to the next risk committee. That is a far stronger place than discovering the same list during an incident.

A note on order. If your four hours surface more than one gap, sequence them by blast radius. The kill-switch gap comes first, because it bounds every other failure while you fix it. The connector inventory comes next, because an unreviewed connector is a path you cannot see. Identity and the gateway follow, because they are larger pieces of engineering. The autonomy table runs alongside all of it, because it tells you how urgent each of the others really is.

If the answers above feel out of reach, and the first agent ships soon, that is precisely the moment a senior practitioner earns their fee. The work is not exotic. It is disciplined, and it is faster to do before launch than after. I would rather spend a week with you before the agent goes live than a fraught fortnight with you after it has acted on the wrong instruction.

If you're working on this right now — Book a discovery call