Quantifying AI risk for the board
How to move from heat maps to money-denominated loss distributions when quantifying AI risk for boards and CROs, with autonomy as the key magnitude multiplier.
You can threat-model an agentic AI estate perfectly and still lose the argument in the room that matters. Boards do not govern against colours on a heat map.
There is a pattern I see repeatedly in regulated firms that have done serious work on AI security. The threat model is solid. The abuse cases are identified. The controls are in place. And then a board asks: “How much risk does our AI programme carry?” The answer comes back as “medium to high” with a traffic-light diagram. The board cannot govern that. The finance director cannot allocate a control budget against it. The CRO cannot aggregate it into the firm’s risk appetite statement alongside every other material risk the firm runs.
Qualitative heat maps fail you precisely where AI risk is hardest. They cannot compare an injection-driven erroneous payment to a copilot data leak. They cannot show how far adding autonomy to a workflow moves the firm’s exposure. They cannot be aggregated into a portfolio view a board can act on. The answer is not a better heat map. It is a quantification method that produces a distribution of probable loss, in money, with stated assumptions and honest uncertainty bands.
The stake
The issue is not that AI risk is impossible to quantify. It is that the standard threat-modelling vocabulary does not translate into the language boards use to make decisions. A board governs against risk expressed as probable financial impact. Every other material risk in a regulated firm arrives in those terms: credit losses, operational losses, regulatory fines as ranges. AI risk should arrive the same way.
The method I use, and that I recommend, is FAIR (Factor Analysis of Information Risk). FAIR decomposes risk into two components: loss event frequency and loss magnitude. Each is estimated as a calibrated range rather than a point value. You combine them through a Monte Carlo simulation to produce a loss-exceedance curve. The result is a sentence a board can govern against: “We are 90% confident annualised loss from this case is between £X and £Y, with a 10% chance of exceeding £Z.”
Every assumption behind that sentence is visible. A reviewer who disputes one input can change it and watch the curve move. That falsifiability is the whole point. A heat-map colour offers nothing to argue with.
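As a minimal sketch of what "a calibrated range rather than a point value" can mean in practice, the snippet below turns a 90% range into a distribution that a simulation can sample from. The lognormal shape is my assumption here (FAIR tooling often uses other shapes, such as PERT), and the function names and the 20,000 to 400,000 range are purely illustrative.

```python
import math
from statistics import NormalDist

# z-score that bounds a central 90% interval of a normal distribution (about 1.645)
Z90 = NormalDist().inv_cdf(0.95)


def lognormal_from_90_range(low, high):
    """Return (mu, sigma) of a lognormal whose 5th percentile is `low`
    and whose 95th percentile is `high`. The lognormal shape is an assumption."""
    if not 0 < low < high:
        raise ValueError("need 0 < low < high")
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return mu, sigma


def exceedance_probability(threshold, mu, sigma):
    """Probability that a value drawn from that lognormal exceeds `threshold`."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(threshold))


# Illustrative estimate: "90% confident the figure is between 20,000 and 400,000".
mu, sigma = lognormal_from_90_range(20_000, 400_000)
print(exceedance_probability(400_000, mu, sigma))  # approximately 0.05 by construction
print(exceedance_probability(100_000, mu, sigma))  # chance of exceeding 100,000
```

A reviewer who disputes the range changes two numbers and the exceedance probabilities move with them, which is the falsifiability described above.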
Decomposing an AI abuse case
Take a concrete example: an indirect injection attack on an ops agent with payment authority. The agent ingests documents from external sources. A malicious document steers it into initiating an unauthorised payment.
Decompose loss event frequency into two factors.
Threat event frequency is how often the agent encounters a malicious injection attempt in a payment-capable context. This is a function of how much untrusted content the agent ingests and how attractive the payment path is to an attacker. Your fraud operations team and your threat-modelling group can give you a calibrated 90% range: they know ingestion volumes, and they know how many payment-capable workflows exist. Build frequency up from those inputs rather than guessing at an overall rate.
Vulnerability is the proportion of those attempts that would actually produce an unauthorised action, given the controls in place. Guardrails that catch injection attempts, least-privilege scoping that limits the agent’s payment authority, and human-in-the-loop gates that intercept suspicious requests all move this factor downward.
Decompose loss magnitude into two further factors.
Primary loss is the value movable in a single successful event. This is bounded directly by the agent’s per-transaction authority and its daily aggregate cap. These are engineering parameters you set. They are also the most direct lever you have on tail loss.
Secondary loss covers remediation costs, customer redress, investigation effort, and regulatory consequences. Size these as ranges, not point values. Do not attach specific regulatory penalty figures to the secondary component: penalty quanta are case-specific and change frequently. Carry the regulatory element as a flagged range with an explicit assumption recorded behind it.
Run a Monte Carlo simulation across those four ranges and you get a loss distribution. You can read off expected annualised loss and, more usefully for a board, the tail: the loss you have a 1-in-20 chance of exceeding in any given year. That tail figure is what matters in a regulated firm. A low-frequency, high-magnitude agentic failure can breach an operational impact tolerance even with a small expected annual loss.
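The simulation across the four ranges can be sketched in a few dozen lines of standard-library Python. Everything numeric below is a placeholder, not a measurement. The lognormal shape for each range, the Poisson model for event counts, and the names `sample_range` and `simulate_annual_losses` are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # z-score bounding a central 90% interval


def sample_range(rng, low, high):
    """Draw from a lognormal whose 5th/95th percentiles are low/high (assumed shape)."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return rng.lognormvariate(mu, sigma)


def poisson(rng, rate):
    """Knuth's Poisson sampler; adequate for the small event rates used here."""
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_losses(ranges, trials=20_000, seed=1):
    """Return a sorted list of simulated annual losses, one per trial year."""
    rng = random.Random(seed)
    losses = []
    for _ in range(trials):
        tef = sample_range(rng, *ranges["threat_events_per_year"])
        vuln = min(1.0, sample_range(rng, *ranges["vulnerability"]))
        events = poisson(rng, tef * vuln)  # loss events this year
        losses.append(sum(
            sample_range(rng, *ranges["primary_loss"])
            + sample_range(rng, *ranges["secondary_loss"])
            for _ in range(events)
        ))
    return sorted(losses)


# Placeholder 90% ranges for the four FAIR components (illustrative only).
ranges = {
    "threat_events_per_year": (5, 50),
    "vulnerability": (0.005, 0.05),
    "primary_loss": (10_000, 250_000),
    "secondary_loss": (20_000, 400_000),
}
losses = simulate_annual_losses(ranges)
expected = sum(losses) / len(losses)
tail_1_in_20 = losses[int(0.95 * len(losses))]
print(f"expected annual loss: {expected:,.0f}")
print(f"1-in-20 annual loss:  {tail_1_in_20:,.0f}")
```

Sorting the simulated years is what makes the tail readable: the value 95% of the way up the list is the loss exceeded in roughly one year in twenty.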
Autonomy as a magnitude multiplier
Here is the feature that makes AI risk quantification different from standard operational risk. Autonomy level is a multiplier on the loss distribution, and it acts in a specific and quantifiable way.
On the human-gated half of the autonomy spectrum, a human intercepts attempted actions before they have effect. That interception reduces loss event frequency: most attempts never become loss events at all. A “suggest-only” copilot and a “draft-for-approval” agent carry the autonomy multiplier primarily in the frequency factor.
On the agent-executes half of the spectrum, no human intercepts individual actions. An agent that acts with guardrails or acts autonomously turns any attempt that gets past its automated controls into a loss event, and the question becomes how large that event is. The multiplier now sits in the magnitude factor, bounded only by the aggregate limits and circuit breakers you have applied.
The practical consequence: moving a workflow from draft-for-approval to act-with-guardrails can swing the loss distribution by an order of magnitude with no change in attacker behaviour. That makes the autonomy decision itself a board-level risk decision, not an engineering convenience. It should arrive at the board with a before-and-after loss distribution, not a feature-specification slide.
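A before-and-after comparison of that kind might be produced as follows. This is a sketch under stated assumptions: the 90% human catch rate for draft-for-approval, the 100,000 cap on primary loss, and every range are hypothetical, and `annual_loss_profile` is an illustrative name rather than part of any FAIR tool.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # z-score for a central 90% interval


def draw(rng, low, high):
    """Sample a lognormal whose 5th and 95th percentiles are low and high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return rng.lognormvariate(mu, sigma)


def poisson(rng, rate):
    """Knuth's Poisson sampler, fine for the small rates used here."""
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def annual_loss_profile(human_catch_rate, primary_cap, trials=20_000, seed=7):
    """Return (expected, 1-in-20) annual loss for one autonomy setting."""
    rng = random.Random(seed)
    annual = []
    for _ in range(trials):
        attempts = draw(rng, 5, 50)                # injection attempts per year
        bypass = min(1.0, draw(rng, 0.005, 0.05))  # share beating automated controls
        # A per-action human gate removes events, so it acts on frequency.
        events = poisson(rng, attempts * bypass * (1.0 - human_catch_rate))
        # Transaction caps bound the primary loss of each event that does occur.
        annual.append(sum(
            min(draw(rng, 10_000, 250_000), primary_cap) + draw(rng, 20_000, 400_000)
            for _ in range(events)
        ))
    annual.sort()
    return sum(annual) / trials, annual[int(0.95 * trials)]


# Same attacker behaviour in both runs; only the autonomy setting differs.
draft_for_approval = annual_loss_profile(human_catch_rate=0.9, primary_cap=100_000)
act_with_guardrails = annual_loss_profile(human_catch_rate=0.0, primary_cap=100_000)
for label, (expected, tail) in [("draft-for-approval", draft_for_approval),
                                ("act-with-guardrails", act_with_guardrails)]:
    print(f"{label:20s} expected {expected:>10,.0f}   1-in-20 {tail:>10,.0f}")
```

The attacker inputs are identical in both runs, so any difference between the two rows comes from the autonomy decision alone.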
It also tells you where a human-in-the-loop gate buys the most risk reduction. The gate is worth most exactly where the multiplier is steepest. That is rarely uniform across your AI estate.
Estimating frequency honestly when data is thin
The most common objection I hear to quantifying AI risk is that we lack actuarial frequency data for agentic attacks. They are new. The public incident record is thin and not specific to financial services. That is true. It is not an argument for a heat map. It is an argument for calibrated estimation with honest uncertainty.
Four moves keep the estimation defensible.
First, use calibrated expert ranges, not invented precision. Have your threat-modelling and fraud operations people give 90% confidence ranges, record who estimated what and why, and publish that provenance alongside the output. A wide, defensible range beats a narrow, fabricated one every time.
Second, decompose to where you do have data. You may not know “injection success rate” as a single number. But you often know ingestion volume, the number of payment-capable agent workflows, your control coverage, and your red-team results from build-time evaluation. Build frequency up from those components.
Third, use adjacent base rates, flagged as analogues. Insider-misuse base rates and business-email-compromise rates are usable anchors if you state explicitly that they are analogues, not measurements of the specific AI risk. The flagging matters: a board can assess an analogy; it cannot assess an unlabelled assumption.
Fourth, make uncertainty a first-class output. The width of the loss distribution is information. A wide band on a material case tells the board the firm cannot yet see the risk clearly. That is an argument for better telemetry and for caution on autonomy increases, not an argument for inaction or for a heat map.
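The second move, building frequency up from components you can actually estimate, can be sketched like this. The component names and ranges are hypothetical stand-ins for ingestion volume, workflow exposure, and red-team results, and the lognormal shape is again an assumption. The output is itself a range, which keeps the uncertainty visible rather than collapsing it to one number.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # z-score for a central 90% interval

# Hypothetical component estimates as (5th percentile, 95th percentile).
COMPONENTS = {
    "external_documents_per_year": (50_000, 200_000),
    "share_carrying_injection": (0.00005, 0.001),
    "share_reaching_payment_workflow": (0.05, 0.3),
    "red_team_bypass_rate": (0.01, 0.1),
}


def draw(rng, low, high):
    """Sample a lognormal matching the 90% range; proportions are capped at 1."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    value = rng.lognormvariate(mu, sigma)
    return min(value, 1.0) if high <= 1 else value


def loss_event_frequency_range(components, trials=20_000, seed=3):
    """Multiply the sampled components and return the (5th, 50th, 95th)
    percentiles of loss events per year."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        product = 1.0
        for low, high in components.values():
            product *= draw(rng, low, high)
        samples.append(product)
    samples.sort()
    return samples[int(0.05 * trials)], samples[trials // 2], samples[int(0.95 * trials)]


low, median, high = loss_event_frequency_range(COMPONENTS)
print(f"loss events per year, 90% range: {low:.4f} to {high:.4f} (median {median:.4f})")
```

Recording who supplied each component range, and why, alongside this dictionary is one simple way to publish the provenance with the output.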
Ranking control spend
Quantification earns its keep when it ranks where to spend on controls. Two outputs do that work.
The first is a risk ranking by expected and tail loss. Order your AI abuse cases by annualised expected loss and by tail (the 1-in-20 figure). Tail loss matters in a regulated firm because impact tolerances are set against low-frequency, high-magnitude events.
The second is control return on investment as risk reduction per pound. Re-run the simulation with a proposed control applied. A human-in-the-loop gate reduces vulnerability (a frequency effect). Tighter least-privilege scoping also reduces vulnerability. Aggregate transaction limits cap the magnitude factor directly. Report the shift in the loss distribution against the control’s cost. This is how you defend deprioritising a control as much as how you fund one. A control that barely moves the curve is a documented, defensible “not now.”
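Here is one way the re-run might look, as a sketch rather than a definitive method. The three controls, their assumed effects (a scaling of vulnerability or a cap on primary loss), and their annual costs are all hypothetical. The aggregate limit is simplified to a cap on each event's primary loss.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # z-score for a central 90% interval


def draw(rng, low, high):
    """Sample a lognormal whose 5th and 95th percentiles are low and high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return rng.lognormvariate(mu, sigma)


def poisson(rng, rate):
    """Knuth's Poisson sampler, fine for the small rates used here."""
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(vulnerability_scale=1.0, primary_cap=math.inf, trials=20_000, seed=11):
    """Return (expected, 1-in-20) annual loss under the given control effects."""
    rng = random.Random(seed)
    annual = []
    for _ in range(trials):
        attempts = draw(rng, 5, 50)
        vulnerability = min(1.0, draw(rng, 0.005, 0.05) * vulnerability_scale)
        events = poisson(rng, attempts * vulnerability)
        annual.append(sum(
            min(draw(rng, 10_000, 250_000), primary_cap) + draw(rng, 20_000, 400_000)
            for _ in range(events)
        ))
    annual.sort()
    return sum(annual) / trials, annual[int(0.95 * trials)]


# Hypothetical controls: (assumed effect on the model, assumed annual cost in pounds).
CONTROLS = {
    "human-in-the-loop gate": ({"vulnerability_scale": 0.2}, 120_000),
    "tighter least-privilege scoping": ({"vulnerability_scale": 0.6}, 30_000),
    "aggregate transaction limit": ({"primary_cap": 25_000}, 10_000),
}

base_expected, base_tail = simulate()
reduction_per_pound = {}
for name, (effect, annual_cost) in CONTROLS.items():
    expected, tail = simulate(**effect)
    reduction_per_pound[name] = ((base_expected - expected) / annual_cost,
                                 (base_tail - tail) / annual_cost)

for name, (exp_gain, tail_gain) in sorted(reduction_per_pound.items(),
                                          key=lambda item: -item[1][1]):
    print(f"{name:32s} expected {exp_gain:6.2f}   tail {tail_gain:6.2f}  per pound spent")
```

A control that lands at the bottom of this ranking on both measures is the documented "not now", and the inputs behind that call stay open to challenge.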
Keep the method proportionate. Quantify the cases that are plausibly material and the control decisions that are genuinely contested. A one-line qualitative rationale is fine for the rest. The goal is better decisions, not a quantification industry.
What to do this week
- Identify your two or three most material AI abuse cases: the ones where a successful attack would breach an operational impact tolerance.
- For each, write down the four FAIR components: threat event frequency, vulnerability, primary loss, and secondary loss. Give each a 90% confidence range, not a point value.
- Run even a simple spreadsheet simulation across those ranges. The shape of the output will tell you whether the expected loss or the tail is the governing concern.
- Model the autonomy multiplier explicitly: run the same case at draft-for-approval and at act-with-guardrails and compare the tail loss figures.
- Bring the two or three cases to your next board or risk committee meeting as loss distributions, not traffic lights.