For Risk Managers
Risk managers often treat AI-generated risk models as rigorous because they came from a system, without realising the system inherited the blind spots of its training data. Your board reports become dangerously smooth when you summarise AI outputs instead of surfacing where the model breaks down.
These are observations, not criticism. Recognising the pattern is the first step.
SAS generates scenarios based on historical correlations and volatility patterns. You may assume these scenarios are robust because they came from enterprise software, but the assumptions about asset correlations, tail behaviour, and regime shifts are embedded and invisible. When you skip the step of challenging these assumptions with your team, you build board reporting on foundations nobody has examined.
The fix
Before you present any SAS scenario output to the board, run a 30-minute session where your team writes down three assumptions they think the model made, then verify each one against the model documentation.
You paste a complex risk report into ChatGPT and ask it to summarise for the board. The output reads polished and confident. What you have is a language model that has smoothed away the uncertainty, compressed contradictions, and removed the caveats that matter most to your board's decision-making. You have lost the texture of the actual risk.
The fix
Never use ChatGPT summaries in board papers; instead use it only to draft an outline, then write the actual summary yourself by hand, keeping every material uncertainty visible.
These systems excel at monitoring known risk types and detecting deviations from baselines. They are nearly blind to risks that do not resemble anything in their training dataset. Your operational risk may shift because of a regulatory change, a supply chain shock, or a technology shift that has no historical precedent. The system will not alert you. You will think your emerging risk programme is working when it is only watching the rear-view mirror.
The fix
Establish a separate, manual emerging risk process that runs parallel to your AI monitoring; assign one risk manager to spend two hours each month reading horizon-scanning sources and competitor announcements, then challenge the team on what the AI systems might miss.
Azure AI generates stress scenarios quickly by running correlations across thousands of variables. You pick a scenario that looks relevant to your portfolio. You do not know which variables the model allowed to move together and which it held constant. This creates an invisible risk: your scenario may assume correlations that will break precisely when you need them to hold, or vice versa.
The fix
For every Azure scenario you use in a board paper, export the correlation matrix and variance assumptions, then have your head of trading or operations confirm whether those assumptions reflect real market behaviour.
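As a concrete starting point, here is a minimal sketch of that check in Python, assuming you can export the scenario's assumed correlation matrix to CSV and have your own daily factor returns to hand. The file names, column layout, and the 0.25 flag threshold are placeholders for illustration, not anything the vendor provides.

```python
# Minimal sketch: compare the correlations a scenario tool assumed
# against correlations realised in your own market data.
# File names and the threshold are placeholders, not a vendor API.
import pandas as pd

# Assumed correlation matrix exported from the scenario tool (square CSV)
assumed = pd.read_csv("scenario_assumed_correlations.csv", index_col=0)

# Your own daily returns for the same risk factors
returns = pd.read_csv("daily_factor_returns.csv", index_col=0, parse_dates=True)
realised = returns.corr()

# Align on the factors both sources cover, then flag the largest gaps
common = assumed.index.intersection(realised.index)
diff = (assumed.loc[common, common] - realised.loc[common, common]).abs()

threshold = 0.25  # judgement call: flag gaps larger than 0.25
flags = diff.stack().loc[lambda s: s > threshold].sort_values(ascending=False)
print("Factor pairs where assumed and realised correlations diverge most:")
print(flags.head(20))
```

The output gives your head of trading or operations a short list of factor pairs to confirm or challenge, rather than a blanket request to "review the assumptions".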
AI risk models perform well in normal market conditions and in moderately stressed conditions where data is plentiful. They collapse in tail events, regime shifts, and periods of correlation breakdown. Because you cannot easily see this weakness in the software interface, you may build confidence in the model and scale back your manual stress-testing. Then the tail event arrives and your board discovers your risk limits were based on a system that does not work when it matters most.
The fix
Run your AI model against three historical crises where correlations broke down or volatility spiked unexpectedly, document where the model's outputs diverged from reality, and present these failures to the board alongside your normal results.
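One way to make that exercise repeatable is a small script that counts VaR breaches inside each crisis window. The sketch below assumes you can export a daily P&L series and the model's daily 99 per cent VaR to CSV; the window dates are illustrative and the file and column names are placeholders.

```python
# Minimal sketch: count VaR breaches in historical stress windows.
# Crisis dates are illustrative; file and column names are placeholders.
import pandas as pd

crisis_windows = {
    "GFC autumn 2008":  ("2008-09-01", "2009-03-31"),
    "Covid March 2020": ("2020-02-15", "2020-04-15"),
    "Rates shock 2022": ("2022-09-01", "2022-10-31"),
}

pnl = pd.read_csv("daily_portfolio_pnl.csv", index_col=0, parse_dates=True)["pnl"]
model_var = pd.read_csv("model_daily_var.csv", index_col=0, parse_dates=True)["var_99"]

rows = []
for name, (start, end) in crisis_windows.items():
    window_pnl = pnl.loc[start:end]
    window_var = model_var.loc[start:end]
    breaches = int((window_pnl < -window_var).sum())
    rows.append({
        "window": name,
        "days": len(window_pnl),
        "var_breaches": breaches,
        # a 99% VaR should be breached on roughly 1% of days if the model holds
        "expected_if_model_holds": round(0.01 * len(window_pnl), 1),
    })

print(pd.DataFrame(rows))
```

Where the breach count runs far above the expected figure, that window belongs in the board paper as a documented failure mode.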
Your SAS model produces a Value at Risk figure. You report it to the board as a single number. The board treats it as fact rather than an estimate with a margin of error. You have converted uncertainty into false precision. The board makes decisions on capital allocation or hedging based on a number that might move by 30 per cent if one assumption changes.
The fix
Always present AI-generated risk metrics as a range with explicit lower and upper bounds, and tell the board which single assumption drives the width of that range.
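To see how a range like that might be produced, here is a minimal sketch using a plain historical-simulation VaR, with the lookback window as the single assumption being varied. Your production model will differ; the file name and the lookback choices are placeholders, and the point is the shape of the disclosure, not the method.

```python
# Minimal sketch: turn a single VaR figure into a range by recomputing it
# under a few plausible settings of one key assumption (here, the lookback
# window of a simple historical-simulation VaR).
import numpy as np
import pandas as pd

returns = pd.read_csv("daily_portfolio_returns.csv", index_col=0,
                      parse_dates=True)["return"]

def historical_var(rets, lookback_days, level=0.99):
    """One-day historical-simulation VaR, reported as a positive loss fraction."""
    window = rets.tail(lookback_days)
    return -np.quantile(window, 1 - level)

estimates = {days: historical_var(returns, days) for days in (250, 500, 1000)}

low, high = min(estimates.values()), max(estimates.values())
print(f"99% one-day VaR range: {low:.2%} to {high:.2%}")
print("Assumption driving the width: length of the historical lookback window")
for days, var in sorted(estimates.items()):
    print(f"  {days:>4}-day lookback: {var:.2%}")
```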
You explain that your new AI-powered monitoring system flags emerging risks in real-time. You do not mention that it only recognises patterns it was trained on, or that it has a six-month lag in regulatory data, or that it cannot detect risks in private companies. Your board believes the risk function is more comprehensive than it actually is. When a blind spot becomes a loss, the board discovers that the tool was always constrained.
The fix
In every board presentation of AI risk output, include a one-paragraph section titled 'What this analysis cannot detect', listing the specific risk categories and conditions outside the system's range.
You use ChatGPT or a similar tool to summarise cross-asset risks. The language model resolves ambiguity and picks a direction. Your actual market view is that rates could rise or fall, equities are vulnerable but supported by earnings, credit spreads are wide but not unsustainable. The AI summary reads: 'Market faces headwind from rate risk.' You report this to the board. You have lost the genuine two-sided nature of your risk position.
The fix
When you ask AI to summarise risk, always ask it to output three scenarios with roughly equal probability, then present all three to the board without picking one.
You build a dashboard in Azure AI or IBM OpenPages that shows all major risks as traffic lights. The board looks at the dashboard, sees no red, and moves on. You have replaced a 30-minute discussion about what could go wrong with a binary status check. The board has less information, not more, because you have hidden all the uncertainty inside a system that does not talk back.
The fix
Use AI dashboards only for monitoring known risks; reserve your board risk agenda for questions that require human judgement: what are we not monitoring, what has changed since last quarter, and which assumptions are we least confident in.
SAS, Azure, or Palantir updates its algorithms or retrains on new data. Your risk metrics shift. If you do not track these changes explicitly, you may report a deterioration in risk that is actually a change in measurement. Your board interprets the shift as a real change in the portfolio or market, when it actually reflects a change in how the tool measures risk. You have created a spurious risk signal.
The fix
Subscribe to your AI tool provider's release notes, and when a material update occurs, calculate your key risk metrics using both the old and new methodology, then explain the difference to the board.
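The reconciliation itself is simple arithmetic once you have the same metric under both methodologies for the same reporting date, for example from a vendor parallel run. A sketch with placeholder figures:

```python
# Minimal sketch: separate "the risk changed" from "the measurement changed".
# All figures are illustrative placeholders; substitute your own parallel-run
# numbers for the same reporting date.
old_var = 41.2            # last reporting date, old methodology
new_var = 48.7            # same date, new methodology
previous_reported = 40.5  # prior reporting date, old methodology

measurement_effect = new_var - old_var
portfolio_effect = old_var - previous_reported

print(f"Reported VaR moved from {previous_reported} to {new_var}")
print(f"  of which methodology change:       {measurement_effect:+.1f}")
print(f"  of which portfolio/market change:  {portfolio_effect:+.1f}")
```

Presented this way, the board sees how much of the headline move is measurement and how much is a genuine change in the book.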
Your organisation uses SAS Risk AI for scenario analysis. Your peer banks use SAS. Your counterparties use SAS or similar tools such as IBM OpenPages. When SAS has a bug, gets a data feed wrong, or uses a flawed assumption, thousands of risk managers across the industry generate correlated mistakes simultaneously. Your diversified risk view is not actually diversified at the system level. A single failure in the AI layer affects the entire financial system.
The fix
Build a manual scenario process that uses a completely different methodology from your AI system; run it quarterly and present results to the board to show where the AI and manual approaches diverge.
Your monitoring staff used to read client communications, regulatory filings, and market commentary to sense problems before they showed up in data. Now they watch the Palantir dashboard. They get an alert when a known risk metric breaches a threshold. They stop reading widely. Your institutional risk radar atrophies. When a novel risk emerges that does not resemble anything the AI was trained on, nobody in your team catches it early.
The fix
Assign one senior risk manager to spend four hours per week reading sources the AI cannot process (earnings call transcripts, regulatory proposal comment periods, industry publications), and require them to report one emerging risk per month to the team.
Your risk dashboards, alerts, and scenario tools are all AI-powered. In a normal month, they work. In a crisis, when volatility spikes, correlations break, and data feeds lag, these systems often fail or produce nonsensical outputs. You have not tested this. A crisis is exactly when you need your monitoring most, and it is also when your tools become unreliable. Your team has no manual backup.
The fix
Run a half-day desktop exercise per year where your team assumes all AI monitoring is offline; work through how you would identify and escalate the top three risks using only manual data sources and human conversation.
You ask ChatGPT or Azure AI to recommend a change to your risk limits. The system produces a coherent argument. You implement it. A year later, you cannot explain to your audit committee or board why the limit is what it is. The AI reasoning is gone. You have made a material decision that nobody can reverse-engineer. If the outcome is poor, you cannot show that the decision followed a sound process.
The fix
Before you implement any AI recommendation for risk policy, have your risk team write down in plain language what the AI suggested, why they agree or disagree, and what human factors the AI did not consider; file this as part of the decision record.
Your SAS model backtests cleanly against the past five years of market data. Your Value at Risk estimates were accurate. You report this to the board as evidence that the model is sound. What you have tested is whether the model explains the past. You have not tested whether it predicts the future or behaves correctly in conditions different from the historical period. You have optimised the model to fit data that will never recur.
The fix
Instead of backtesting, conduct a forward test: set aside the most recent six months of market data, calibrate your model on the earlier period only, then measure how well it predicted the held-out months.
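A minimal sketch of that forward test, using a plain historical-simulation VaR as a stand-in for whatever your production model actually does. The file name is a placeholder and 126 trading days approximates six months.

```python
# Minimal sketch of a forward test: calibrate on earlier data only, then
# check performance on a held-out recent window. Historical-simulation VaR
# stands in here for the production model.
import numpy as np
import pandas as pd

returns = pd.read_csv("daily_portfolio_returns.csv", index_col=0,
                      parse_dates=True)["return"]

holdout_days = 126                       # roughly the most recent six months
calibration = returns.iloc[:-holdout_days]
holdout = returns.iloc[-holdout_days:]

var_99 = -np.quantile(calibration, 0.01)  # fitted on the earlier period only

breaches = int((holdout < -var_99).sum())
expected = 0.01 * len(holdout)
print(f"99% VaR calibrated on earlier data: {var_99:.2%}")
print(f"Breaches in the held-out six months: {breaches} "
      f"(about {expected:.1f} expected if the model holds up)")
```

A breach count well above the expected figure tells you the clean backtest was fitting the past, not predicting the future.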
Worth remembering