
The Most Common AI Mistakes Insurers Make

Insurers often hand over underwriting and claims decisions to AI models because the outputs seem objective and fast. This speed comes at the cost of the contextual judgement that prevents actuarially correct decisions from becoming unfair ones.

These are observations, not criticism. Recognising the pattern is the first step.


Underwriting Judgement Under AI

Underwriters trained over decades to spot unusual case patterns now defer to model scores without reading the full application. The model optimises for historical loss data, not for the edge cases where human judgement prevents bad faith claims or catches fraud that statistics alone would miss.

The fix

Require underwriters to document their reasoning when they override a model score, then review those overrides quarterly to see which human judgements prevented losses the model would have approved.
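
A minimal sketch of what that override log could look like in code, assuming a simple in-house record rather than any particular underwriting platform; the field names and the grouping logic are illustrative only.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

@dataclass
class OverrideRecord:
    # Illustrative fields for one documented override of a model score.
    case_id: str
    decided_on: date
    model_score: float
    model_decision: str       # "approve" or "decline"
    human_decision: str       # what the underwriter actually did
    rationale: str            # written reasoning captured at the time of the override
    later_evidence: str = ""  # e.g. "fraud confirmed", filled in as facts emerge

def quarterly_override_review(records: list[OverrideRecord]) -> Counter:
    """Group overrides by direction so reviewers can see where human judgement
    diverged from the model, and how often later evidence supported it."""
    tally = Counter()
    for r in records:
        direction = f"{r.model_decision} -> {r.human_decision}"
        tally[direction] += 1
        if r.later_evidence:
            tally[f"{direction} (vindicated: {r.later_evidence})"] += 1
    return tally
```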

Underwriting models output risk scores and feature importance charts, but they cannot explain the actual logic that led to a decline. When you cannot tell the applicant why they were rejected, you expose yourself to regulatory challenge and you lose the chance to correct bad input data.

The fix

After a decline, have a human underwriter state in writing the specific risk factor that justified the rejection, separate from the model score.

Underwriters use ChatGPT to summarise application data or generate file notes, but the model invents details and conflates applicant statements with assumptions. These notes then appear in regulatory files as if they document actual underwriting decisions.

The fix

Ban ChatGPT from any file documentation or decision notes, and limit it to internal brainstorming only, with a mandatory human fact-check before any output goes near a case file.

If your training data reflects past underwriting that was biased by geography, age, or background, the model will scale that bias across thousands of decisions. You inherit not just the bias but the legal liability that comes with automation at scale.

The fix

Before deploying any model for underwriting, have compliance audit the training data for known historical bias by demographic group, and flag any pattern that would fail a disparate impact test.
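
One way to make that audit concrete is a four-fifths-style check on the historical decisions the model will learn from: compare approval rates by demographic group and flag any group whose rate falls below 80 percent of the best-treated group. A sketch only, assuming a pandas table of past decisions; the column names and the 0.8 threshold are illustrative, not a substitute for your compliance team's own test.

```python
import pandas as pd

def disparate_impact_check(history: pd.DataFrame,
                           group_col: str = "demographic_group",
                           approved_col: str = "approved",
                           threshold: float = 0.8) -> pd.DataFrame:
    """Compare historical approval rates by group against the best-treated group.
    A ratio below `threshold` (the four-fifths rule) is flagged for compliance review."""
    rates = history.groupby(group_col)[approved_col].mean()
    report = pd.DataFrame({
        "approval_rate": rates,
        "ratio_to_best": rates / rates.max(),
    })
    report["flagged"] = report["ratio_to_best"] < threshold
    return report
```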

A model can be statistically accurate at predicting losses within a subgroup while still charging unfairly high premiums because the subgroup has fewer competing options. Accuracy to the data is not the same as fairness to the person.

The fix

For any model used in pricing or underwriting, require a separate fairness audit that compares approval rates and premium levels across demographic groups, with documented justification for any difference above 10 percent.
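
A sketch of that fairness audit on the model's own output, assuming decisions can be joined to a demographic reference table; the 10 percent threshold comes from the text above, while the column names and the baseline choice are assumptions.

```python
import pandas as pd

def fairness_audit(decisions: pd.DataFrame,
                   group_col: str = "demographic_group",
                   approved_col: str = "approved",
                   premium_col: str = "annual_premium",
                   max_gap: float = 0.10) -> pd.DataFrame:
    """Compare approval rates and average premiums across groups and flag any
    relative difference above `max_gap` for documented justification."""
    by_group = decisions.groupby(group_col).agg(
        approval_rate=(approved_col, "mean"),
        avg_premium=(premium_col, "mean"),
    )
    for col in ("approval_rate", "avg_premium"):
        baseline = by_group[col].mean()  # portfolio-wide average as the reference point
        by_group[f"{col}_gap"] = (by_group[col] - baseline).abs() / baseline
    by_group["needs_justification"] = (
        by_group[["approval_rate_gap", "avg_premium_gap"]] > max_gap
    ).any(axis=1)
    return by_group
```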

Claims Decision Speed Without Human Discretion

IBM Watson and SAS fraud detection tools are trained to flag patterns, not to judge intent or context. A claim that fits a fraud pattern might be legitimate given the claimant's actual circumstances, but the auto-denial goes out before anyone reads the file.

The fix

Set your fraud model to route medium-confidence flags to a claims examiner for review before any denial letter is sent, and track how often the human decision differs from the model.
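
A minimal routing sketch, independent of any particular fraud tool; the confidence bands are placeholder values, and the disagreement tracker assumes you log the model recommendation next to the examiner's final decision.

```python
def route_fraud_flag(fraud_score: float, low: float = 0.35, high: float = 0.85) -> str:
    """Medium-confidence flags always go to a claims examiner before any denial
    letter is generated; only clearly clean claims go straight through."""
    if fraud_score >= high:
        return "refer_to_investigation"   # strongest flags: investigate, still no auto-denial
    if fraud_score >= low:
        return "claims_examiner_review"   # medium band: a human reads the file first
    return "straight_through_processing"

def disagreement_rate(pairs: list[tuple[str, str]]) -> float:
    """Share of cases where the examiner's decision differed from the model's
    recommendation; pairs are (model_recommendation, human_decision)."""
    return sum(m != h for m, h in pairs) / len(pairs) if pairs else 0.0
```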

Reserving models base their estimates on historical claims similar to the current one, but they cannot account for novel injury patterns, emerging litigation trends, or claimant circumstances that justify a higher reserve. Under-reserve and you hide losses; over-reserve and you tie up capital without reason.

The fix

Have a reserving actuary review and sign off on any reserve that deviates more than 25 percent from the model recommendation, with written explanation.
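
The 25 percent trigger itself is a one-line rule; a small sketch, measuring the deviation against the model's own recommendation:

```python
def reserve_needs_actuary_signoff(model_reserve: float,
                                  proposed_reserve: float,
                                  tolerance: float = 0.25) -> bool:
    """True when the proposed reserve deviates more than `tolerance` (25 percent)
    from the model recommendation and must go to a reserving actuary."""
    if model_reserve <= 0:
        return True  # no usable model figure, so always escalate
    return abs(proposed_reserve - model_reserve) / model_reserve > tolerance
```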

Azure AI or similar tools can identify policy exclusions and auto-deny claims that technically fall outside coverage. But if the exclusion is ambiguous or was not clearly disclosed at sale, the automation locks you into indefensible positions.

The fix

For any claim denial recommended by AI, have claims counsel review whether the policy language would survive a coverage dispute before the denial is issued.

AI-drafted decision letters read smoothly and sound authoritative, but they often miss required regulatory language, omit appeal rights, or misstate the actual reason for the decision. Claimants then contest the decision based on the letter's own contradictions.

The fix

Use a template letter that embeds all required disclosures and appeal instructions, and only have human claims staff fill in the specific factual findings and decision.

Insurers often measure claims AI by how quickly it closes files or denies claims. But fast closure of a wrongly denied claim costs more in litigation and reputation than slower, accurate decisions would have cost.

The fix

Track claims AI success by accuracy of initial decision, cost per claim including litigation costs and appeals, and customer satisfaction, not by speed of closure.
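
One way to report those measures side by side, assuming your claims data carries an overturn flag, cost breakdowns, and a satisfaction score; every column name here is a placeholder.

```python
import pandas as pd

def claims_ai_scorecard(claims: pd.DataFrame) -> dict:
    """Success measured by decision quality and total cost, not closure speed.
    Illustrative columns: initial_decision_upheld (bool), handling_cost,
    litigation_cost, appeal_cost, csat_score (1-5)."""
    total_cost = claims[["handling_cost", "litigation_cost", "appeal_cost"]].sum(axis=1)
    return {
        "initial_decision_accuracy": claims["initial_decision_upheld"].mean(),
        "avg_cost_per_claim_incl_litigation": total_cost.mean(),
        "avg_customer_satisfaction": claims["csat_score"].mean(),
    }
```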

Regulatory and Fairness Exposure

SAS, Guidewire, and Azure all offer explainability tools that show which features the model weighted highest. But regulators require that you explain not just what the model looked at, but why that feature is legitimate for underwriting, and whether it proxies for a protected class.

The fix

For each feature your model uses in underwriting or pricing, document in writing how it relates to actual loss experience and confirm with compliance that it does not proxy for protected status.

When you retire manual underwriting in favour of a model, you lose the institutional memory of why certain applications were handled differently. Years later, when the model produces unfair results, you cannot show that the old human process would have done better, or that the change was considered at all.

The fix

Before deploying any model, document the underwriting guidelines, exception rules, and judgement criteria that it will replace, and keep that documentation on file.

Your Guidewire or Watson model was trained on data from 2021. Since then, the customer base has changed, economic conditions have shifted, and the data flowing into the model now is different. The model still outputs scores as if nothing changed, and you are now using stale assumptions to make current decisions.

The fix

Require a formal model audit every 12 months that compares current claims experience against the model's predictions by cohort, and flag any drift above 5 percent as cause for retraining.
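
A sketch of that annual comparison, assuming a cohort table that holds the model's predicted loss ratio alongside actual experience; the 5 percent trigger comes from the text, the column names do not.

```python
import pandas as pd

def annual_drift_audit(cohorts: pd.DataFrame,
                       predicted_col: str = "predicted_loss_ratio",
                       actual_col: str = "actual_loss_ratio",
                       max_drift: float = 0.05) -> pd.DataFrame:
    """Flag any cohort whose actual experience drifts more than `max_drift`
    (relative) from the model's prediction as a candidate for retraining."""
    out = cohorts.copy()
    out["drift"] = (out[actual_col] - out[predicted_col]).abs() / out[predicted_col]
    out["retrain_flag"] = out["drift"] > max_drift
    return out
```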

As AI models handle more decisions, fewer underwriters are hired or trained to do manual review. When the model fails or you need to make judgement calls on complex cases, you no longer have the expertise in house to do it fairly.

The fix

Assign at least one senior underwriter to review and mentor on all AI-overridden cases, and rotate junior staff through manual underwriting decisions even if the AI handles most cases.

You license SAS AI or Microsoft Azure AI and assume the vendor bears responsibility if the model is biased or unfair. But your regulators hold you accountable for every underwriting decision your organisation makes, regardless of who built the tool.

The fix

Add a contract clause requiring your AI vendor to provide detailed documentation of training data, model architecture, and bias testing, and conduct your own independent fairness audit before and after deployment.
