40 Questions Government and Public Sector Should Ask Before Trusting AI
When a Microsoft Copilot suggestion or Palantir recommendation shapes a decision about public services, you need to know whether that suggestion is sound and whether you can defend it to citizens and elected officials. Asking the right questions before you act protects both the people affected by your decisions and your own accountability.
These are suggestions. Use the ones that fit your situation.
1. If a citizen or MP asks why their benefit application was rejected, can you explain the actual reasons without referencing the AI tool?
2. Does your AI tool (Copilot, ChatGPT, GOV.UK AI tools) show you which data inputs it weighted most heavily in its recommendation?
3. Have you tested whether the same case given to the AI tool twice produces the same recommendation? (A simple consistency check is sketched after question 10.)
4. If the AI recommendation conflicts with your case worker's assessment, which one carries the decision, and who is held accountable if it goes wrong?
5. Can you name the specific policy rules or precedents the AI is applying, or is it inferring patterns from training data you have not reviewed?
6. If this decision affects someone's access to housing, healthcare, or income support, have you documented your reasoning separately from the AI output?
7. Does your procurement contract for the AI tool require the vendor to explain how their system works at the level of detail you need for a Freedom of Information request?
8. Who in your organisation can articulate in plain English why the AI recommended this action, without reading from a technical manual?
9. If a local authority or another government body challenges your decision, can you produce an audit trail showing where the AI's input ended and human judgement began?
10. Have you set a threshold for when a decision is too sensitive or consequential to delegate to an AI recommendation, even if the system is accurate?
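One lightweight way to approach question 3 is a repeat-submission check: give the tool the identical case several times and count how often the recommendation changes. The Python sketch below is illustrative only; get_recommendation is a placeholder for however your organisation actually obtains a recommendation from Copilot, ChatGPT, or any other tool.

```python
from collections import Counter

def get_recommendation(case: dict) -> str:
    """Placeholder: obtain your AI tool's recommendation for one case.

    This function is an assumption -- replace it with however your
    organisation actually gets a recommendation (API call, portal, manual entry).
    """
    raise NotImplementedError

def consistency_check(case: dict, runs: int = 5) -> Counter:
    """Submit the identical case several times and tally the recommendations."""
    return Counter(get_recommendation(case) for _ in range(runs))

# A deterministic tool returns a single entry with count == runs, e.g. Counter({'approve': 5}).
# Counter({'approve': 3, 'reject': 2}) for the same case is the inconsistency question 3 asks about.
```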
Bias and Fairness at Population Scale
11. Has the AI tool been tested on cases from every demographic group your service actually deals with, including people with disabilities, different ethnic backgrounds, and low digital literacy?
12. If the AI was trained on historical data from your own processes, have you checked whether it learned and amplified the biases already present in your staff's past decisions?
13. Does the vendor provide evidence that their system performs equally well for all groups, or only aggregate accuracy figures that hide disparities?
14. When you roll out the AI tool across multiple local authorities or services, do you have a way to spot if it performs worse in particular regions or for particular populations? (A per-group breakdown is sketched after question 20.)
15. If the AI allocates council housing, school places, or social care assessments, have you modelled what the outcomes would look like across different neighbourhoods and protected characteristics?
16. Does the AI tool have any way of knowing about context that matters for fairness, such as whether someone has a disability or speaks English as a second language?
17. Have you run a sample of cases where the AI recommended one outcome and your best case worker recommended another, and checked which one would have been fairer?
18. If the AI tool (such as Palantir) flags certain individuals or neighbourhoods as higher risk, have you validated that this prediction is not simply repeating past policing or enforcement patterns?
19. What will you do if, six months after deployment, you discover the tool systematically disadvantages a particular group?
20. Does anyone in your team have the authority to override the AI's recommendation without justifying it to senior management?
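For questions 13 and 14, a vendor's single accuracy figure can hide large differences between groups, so it is worth breaking outcomes down yourself from your own decided cases. A minimal sketch, assuming each case record carries hypothetical group, ai_decision, and correct fields; adapt the names to whatever your case management system actually exports.

```python
from collections import defaultdict

def rates_by_group(cases: list[dict], group_field: str = "group") -> dict:
    """Per-group approval rate and accuracy from a list of decided cases.

    Each case is assumed to carry three illustrative fields:
      group_field   -- e.g. region, ethnicity band, or a disability flag
      "ai_decision" -- what the tool recommended ("approve" or "reject")
      "correct"     -- whether later review found that recommendation sound
    Map these onto whatever your case management system actually exports.
    """
    totals = defaultdict(lambda: {"n": 0, "approved": 0, "correct": 0})
    for case in cases:
        t = totals[case[group_field]]
        t["n"] += 1
        t["approved"] += case["ai_decision"] == "approve"
        t["correct"] += case["correct"]
    return {
        group: {
            "cases": t["n"],
            "approval_rate": t["approved"] / t["n"],
            "accuracy": t["correct"] / t["n"],
        }
        for group, t in totals.items()
    }

# A noticeable gap in approval_rate or accuracy between groups is exactly the
# disparity that a single aggregate accuracy figure would hide.
```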
Expertise and Decision-Making
21. Are you using the AI tool to replace analytical work that your experienced staff used to do, or to support decisions they were already making?
22. When you deploy an AI tool for policy analysis or impact forecasting, what happens to the civil servants who previously did that work?
23. Can a case worker or policy officer still exercise their professional judgement, or does the AI tool's output feel like an instruction they are expected to follow?
24. Have you measured whether staff morale or trust in their own expertise has changed since the AI tool was introduced?
25. If a new policy challenge emerges that the AI tool was not trained on, do you still have the internal expertise to develop a response?
26. When ChatGPT or Copilot drafts policy papers or impact assessments, is the draft reviewed by someone with real experience in that policy area?
27. Does your training for staff using AI tools include explicit permission to reject the AI's recommendation when their professional knowledge suggests it is wrong?
28. Have you documented the specific situations where human judgement is essential and the AI tool should only inform the decision, not make it?
29. If you stopped using the AI tool tomorrow, could your team still deliver the service at the same quality?
30. Are your most experienced staff still making decisions, or have they become approvers of AI recommendations?
Governance and Failure
31. If the AI tool makes a recommendation that harms someone (wrongly denying them a service, for example), who is legally liable, and will your budget cover the claims?
32. Does your organisation have a defined process to identify and investigate when the AI tool produces systematically poor outcomes?
33. Have you negotiated a contract term that allows you to audit the AI vendor's training data and model, or does the vendor claim it is proprietary?
34. If you discover the AI tool is biased or unsafe, can you stop using it immediately, or are you locked into a multi-year procurement commitment?
35. Is someone specifically assigned to monitor whether the AI tool's recommendations are being followed, and whether they are leading to the outcomes you intended?
36. Have you established red lines for when AI recommendations should trigger a mandatory human review, such as decisions affecting children or people with mental health needs? (A rule sketch follows question 40.)
37. Does your board or leadership team receive regular reports on the AI tool's performance, accuracy, and any complaints or harms it has caused?
38. If the vendor goes out of business or stops supporting their tool, do you retain access to your own data and decision history?
39. Have you disclosed to the public that AI is being used in decisions affecting them, and if so, have you received complaints or requests for human review that you are struggling to handle?
40. Is there a named person in your organisation responsible for ensuring the AI tool complies with the Civil Service Code and public sector values, or is this responsibility assumed to be nobody's?
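Question 36's red lines are easier to enforce if they are written down as explicit rules rather than left to individual discretion. A minimal sketch of what that could look like, with hypothetical flags (involves_child, mental_health_need, loss_of_income_or_housing) standing in for whatever your own case records hold:

```python
# Illustrative red lines -- agree the real list with your safeguarding and legal leads.
MANDATORY_REVIEW_RULES = {
    "involves_child": "Decision affects a child",
    "mental_health_need": "Person has identified mental health needs",
    "loss_of_income_or_housing": "Outcome could remove income or housing",
}

def requires_human_review(case: dict) -> list[str]:
    """Return the reasons (if any) this case must go to a human reviewer.

    The flag names above are assumptions for illustration; map them onto the
    fields your own case management system records.
    """
    return [reason for flag, reason in MANDATORY_REVIEW_RULES.items() if case.get(flag)]

# Example:
# requires_human_review({"involves_child": True})  ->  ["Decision affects a child"]
```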
How to use these questions
Before you deploy any AI tool across your service, run a small pilot on genuinely representative cases and ask yourself whether the recommendations would survive scrutiny from an external auditor or ombudsman.
Write down your organisation's decision-making rules for a complex case (such as a housing allocation or safeguarding assessment) in plain language. Then ask your AI tool to apply those same rules. Compare the outputs. If they differ, you have found something the AI tool is inferring from its training data rather than from your written rules.
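One way to run that comparison, assuming your written rules can be expressed as a small hand-coded function and that you can fetch the AI tool's recommendation for the same cases (both functions below are placeholders, not any vendor's real API):

```python
def written_rules_decision(case: dict) -> str:
    """Your organisation's plain-language rules, encoded by hand.

    Illustrative only -- e.g. a housing allocation rule might read:
    priority if homeless or overcrowded, otherwise standard waiting list.
    """
    if case.get("homeless") or case.get("overcrowded"):
        return "priority"
    return "standard"

def ai_tool_decision(case: dict) -> str:
    """Placeholder for however you obtain the AI tool's recommendation."""
    raise NotImplementedError

def compare(cases: list[dict]) -> list[dict]:
    """List every case where the AI output differs from the written rules."""
    disagreements = []
    for case in cases:
        rules, ai = written_rules_decision(case), ai_tool_decision(case)
        if rules != ai:
            disagreements.append({"case": case, "rules_say": rules, "ai_says": ai})
    return disagreements
```

Every disagreement on that list is a case where the tool is drawing on something other than your stated rules, and each one is worth reading individually.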
Assign one senior civil servant to act as a sceptic for the AI tool. Their job is to question outputs, spot patterns in errors, and report regularly to leadership. Make this a formal role with protected time.
When the AI tool recommends something different from what your team member would have chosen, document both recommendations and the actual outcome. Over time, you will see whether the AI is actually better at judging these cases or whether it is simply consistent in a way that feels authoritative.
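A minimal sketch of that record, assuming each logged entry captures the AI recommendation, the staff member's recommendation, and how the case eventually turned out (the CSV filename and field layout are illustrative):

```python
import csv
from datetime import date

LOG_FILE = "ai_vs_staff_decisions.csv"   # hypothetical filename

def log_disagreement(case_ref: str, ai_rec: str, staff_rec: str, outcome: str = "pending") -> None:
    """Append one disagreement to a simple CSV log."""
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([date.today().isoformat(), case_ref, ai_rec, staff_rec, outcome])

def score(log_path: str = LOG_FILE) -> dict:
    """Count how often each recommendation matched the eventual outcome."""
    tally = {"ai_matched_outcome": 0, "staff_matched_outcome": 0, "resolved": 0}
    with open(log_path, newline="") as f:
        for _day, _ref, ai_rec, staff_rec, outcome in csv.reader(f):
            if outcome == "pending":
                continue
            tally["resolved"] += 1
            tally["ai_matched_outcome"] += ai_rec == outcome
            tally["staff_matched_outcome"] += staff_rec == outcome
    return tally
```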
Talk to your local authority peers who are using the same AI tool. Ask them what went wrong first, not what went right. Most problems become visible within the first six months if anyone is looking for them.