40 Questions Government and Public Sector Should Ask Before Trusting AI
When a Microsoft Copilot suggestion or Palantir recommendation shapes a decision about public services, you need to know whether that suggestion is sound and whether you can defend it to citizens and elected officials. Asking the right questions before you act protects both the people affected by your decisions and your own accountability.
These are suggestions. Use the ones that fit your situation.
1. If a citizen or MP asks why their benefit application was rejected, can you explain the actual reasons without referencing the AI tool?
2. Does your AI tool (Copilot, ChatGPT, GOV.UK AI tools) show you which data inputs it weighted most heavily in its recommendation?
3. Have you tested whether the same case given to the AI tool twice produces the same recommendation? (A simple consistency check is sketched after question 10.)
4. If the AI recommendation conflicts with your case worker's assessment, which one carries the decision, and who is held accountable if it goes wrong?
5. Can you name the specific policy rules or precedents the AI is applying, or is it inferring patterns from training data you have not reviewed?
6. If this decision affects someone's access to housing, healthcare, or income support, have you documented your reasoning separately from the AI output?
7. Does your procurement contract for the AI tool require the vendor to explain how their system works at the level of detail you need for a Freedom of Information request?
8. Who in your organisation can articulate in plain English why the AI recommended this action, without reading from a technical manual?
9. If a local authority or another government body challenges your decision, can you produce an audit trail showing where the AI's input ended and human judgement began?
10. Have you set a threshold for when a decision is too sensitive or consequential to delegate to an AI recommendation, even if the system is accurate?
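One lightweight way to approach question 3 is a repeat-submission check: give the tool the identical case several times and count how often the recommendation changes. The Python sketch below is illustrative only; get_recommendation is a placeholder for however your organisation actually obtains a recommendation from Copilot, ChatGPT, or any other tool.

```python
from collections import Counter

def get_recommendation(case: dict) -> str:
    """Placeholder: obtain your AI tool's recommendation for one case.

    This function is an assumption -- replace it with however your
    organisation actually gets a recommendation (API call, portal, manual entry).
    """
    raise NotImplementedError

def consistency_check(case: dict, runs: int = 5) -> Counter:
    """Submit the identical case several times and tally the recommendations."""
    return Counter(get_recommendation(case) for _ in range(runs))

# A deterministic tool returns a single entry with count == runs, e.g. Counter({'approve': 5}).
# Counter({'approve': 3, 'reject': 2}) for the same case is the inconsistency question 3 asks about.
```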
Bias and Fairness at Population Scale
11. Has the AI tool been tested on cases from every demographic group your service actually deals with, including people with disabilities, different ethnic backgrounds, and low digital literacy?
12. If the AI was trained on historical data from your own processes, have you checked whether it learned and amplified the biases already present in your staff's past decisions?
13. Does the vendor provide evidence that their system performs equally well for all groups, or only aggregate accuracy figures that hide disparities?
14. When you roll out the AI tool across multiple local authorities or services, do you have a way to spot if it performs worse in particular regions or for particular populations? (A per-group breakdown is sketched after question 20.)
15. If the AI allocates council housing, school places, or social care assessments, have you modelled what the outcomes would look like across different neighbourhoods and protected characteristics?
16. Does the AI tool have any way of knowing about context that matters for fairness, such as whether someone has a disability or speaks English as a second language?
17. Have you run a sample of cases where the AI recommended one outcome and your best case worker recommended another, and checked which one would have been fairer?
18. If the AI tool (such as Palantir) flags certain individuals or neighbourhoods as higher risk, have you validated that this prediction is not simply repeating past policing or enforcement patterns?
19. What will you do if, six months after deployment, you discover the tool systematically disadvantages a particular group?
20. Does anyone in your team have the authority to override the AI's recommendation without justifying it to senior management?
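For questions 13 and 14, a vendor's single accuracy figure can hide large differences between groups, so it is worth breaking outcomes down yourself from your own decided cases. A minimal sketch, assuming each case record carries hypothetical group, ai_decision, and correct fields; adapt the names to whatever your case management system actually exports.

```python
from collections import defaultdict

def rates_by_group(cases: list[dict], group_field: str = "group") -> dict:
    """Per-group approval rate and accuracy from a list of decided cases.

    Each case is assumed to carry three illustrative fields:
      group_field   -- e.g. region, ethnicity band, or a disability flag
      "ai_decision" -- what the tool recommended ("approve" or "reject")
      "correct"     -- whether later review found that recommendation sound
    Map these onto whatever your case management system actually exports.
    """
    totals = defaultdict(lambda: {"n": 0, "approved": 0, "correct": 0})
    for case in cases:
        t = totals[case[group_field]]
        t["n"] += 1
        t["approved"] += case["ai_decision"] == "approve"
        t["correct"] += case["correct"]
    return {
        group: {
            "cases": t["n"],
            "approval_rate": t["approved"] / t["n"],
            "accuracy": t["correct"] / t["n"],
        }
        for group, t in totals.items()
    }

# A noticeable gap in approval_rate or accuracy between groups is exactly the
# disparity that a single aggregate accuracy figure would hide.
```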
Expertise and Decision-Making
21. Are you using the AI tool to replace analytical work that your experienced staff used to do, or to support decisions they were already making?
22. When you deploy an AI tool for policy analysis or impact forecasting, what happens to the civil servants who previously did that work?
23. Can a case worker or policy officer still exercise their professional judgement, or does the AI tool's output feel like an instruction they are expected to follow?
24. Have you measured whether staff morale or trust in their own expertise has changed since the AI tool was introduced?
25. If a new policy challenge emerges that the AI tool was not trained on, do you still have the internal expertise to develop a response?
26. When ChatGPT or Copilot drafts policy papers or impact assessments, is the draft reviewed by someone with real experience in that policy area?
27. Does your training for staff using AI tools include explicit permission to reject the AI's recommendation when their professional knowledge suggests it is wrong?
28. Have you documented the specific situations where human judgement is essential and the AI tool should only inform the decision, not make it?
29. If you stopped using the AI tool tomorrow, could your team still deliver the service at the same quality?
30. Are your most experienced staff still making decisions, or have they become approvers of AI recommendations?
Governance and Failure
31. If the AI tool makes a recommendation that harms someone (wrongly denying them a service, for example), who is legally liable, and will your budget cover the claims?
32. Does your organisation have a defined process to identify and investigate when the AI tool produces systematically poor outcomes?
33. Have you negotiated a contract term that allows you to audit the AI vendor's training data and model, or does the vendor claim it is proprietary?
34. If you discover the AI tool is biased or unsafe, can you stop using it immediately, or are you locked into a multi-year procurement commitment?
35. Is someone specifically assigned to monitor whether the AI tool's recommendations are being followed, and whether they are leading to the outcomes you intended?
36. Have you established red lines for when AI recommendations should trigger a mandatory human review, such as decisions affecting children or people with mental health needs? (A rule sketch follows question 40.)
37. Does your board or leadership team receive regular reports on the AI tool's performance, accuracy, and any complaints or harms it has caused?
38. If the vendor goes out of business or stops supporting their tool, do you retain access to your own data and decision history?
39. Have you disclosed to the public that AI is being used in decisions affecting them, and if so, have you received complaints or requests for human review that you are struggling to handle?
40. Is there a named person in your organisation responsible for ensuring the AI tool complies with the Civil Service Code and public sector values, or is this responsibility assumed to be nobody's?
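Question 36's red lines are easier to enforce if they are written down as explicit rules rather than left to individual discretion. A minimal sketch of what that could look like, with hypothetical flags (involves_child, mental_health_need, loss_of_income_or_housing) standing in for whatever your own case records hold:

```python
# Illustrative red lines -- agree the real list with your safeguarding and legal leads.
MANDATORY_REVIEW_RULES = {
    "involves_child": "Decision affects a child",
    "mental_health_need": "Person has identified mental health needs",
    "loss_of_income_or_housing": "Outcome could remove income or housing",
}

def requires_human_review(case: dict) -> list[str]:
    """Return the reasons (if any) this case must go to a human reviewer.

    The flag names above are assumptions for illustration; map them onto the
    fields your own case management system records.
    """
    return [reason for flag, reason in MANDATORY_REVIEW_RULES.items() if case.get(flag)]

# Example:
# requires_human_review({"involves_child": True})  ->  ["Decision affects a child"]
```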
How to use these questions
Before you deploy any AI tool across your service, run a small pilot on genuinely representative cases and ask yourself whether the recommendations would survive scrutiny from an external auditor or ombudsman.
Write down your organisation's decision-making rules for a complex case (such as a housing allocation or safeguarding assessment) in plain language. Then ask your AI tool to apply those same rules. Compare the outputs. If they differ, you have found something the AI tool is inferring from its training data rather than from your written rules.
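One way to run that comparison, assuming your written rules can be expressed as a small hand-coded function and that you can fetch the AI tool's recommendation for the same cases (both functions below are placeholders, not any vendor's real API):

```python
def written_rules_decision(case: dict) -> str:
    """Your organisation's plain-language rules, encoded by hand.

    Illustrative only -- e.g. a housing allocation rule might read:
    priority if homeless or overcrowded, otherwise standard waiting list.
    """
    if case.get("homeless") or case.get("overcrowded"):
        return "priority"
    return "standard"

def ai_tool_decision(case: dict) -> str:
    """Placeholder for however you obtain the AI tool's recommendation."""
    raise NotImplementedError

def compare(cases: list[dict]) -> list[dict]:
    """List every case where the AI output differs from the written rules."""
    disagreements = []
    for case in cases:
        rules, ai = written_rules_decision(case), ai_tool_decision(case)
        if rules != ai:
            disagreements.append({"case": case, "rules_say": rules, "ai_says": ai})
    return disagreements
```

Every disagreement on that list is a case where the tool is drawing on something other than your stated rules, and each one is worth reading individually.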
Assign one senior civil servant to act as a sceptic for the AI tool. Their job is to question outputs, spot patterns in errors, and report regularly to leadership. Make this a formal role with protected time.
When the AI tool recommends something different from what your team member would have chosen, document both recommendations and the actual outcome. Over time, you will see whether the AI is actually better at judging these cases or whether it is simply consistent in a way that feels authoritative.
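A minimal sketch of that record, assuming each logged entry captures the AI recommendation, the staff member's recommendation, and how the case eventually turned out (the CSV filename and field layout are illustrative):

```python
import csv
from datetime import date

LOG_FILE = "ai_vs_staff_decisions.csv"   # hypothetical filename

def log_disagreement(case_ref: str, ai_rec: str, staff_rec: str, outcome: str = "pending") -> None:
    """Append one disagreement to a simple CSV log."""
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([date.today().isoformat(), case_ref, ai_rec, staff_rec, outcome])

def score(log_path: str = LOG_FILE) -> dict:
    """Count how often each recommendation matched the eventual outcome."""
    tally = {"ai_matched_outcome": 0, "staff_matched_outcome": 0, "resolved": 0}
    with open(log_path, newline="") as f:
        for _day, _ref, ai_rec, staff_rec, outcome in csv.reader(f):
            if outcome == "pending":
                continue
            tally["resolved"] += 1
            tally["ai_matched_outcome"] += ai_rec == outcome
            tally["staff_matched_outcome"] += staff_rec == outcome
    return tally
```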
Talk to your local authority peers who are using the same AI tool. Ask them what went wrong first, not what went right. Most problems become visible within the first six months if anyone is looking for them.