
The NHS Counter Fraud Authority (NHSCFA) is an intelligence-led organisation, and that intelligence is born of information. Any information that the NHSCFA receives in the form of data is analysed by the Information Analytics (IA) team. This article briefly summarises the work we do to access and use NHS data.

Our process has been recognised as an example of best practice and has been adapted to contribute to counter fraud analytics guidance produced for the wider public sector by the Cabinet Office. Their guidance is now available and has been presented at major counter fraud events, including the counter fraud conferences of CallCredit (now TransUnion) and Portsmouth University. You can request it directly from the Cabinet Office by emailing fraud.data@cabinetoffice.gov.uk.

Data sources

Only a small part of the data we analyse is owned directly by the NHSCFA. This is data from our own data collection sources such as the Fraud and Corruption Reporting Line, our online fraud reporting tool and FIRST. FIRST is a case management system used by Local Counter Fraud Specialists (LCFSs) and NHSCFA investigators.

The vast majority of the data we use for analytical exercises must be sought and gathered from external sources. In the past, this has included dental treatment data (to identify split treatment claims), NHS trust invoice data (to identify duplicate invoices) and patient list summary data – so the context can vary greatly!
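
As an illustration of one exercise mentioned above, here is a minimal Python sketch of a duplicate invoice check. It is not the NHSCFA's actual method: the field names (`supplier_id`, `invoice_number`, `amount`) and the matching rule (same supplier, same normalised invoice number, same amount) are assumptions chosen for the example.

```python
from collections import defaultdict


def normalise_invoice_number(raw: str) -> str:
    """Upper-case and keep only letters and digits, so 'inv 001' equals 'INV-001'."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def find_duplicate_invoices(invoices):
    """Group invoices that share supplier, normalised invoice number and amount.

    Returns a list of groups (lists of invoice dicts) with more than one member.
    The field names are hypothetical; real trust invoice data will differ.
    """
    groups = defaultdict(list)
    for inv in invoices:
        key = (
            inv["supplier_id"],
            normalise_invoice_number(inv["invoice_number"]),
            round(float(inv["amount"]), 2),
        )
        groups[key].append(inv)
    return [members for members in groups.values() if len(members) > 1]


# Made-up sample records, for illustration only.
sample = [
    {"supplier_id": "S1", "invoice_number": "INV-001", "amount": "120.00"},
    {"supplier_id": "S1", "invoice_number": "inv 001", "amount": "120.0"},
    {"supplier_id": "S2", "invoice_number": "INV-001", "amount": "120.00"},
]
duplicates = find_duplicate_invoices(sample)
```

Here the two S1 records fall into one group, while the S2 record stands alone because it comes from a different supplier. A group like this is only a candidate for review, since a legitimate re-issue or an administrative error can produce the same pattern.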

Some data is open source or published, and as such, is readily available from other organisations (such as NHS England and NHS Digital). Other data requires information sharing agreements and formalised data shares. No matter where we source our data from, whether it’s in the NHS or outside of it, we need to make sure we seek and obtain it appropriately. We therefore have a strict process to identify, access, and use data effectively while meeting our responsibilities in terms of how the data is gathered and used.

Aims and objectives of data gathering

In the initial stages of planning a project, we set aims and objectives based on the NHSCFA’s strategic intelligence assessment (SIA). We define our broader aims based on the fraud risks identified in the SIA, and set more detailed business plan objectives to tackle those risks.

Objectives within wide thematic areas (e.g. procurement fraud) will need to be broken down to identify the various possibilities for fraudulent behaviour that may exist within the area (e.g. duplicate invoicing, invoice splitting, supplier fraud). This enables us to determine the most effective and productive type of exercise to identify fraud.

We always need to be cautious when using the term 'fraud' in relation to any findings from data analysis. First of all, administrative errors and other non-malicious activity can appear identical to fraud in data outputs. Secondly, the burden of proof required to establish an offence of 'fraud' can only be met in a court of law. For these reasons, we often use the term 'outliers' to refer to findings which may be indicative of fraudulent behaviour and may merit further investigation.
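
To make the idea of an 'outlier' concrete, the sketch below flags values that sit unusually far from the rest of a group. The technique (a modified z-score based on the median absolute deviation, with the conventional 3.5 cut-off) is an assumption made for illustration, as are the provider identifiers and counts. The article does not say which statistical methods the IA team uses.

```python
from statistics import median


def flag_outliers(values_by_id, threshold=3.5):
    """Return the ids whose value has a modified z-score above `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. A flagged id is a prompt for further review,
    not evidence of fraud.
    """
    values = list(values_by_id.values())
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return sorted(
        key
        for key, v in values_by_id.items()
        if 0.6745 * abs(v - mid) / mad > threshold
    )


# Made-up monthly claim counts per provider, for illustration only.
claims_per_provider = {
    "P1": 102, "P2": 98, "P3": 105, "P4": 99, "P5": 101, "P6": 240,
}
outliers = flag_outliers(claims_per_provider)
```

With these sample figures only `P6` is flagged. A median-based measure is used here because one extreme value would distort a mean and standard deviation; other choices could be equally reasonable.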

Research and planning

Once aims and objectives have been agreed, research is carried out on the subject and the data that is available in relation to it. Some of the questions we tend to ask are:

  • Do we know what data is needed to meet the desired objectives?
  • Who owns or produces this data?
  • Is the data open source and/or published and within the public domain?
  • Does the data contain personal data about NHS staff, patients or contractors?

We also need to plan carefully to make sure we comply with legislation and meet all relevant requirements in areas such as data protection. Here are some of the issues we have to consider:

  • If the data is open source, published or otherwise contains no personal data then access is much simpler and can usually be gained without much difficulty. However, this is rarely the case and more often we have to seek it from the data owner directly.
  • If there is personal data included, we need to carefully consider if the data is necessary for our exercise or if it can be removed or refined to reduce the amount gathered. If this is not possible, and the data is necessary, thought must be given to each of the principles of the Data Protection Act 2018.
  • It is also crucial that a secure method is used to receive and share data, and after receiving the data we need to convert it into a usable format. As part of the analysis process we may match data to other datasets, e.g. Ordnance Survey data, post office address file, local authority address data and death data.
  • We also need to check the accuracy of the data. This is achieved through a process of peer review.
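
Two of the considerations above, reducing the personal data gathered and matching records to another dataset, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a description of the NHSCFA's tooling: the field names, the `NEEDED_FIELDS` list, the keyed-hash pseudonymisation and the postcode-only matching rule are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical: the only fields this particular exercise needs.
NEEDED_FIELDS = ("claim_id", "postcode", "amount")


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest so records stay linkable."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict, secret_key: bytes) -> dict:
    """Keep only the needed fields and swap the patient identifier for a pseudonym."""
    reduced = {field: record[field] for field in NEEDED_FIELDS}
    reduced["patient_ref"] = pseudonymise(record["patient_id"], secret_key)
    return reduced


def normalise_postcode(postcode: str) -> str:
    """Remove whitespace and upper-case, so 'sw1a 1aa' equals 'SW1A 1AA'."""
    return "".join(postcode.split()).upper()


def match_to_reference(records, reference_postcodes):
    """Split records by whether their postcode appears in the reference set."""
    known = {normalise_postcode(p) for p in reference_postcodes}
    matched, unmatched = [], []
    for rec in records:
        target = matched if normalise_postcode(rec["postcode"]) in known else unmatched
        target.append(rec)
    return matched, unmatched


# Made-up records and key, for illustration only.
raw = [
    {"claim_id": "C1", "patient_id": "0000000001", "name": "Example Person",
     "postcode": "sw1a 1aa", "amount": 40.0},
    {"claim_id": "C2", "patient_id": "0000000002", "name": "Another Person",
     "postcode": "ZZ99 9ZZ", "amount": 55.0},
]
key = b"replace-with-a-managed-secret"
reduced = [minimise(r, key) for r in raw]
matched, unmatched = match_to_reference(reduced, ["SW1A 1AA"])
```

The reduced records no longer carry the name or the raw patient identifier, yet the same patient still maps to the same `patient_ref`. Pseudonymised data generally remains personal data, so this reduces risk without removing the need to consider the data protection principles.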

The final product

Once we have analysed the data, we have to set out our conclusions in a way which enables recipients to understand what they mean as well as the context. This will include information about the scope of our analysis, any caveats that apply to the findings and indications on how to interpret the data or apply the conclusions appropriately.

Recipients may be colleagues within the NHSCFA or external stakeholders – we work with these recipients to assess whether the results of analysis support further work, in terms of additional analysis or an extension of the scope.

In conclusion, data analysis is a very important part of what we do here at the NHSCFA; data and information allow us to put together a more complete picture of what fraud looks like in the NHS, helping us to fight it and prevent it.