
Fraud Detection Powered by Big Data - An Insurance Agency's Case Story

In the Trenches with Search & Big Data – A Blog & Video Series

Paul Nelson
Innovation Lead

How did our insurance agency customer move away from complex SQL queries and leverage a big data framework to detect $100+ million in fraud, in a fraction of the time the legacy process used to take?  

Watch the case story below.



In highly regulated sectors such as finance, healthcare, insurance, retail, and social security, combating fraud is essential: organizations face a multitude of compliance requirements, regulations, risk management measures, and monetary consequences. The proliferation of modern technology has produced more sophisticated fraud techniques, but technology advancements have also enabled smarter approaches to detecting fraud. In a world where transactions and documents are digitally recorded in one way or another, the evidence is out there to aid investigators in the battle against damaging fraudulent schemes. The more difficult question is how to find that evidence quickly and easily.


Massive Data, Legacy Data Warehouse, and SQL Queries = Delays and Headaches

Take our insurance agency customer, for example.

Traditionally, their fraud investigation team had relied on data analysts to execute SQL queries against a data warehouse that stores massive amounts of claims, billings, and other information. Due to the volume, velocity, and variety of data in the warehouse, the process could take weeks or months before enough evidence for a legal case was developed. As with any other business, the longer it takes to detect fraud, the more losses the organization suffers.


Shedding New Light on Fraud Detection Techniques with Predictive Analytics and Machine Learning


Given the vast amount of data that our investigators need to sift through to find fraudulent patterns, an integrated big data and search architecture emerged as the most feasible approach. 

  • Public data, such as providers' information, codes for healthcare procedures, etc., is aggregated and processed through the big data framework, which performs massive denormalization to join data from multiple tables and fields into flat, search-ready records.
  • The processed data is then loaded into a search engine.
  • Machine learning and predictive analytics work to pinpoint fraud red flags and proactively detect suspicious fraud schemes.
  • A search-based, graphical user interface is provided to investigators for analysis and evidence documentation.
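The first and third steps above can be illustrated with a minimal Python sketch. It is not the customer's actual pipeline, which the case story does not describe in detail. All table contents, field names, and the `denormalize` and `flag_high_billers` helpers are hypothetical. Denormalization is taken here to mean joining related tables into one flat record per claim, and a simple median-based rule stands in for the machine learning step.

```python
from statistics import median

# Hypothetical sample data; every ID, name, and amount is illustrative only.
PROVIDERS = {
    "P1": {"name": "Northside Clinic", "specialty": "Cardiology"},
    "P2": {"name": "Lakeview Medical", "specialty": "Cardiology"},
    "P3": {"name": "Harbor Health", "specialty": "Cardiology"},
}
PROCEDURES = {"93000": "Electrocardiogram", "99213": "Office visit"}
CLAIMS = [
    {"claim_id": "C1", "provider_id": "P1", "procedure_code": "93000", "amount": 120.0},
    {"claim_id": "C2", "provider_id": "P1", "procedure_code": "99213", "amount": 80.0},
    {"claim_id": "C3", "provider_id": "P2", "procedure_code": "99213", "amount": 150.0},
    {"claim_id": "C4", "provider_id": "P3", "procedure_code": "93000", "amount": 2500.0},
]


def denormalize(claims, providers, procedures):
    """Join each claim with its provider and procedure into one flat record."""
    docs = []
    for claim in claims:
        provider = providers.get(claim["provider_id"], {})
        docs.append({
            **claim,
            "provider_name": provider.get("name"),
            "provider_specialty": provider.get("specialty"),
            "procedure_description": procedures.get(claim["procedure_code"]),
        })
    return docs


def flag_high_billers(docs, ratio=3.0):
    """Return IDs of providers whose total billing exceeds `ratio` times
    the median provider total (a simple rule, not a trained model)."""
    totals = {}
    for doc in docs:
        totals[doc["provider_id"]] = totals.get(doc["provider_id"], 0.0) + doc["amount"]
    typical = median(totals.values())
    return sorted(pid for pid, total in totals.items() if total > ratio * typical)


docs = denormalize(CLAIMS, PROVIDERS, PROCEDURES)
flagged = flag_high_billers(docs)
```

Each flat record carries everything an investigator would search on, so it could be indexed into a search engine without further joins at query time. In a real deployment the rule-based flag would presumably be replaced by a trained model.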

The big data architecture enables the insurance agency's fraud detection effort to be more scalable, faster, and more accurate. Because the system processes and analyzes every record of the available data, it also gives investigators more confidence in their findings (we like this better than sampling techniques and plain hunches!).


... And an Improved Analytics Framework that Benefits Multiple Aspects of Business

Outside of the insurance industry, the same big data framework can work with all types of log data to enable better security, fraud detection, compliance, and business intelligence. Some examples include:

  • Leveraging information from interview notes, email conversations, and social media sites, then combining insights gained from those (unstructured) sources with official (structured) records and transactions
  • Comparing trends and detecting patterns in user behavior (think online shoppers or media viewers)
  • Identifying hidden relationships through network analysis and data correlation
  • Sourcing multiple content repositories of public records to find regulatory and compliance rules to detect red-flag patterns
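As one illustration of the network analysis idea in the list above, the following minimal Python sketch groups entities that share a contact detail, directly or through a chain of other entities. The records, field names, and the `linked_groups` helper are hypothetical, and union-find is just one possible technique for this kind of correlation.

```python
from collections import defaultdict

# Hypothetical provider records; all values are illustrative only.
RECORDS = [
    {"id": "P1", "phone": "555-0100", "address": "1 Main St"},
    {"id": "P2", "phone": "555-0100", "address": "9 Oak Ave"},
    {"id": "P3", "phone": "555-0199", "address": "9 Oak Ave"},
    {"id": "P4", "phone": "555-0142", "address": "7 Elm Rd"},
]


def linked_groups(records, keys=("phone", "address")):
    """Group record IDs that are connected through any shared key value."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (key, value) -> ID of the first record that had it
    for r in records:
        for key in keys:
            value = (key, r[key])
            if value in first_seen:
                # Merge this record's group with the earlier record's group.
                parent[find(r["id"])] = find(first_seen[value])
            else:
                first_seen[value] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    # Only groups with more than one member reveal a relationship.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


groups = linked_groups(RECORDS)
```

Here P1 and P3 have no detail in common, yet they land in the same group because each shares one with P2. That indirect link is the kind of hidden relationship an investigator might want surfaced.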

And the rise of big data analytics tools like Apache Hadoop and Spark, Cloudera CDH, or Elastic's ELK stack, plus cloud platforms like Amazon Web Services (AWS) or Microsoft Azure, will continue to fuel more powerful use cases of this scalable, versatile big data framework.

Fraud detection powered by big data is a use case in our “In the Trenches with Search and Big Data” video-blog series – a deep dive into six prevalent applications of big data for modern business. Check out our complete list of six successful big data use cases and stay tuned for more video stories of organizations that found success from these use cases.

Sign up for our newsletter to get the latest updates on the "In the Trenches with Search and Big Data" video-blog series.

