
Building Recommendation Engines with Big Data

In the Trenches with Search & Big Data – A Blog & Video Series

Paul Nelson
Innovation Lead

Recommendation engines play a critical role in customer engagement and retention for the online media and entertainment industry. How did one of our customers use big data to efficiently process 5.4 billion clicks daily to predict and personalize videos for users?

Find out from our story below.



Mass Entertainment Made Personal

Remember when you had to tediously browse through a huge database of movies from the 1940s and who-knows-when, or search through individual media catalogs, to find your favorite hits? Weren’t you hoping you could easily spot something you like in that vast ocean of options?

The rise of social media, multi-channel entertainment, and online content sharing has increased the demand for personalized content. Have you kept returning to that one media program or app that seems to “get” your preferences without making you fill out 50+ “Get to Know You” questions? Amidst information overload, shorter attention spans, and competing content, the only way to grab users’ attention is personalization.

How do they do that? With the exponentially growing volume of media data, recommendation engines built on big data offer a modern, user-centric approach to media delivery through efficient data processing, machine learning, and predictive analytics.


Combining Search and Big Data for a Powerful Recommendation Engine Architecture

Powerful media recommendation engines can be built for anything from movies and videos to music, books, and products - think Netflix, Pandora, or Amazon. 

In this particular big data use case, let’s focus on a video recommendation engine architecture for consumers who use a set-top box (STB), which:

  • Uses the open source Hadoop architecture as the big data foundation
  • Collects raw user data from on-demand videos, set-top box activity logs, scheduled recordings, and various media catalogs
  • Processes and analyzes user log data within the Hadoop big data framework
  • Feeds the results into a search engine which then delivers unique recommendations via a user-facing browser interface
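To make the flow in the list above concrete, here is a minimal sketch in plain Python. It only imitates the map and reduce steps that a Hadoop job would run at scale, and it is not the customer's actual implementation. The event fields, the sample data, and the shape of the per-user documents are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw events: (user_id, source, video_id).
# The field names and source labels are assumptions, not the real log schema.
RAW_EVENTS = [
    ("u1", "vod", "v10"),
    ("u1", "stb_log", "v10"),
    ("u1", "recording", "v20"),
    ("u2", "vod", "v20"),
    ("u2", "vod", "v30"),
]


def map_phase(events):
    """Emit ((user_id, video_id), 1) for each raw event, as a mapper would."""
    for user_id, _source, video_id in events:
        yield (user_id, video_id), 1


def reduce_phase(pairs):
    """Sum the counts for each (user_id, video_id) key, as a reducer would."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def to_search_documents(totals):
    """Group the totals into one document per user, ready to be indexed."""
    docs = defaultdict(dict)
    for (user_id, video_id), count in totals.items():
        docs[user_id][video_id] = count
    return [{"user_id": u, "video_counts": v} for u, v in sorted(docs.items())]


documents = to_search_documents(reduce_phase(map_phase(RAW_EVENTS)))
for doc in documents:
    print(doc)
```

In a real deployment, the last step would send these documents to the search engine's indexing API rather than printing them.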

So how would this big data recommendation engine encourage higher usage, engagement rates, and user satisfaction?

At a granular level, individual user behaviors, such as the videos watched, the catalogs clicked on, the programs scheduled for recording, average video view time, etc., are systematically analyzed following a big data log analytics methodology. At a high level, this wealth of data can paint a picture of “what’s hot” for a particular user or among groups of users with similar tastes.
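As a rough illustration of how such behaviors could be rolled up into a “what’s hot” list, the sketch below scores videos by weighted signals, either for one user or across all users. The signal names, the weights, and the sample events are hypothetical. A production system would tune the weights and would likely also factor in view time.

```python
from collections import Counter

# Hypothetical weights: a completed view counts more than a catalog click.
WEIGHTS = {"watched": 3.0, "scheduled": 2.0, "clicked": 1.0}

# Hypothetical behavior log entries.
EVENTS = [
    {"user": "u1", "video": "news", "signal": "clicked"},
    {"user": "u1", "video": "drama", "signal": "watched"},
    {"user": "u1", "video": "drama", "signal": "scheduled"},
    {"user": "u2", "video": "news", "signal": "watched"},
    {"user": "u2", "video": "sports", "signal": "clicked"},
]


def whats_hot(events, user=None, top_n=2):
    """Rank videos by weighted signal score for one user, or for everyone."""
    scores = Counter()
    for event in events:
        if user is None or event["user"] == user:
            scores[event["video"]] += WEIGHTS[event["signal"]]
    return scores.most_common(top_n)


print(whats_hot(EVENTS, user="u1"))  # hot list for a single user
print(whats_hot(EVENTS))             # hot list across all users
```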

For marketers, this is a golden tool for mining user personas and delivering what users want. 

For data analysts and architects, this recommendation engine architecture goes beyond running SQL queries against a data warehouse to predict trends and preferences. Big data allows them to be more efficient by processing massive amounts of user data in a fraction of the time required by traditional SQL. In our particular customer case, about 5.4 billion clicks daily and six months of video view data were processed in eight hours (the previous turnaround time was 23.5 hours!). The personalized results were then loaded into a search engine and displayed in an intuitive web application.
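For a sense of scale, the figures quoted above can be turned into a few back-of-the-envelope numbers. The short calculation below uses only the values stated in this post. The average click rate assumes clicks are spread evenly over the day, which is a simplification.

```python
CLICKS_PER_DAY = 5.4e9           # from the customer case above
SECONDS_PER_DAY = 24 * 60 * 60
OLD_TURNAROUND_HOURS = 23.5      # previous processing time
NEW_TURNAROUND_HOURS = 8.0       # processing time on the big data platform

# Average incoming click rate, assuming an even spread across the day.
avg_clicks_per_second = CLICKS_PER_DAY / SECONDS_PER_DAY

# How many times faster the new turnaround is, and the share of time saved.
speedup = OLD_TURNAROUND_HOURS / NEW_TURNAROUND_HOURS
time_saved_fraction = 1 - NEW_TURNAROUND_HOURS / OLD_TURNAROUND_HOURS

print(f"Average click rate: {avg_clicks_per_second:,.0f} per second")
print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {time_saved_fraction:.0%}")
```

In other words, the turnaround is just under three times faster, which frees up roughly two-thirds of the original processing window.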

For end users, personalized recommendations save them the manual work of browsing through a huge database of videos. And over time, machine learning and predictive analytics enable the recommendation engine to become more accurate at predicting users' preferences, thereby boosting user satisfaction and retention.

Looking to increase that Net Promoter Score (NPS)? Good personalization lets your users sit back and enjoy their favorite videos, and lets you achieve that desired NPS number.


The No Search of Tomorrow

Using the “collaborative filtering approach,” big data recommendation platforms can predict an individual user’s preferences without significant historical data, simply by leveraging readily available data from other users with similar characteristics. 
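Here is a minimal sketch of one common flavor of this idea, user-based collaborative filtering with cosine similarity, written in plain Python. It is an illustrative toy rather than the algorithm used in the customer project. The user names, video IDs, and implicit "watched" scores are hypothetical.

```python
from math import sqrt

# Hypothetical implicit feedback: 1.0 means the user watched the video.
RATINGS = {
    "alice": {"v1": 1.0, "v2": 1.0},
    "bob": {"v1": 1.0, "v2": 1.0, "v3": 1.0},
    "carol": {"v4": 1.0},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[v] * b[v] for v in shared)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, ratings, top_n=3):
    """Suggest unseen videos, weighted by how similar each other user is."""
    seen = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        similarity = cosine(seen, theirs)
        if similarity <= 0:
            continue  # ignore users with no overlapping taste
        for video, value in theirs.items():
            if video not in seen:
                scores[video] = scores.get(video, 0.0) + similarity * value
    # Highest score first; ties are broken by video ID for stable output.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


print(recommend("alice", RATINGS))
```

Because alice and bob share two videos, bob's remaining video is suggested to alice. Carol has no overlap with alice, so her history does not influence the result.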

Just like the big data personalization movement in e-commerce, content personalization for online media will eventually rely less on users’ data entry through search queries and more on automated personalization through machine learning and predictive analytics, as we discussed in Big Data, Personalization, and the No Search of Tomorrow.


Building big data recommendation engines is a use case in our “In the Trenches with Search and Big Data” video-blog series – a deep dive into six prevalent applications of big data for modern business. Check out our complete list of six successful big data use cases and stay tuned for more video stories of organizations that found success from these use cases.
