
How Big Data Helps Online Publishers Boost Revenue and Retention

In the Trenches with Search & Big Data – A Blog & Video Series

Paul Nelson
Innovation Lead

How can our publishing customers use a big data framework to improve search, personalize content, and continuously test search engine performance to optimize subscription revenue?

Watch the video story to find out.



The publishing industry has seen its traditional business model disrupted by rapid technology advances affecting both publishers and readers. With the rise of e-books, e-readers, and mobile apps, publishers have entered a race to optimize their digital fronts to attract online audiences. As with music and video, the easier it is for users to find and browse content, the better the user experience. And if you have a base of loyal users or subscribers, ease of discovery is key to greater revenue and retention. Many online publishers have started to combine big data and search to:

  • Collect, organize, and index content from multiple repositories, making search quick and simple for online users, whether they're on mobile or desktops.
  • Systematically analyze search logs and on-page behavior to "learn" user preferences and personalize content.
  • Optimize user experience and sales revenue through search engine testing and scoring.


Connecting Multiple Repositories for Better Search 

With a greater volume and variety of content (research, books, articles, you name it), today’s online publishers can appeal to a diverse base of readers and subscribers. But because it's now easier to gather content from multiple sources, a new challenge has emerged: once you've got all that high-value content, how do you manage it and give your online users a Google-like search experience? After all, if they can’t find it, they can’t subscribe or buy, right? So for online publishers, good website search plays a critical role in customer loyalty and the publisher's bottom line. And good website search starts with proper data processing, enrichment, and indexing, all of which can be done systematically through a big data architecture.
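To make the processing and indexing step concrete, here is a minimal sketch in plain Python. It assumes two hypothetical repositories ("journals" and "books") whose records use different field names, maps them onto one shared schema, and builds a small in-memory inverted index. The field names, the `normalize` and `build_index` helpers, and the sample records are illustrative assumptions, not part of any particular search product.

```python
import re
from collections import defaultdict


def normalize(record, source):
    """Map a repository-specific record onto one shared schema (hypothetical fields)."""
    return {
        "id": f"{source}:{record['id']}",  # prefix keeps IDs unique across repositories
        "title": record.get("title") or record.get("headline", ""),
        "body": record.get("body") or record.get("text", ""),
        "source": source,
    }


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: token -> set of document IDs containing it."""
    index = defaultdict(set)
    for doc in docs:
        for token in tokenize(doc["title"] + " " + doc["body"]):
            index[token].add(doc["id"])
    return index


def search(index, query):
    """Return IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Two made-up repositories with different field names.
journals = [{"id": 1, "headline": "Solar energy markets", "text": "Prices fell in 2015."}]
books = [{"id": 1, "title": "Energy Policy", "body": "A survey of solar and wind subsidies."}]

docs = [normalize(r, "journals") for r in journals] + [normalize(r, "books") for r in books]
index = build_index(docs)
print(search(index, "solar energy"))
```

A production setup would hand the normalized documents to a real search engine rather than a Python dictionary, but the shape of the work is the same: normalize first, then index once, so a single query covers every repository.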


Personalized Content with Log Analytics and Machine Learning


Think of ProQuest, Bloomberg, or the Business Journals. How many visitors and subscribers visit their websites every day? Every hour? How can they really leverage that huge amount of data for better search and personalization?

Big data works for online publishers much as it does for the video recommendation engines and e-commerce personalization we’ve discussed in earlier posts in this series. Our online publishing customers have seen success using this common big data framework, which:

  • Collects raw user and content data from subscribers’ profiles, search queries, downloaded documents, authors, metadata, taxonomy setup, etc.
  • Processes and analyzes user log and content data within Hadoop.
  • Feeds the results into a search engine which then delivers relevant results and unique recommendations via a user-facing browser interface (website or mobile app). 

And with automated machine learning, we can run powerful cross-analytics between user browsing habits (unstructured data) and content (structured data) to deliver a truly personalized search and browsing experience.
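The collect, process, and feed steps above can be sketched in a few lines of Python. This is a simplified stand-in, assuming the log analysis runs in memory rather than in Hadoop, and assuming a very basic form of personalization: count the topics each user has interacted with, then add a small boost to search results that share those topics. The event format, the `build_profiles` and `personalize` helpers, and the `boost` value are all hypothetical.

```python
from collections import Counter, defaultdict


def build_profiles(log_events):
    """Count, per user, how often each topic appears in the content they interacted with."""
    profiles = defaultdict(Counter)
    for event in log_events:
        for topic in event["topics"]:
            profiles[event["user"]][topic] += 1
    return profiles


def personalize(results, profile, boost=0.1):
    """Re-rank results by base relevance plus a boost for each past interaction with a topic."""
    def score(doc):
        affinity = sum(profile[topic] for topic in doc["topics"])
        return doc["relevance"] + boost * affinity
    return sorted(results, key=score, reverse=True)


# Made-up log events: each records a user and the topics of the document they opened.
log_events = [
    {"user": "u1", "topics": ["finance", "energy"]},
    {"user": "u1", "topics": ["finance"]},
    {"user": "u2", "topics": ["health"]},
]

# Made-up search engine output for one query, before personalization.
results = [
    {"id": "a", "relevance": 1.0, "topics": ["health"]},
    {"id": "b", "relevance": 0.9, "topics": ["finance"]},
]

profiles = build_profiles(log_events)
for user in ("u1", "u2"):
    ranked = personalize(results, profiles[user])
    print(user, [doc["id"] for doc in ranked])
```

The same query now returns a different order for each user. A machine learning model would replace the simple topic counts with learned preferences, but the data flow (logs in, profiles out, profiles applied at query time) stays the same.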


Optimizing User Experience and Publisher's Revenue

As you already know, search is not a one-time effort that you set up and expect to run perfectly afterwards. For online publishers, it's essential to have hard metrics for evaluating the correlation between search performance and revenue as more user logs and content are collected.

Search engine scoring is a practical and measurable technique for continuously improving a search engine's performance, day by day. Though it requires patience and repeated modifications, one nice thing about it is that we can do all the scoring and testing offline, ensuring that things work properly before going live in production. This has greatly reduced business disruptions and stressful planning for our customers' technical teams while increasing end-user satisfaction. For an insider's look into how we perform search engine scoring, check out this post.
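As a rough illustration of offline scoring, the sketch below replays logged queries against two engine configurations and compares them without touching production. It assumes mean reciprocal rank as the metric and treats a logged click as a sign of relevance. Both are assumptions made for this example, not a description of the scoring method referenced above, and the two "engines" are hypothetical lookup tables standing in for real search calls.

```python
def reciprocal_rank(ranked_ids, clicked_ids):
    """Return 1/position of the first clicked document, or 0.0 if none was returned."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in clicked_ids:
            return 1.0 / position
    return 0.0


def score_engine(engine, logged_queries, k=10):
    """Average the reciprocal rank of the top-k results over all logged queries."""
    if not logged_queries:
        return 0.0
    total = 0.0
    for entry in logged_queries:
        ranked = engine(entry["query"])[:k]
        total += reciprocal_rank(ranked, entry["clicked"])
    return total / len(logged_queries)


# Made-up search logs: each query plus the documents users actually clicked.
logged_queries = [
    {"query": "solar", "clicked": {"d1"}},
    {"query": "wind", "clicked": {"d4"}},
]

# Hypothetical engines: canned rankings standing in for two configurations.
current_rankings = {"solar": ["d3", "d1", "d2"], "wind": ["d5", "d4"]}
candidate_rankings = {"solar": ["d1", "d3", "d2"], "wind": ["d4", "d5"]}


def current_engine(query):
    return current_rankings.get(query, [])


def candidate_engine(query):
    return candidate_rankings.get(query, [])


print("current:  ", score_engine(current_engine, logged_queries))
print("candidate:", score_engine(candidate_engine, logged_queries))
```

Because the whole comparison runs against logs, a configuration change can be scored, adjusted, and scored again as often as needed before any user sees it.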


Using big data in online publishing is a use case in our “In the Trenches with Search and Big Data” video-blog series – a deep dive into six prevalent applications of big data for modern business. Check out our complete list of six successful big data use cases.

Sign up for our newsletter to get the latest updates on the "In the Trenches with Search and Big Data" video-blog series.

