Search and Big Data Analytics in 2016

A Recap of Our Top 7 Highlights of the Year

Search and big data analytics have evolved significantly over the last few years. At the beginning of 2016, we predicted that machine learning and search engine scoring would be key developments in this space. Throughout 2016, we’ve seen things further unfold through multiple projects we worked on and through understanding our customers’ expectations for their data-driven applications. Here are some highlights of our seven most popular topics and discussions throughout the year.

1. The Rise of Open Source

We’ve seen open source technologies becoming prominent in multiple use cases, from traditional enterprise search to log analytics, e-commerce search, and even government document search. 

In fact, data shows that open source search engines have gained significant popularity. According to DB-Engines, Elasticsearch and Solr – two open source search engines based on Lucene – top the list of leading commercial and open source search engines. Want to learn more about how the two engines compare? We discussed the features and limitations of Elasticsearch and Solr in detail in this blog.

Let’s take a look at a couple of examples illustrating this point.

In the commercial space, Splunk has been the “reigning emperor” for enterprise log analytics, but there has been movement in the open source direction with the Elastic Stack. One of our customers migrated their log analytics application to the Elastic Stack, which enabled them to cut costs significantly while increasing conversion and agility. If your organization is weighing the Elastic Stack against Splunk for your log analytics use case, here’s a useful comparison blog.

In the public sector, we were proud to launch an enhanced public information portal run by the US Government Publishing Office (GPO), migrating them from a legacy proprietary system to the Solr open source search engine. The new platform provides more features, greater flexibility, and cost savings for the GPO.

2. Google Search Appliance Discontinued

In early 2016, Google announced the end of support for the Google Search Appliance (by March 2019) as part of its strategic move to a cloud-based platform. Since then, GSA users have started to look for a replacement solution. We’ve been keeping our customers in the loop on any updates from Google while helping many of them prepare and seamlessly implement their GSA migration.

If you are a GSA customer and still considering your migration strategy or evaluating your replacement solutions, you can use our Top 10 Criteria for Selecting Your GSA Replacement e-book as a guide.

This year, we also introduced the Search Technologies’ GSA Replacement Solution - an optimized reference architecture for search and analytics applications, developed from our decade of experience and built around our technology assets and open source technologies. Download our white paper to read about our GSA Replacement solution in depth.

3. Enterprise Data Lakes

Enterprises have a lot of data, but how well they use it to derive insights is key to success. There’s a lot of hype around enterprise data lakes (or enterprise data hubs) as a way to bring together data silos and make the right data available to the right users at the right time. But how exactly is this done?

There is a wide variety of structured and unstructured data in enterprise data lakes. Search engines are the ideal tool for storing, processing, accessing, and presenting this data because they are schema-free and can scale to billions of records. 

Data lakes’ search and analytics capabilities are nearly endless when we combine search engines, Hadoop, and visualization dashboards in use cases such as:

  • Bioinformatics – providing data access to business users in near real-time and improving visibility into drug manufacturing and research processes
  • Agriculture – boosting agricultural productivity using predictive analytics based on millions of records of historical data

In our recent Enterprise Data Lake webinar, our Chief Architect discussed this topic in depth with a live demo and customer case studies. Watch the on-demand webinar here.

4. Precision Medicine

How much would health care and research institutions benefit from a research dashboard application that integrates and visualizes clinical data, genomics data, and literature in a unified web-based interface, allowing researchers to perform cross-domain studies and corroborate phenotypic data with genomic data? This is exactly what we’ve built for a West Coast research institution. Intelligent chromosome-based data sharding has led to substantial performance improvements compared to prior attempts to deliver this solution.

The infrastructure supporting this application is based on a search engine (Solr) running in Cloudera’s big data platform and is coupled with modern web technologies for the user interface. 
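The post does not describe how the chromosome-based sharding mentioned above was implemented. As a rough illustration only, the Python sketch below assigns each record to a shard according to its chromosome, so a query restricted to one chromosome would only need to touch one shard. The shard names, the shard count, and the round-robin assignment are all hypothetical and are not taken from the actual project.

```python
from collections import defaultdict

# Human chromosome labels; the ordering drives the (assumed) round-robin layout.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "MT"]


def shard_for(chromosome, num_shards=5):
    """Return a hypothetical shard name for a chromosome label such as 'chr7' or 'X'."""
    key = chromosome.upper()
    if key.startswith("CHR"):
        key = key[3:]
    if key not in CHROMOSOMES:
        raise ValueError("unknown chromosome: " + chromosome)
    # Assumption: chromosomes are spread over shards round-robin.
    return "shard_{}".format(CHROMOSOMES.index(key) % num_shards)


def group_by_shard(records, num_shards=5):
    """Group records (dicts with a 'chromosome' field) by their target shard."""
    groups = defaultdict(list)
    for record in records:
        groups[shard_for(record["chromosome"], num_shards)].append(record)
    return dict(groups)
```

A production layout would more likely balance shards by the amount of data per chromosome rather than by position in a list; the point here is only that the routing key is the chromosome.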

The most popular research informatics software packages were designed and built before the newer big data technologies emerged. As such, healthcare organizations and research institutes struggle today with processing and extracting value out of the increasing deluge of genomics data triggered by the decreasing costs of genomics data sequencing.  

However, our recently developed application has helped our customer’s principal investigators to: 

  • Analyze and visualize the structured data
  • Search full-text genome annotation data
  • Speed time to discovery of cures for children’s diseases
  • Ensure that the client can obtain funding more easily to pursue these cures

5. Search Engine Scoring

Analyzing statistically valid scores helps increase search engine relevancy over time. While this method can significantly increase business value and the bottom line, not many organizations have started engine scoring or are implementing it effectively. 

With that said, we’ve observed proven success with newer, better algorithms used in the scoring process, so we expect this technique to keep growing in the search and big data space in the coming years.
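To make the idea of an engine score concrete, here is a minimal Python sketch that computes mean reciprocal rank (MRR) over a set of judged queries. MRR is one common relevance metric chosen here for illustration; it is not necessarily the scoring method discussed in the webinar, and the function names and sample data are hypothetical.

```python
def reciprocal_rank(results, relevant):
    """Return 1/position of the first relevant result, or 0.0 if none is relevant."""
    for position, doc_id in enumerate(results, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_results, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(results, relevant) for results, relevant in runs) / len(runs)


# Illustrative judged queries: ranked result IDs plus the IDs known to be relevant.
sample_runs = [
    (["a", "b", "c"], {"b"}),  # first relevant hit at position 2
    (["d", "e"], {"d"}),       # first relevant hit at position 1
    (["f"], {"z"}),            # no relevant hit returned
]
engine_score = mean_reciprocal_rank(sample_runs)
```

Tracking a score like this across engine configurations, over a large enough query sample, is what allows relevancy changes to be compared rather than judged by eye.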

To get an idea of where you can start or how to improve your search accuracy with engine scoring, view our on-demand webinar with a live demo here.

6. Financial Regulatory Compliance

In today's digital age, data provides a wealth of insights to businesses. On the other hand, the huge volume of data is also challenging to manage and use effectively. For many organizations, the ability to deliver transparency through data is critical, not only for operational success but also for meeting stringent industry standards.

In the financial services sector, there is a legal requirement to retain data and make it available and searchable for a period of time. As a result, financial organizations often incur substantial costs to provide sufficient data transparency to remain compliant with multiple regulatory requirements.

Watch this video to see how we leveraged search and big data technologies to build a regulatory compliance platform for a large financial institution.

7. The Personalization of Search

As Google, Cortana, Siri, and the like show, search is becoming much more than keyword matching. We’re now heading into the age of personalized search results. Search engines are becoming personal digital assistants, or as Gartner calls them, Insight Engines. This has been made possible by big data analytics techniques such as machine learning and predictive analytics, and it is pervading the modern business world in a multitude of use cases. We’ve seen great customer success in use cases ranging from intranet search to e-commerce search, recruiting, medical research, media & publishing, and many others.

  • Learn more about some of the popular use cases
  • Read about how we leveraged search and big data to build these Insight Engines 

With the rise of open source, massive volumes of structured and unstructured data, and the need to do complex analytics, search has become an integral part of the big data revolution. That shows in many of our exciting customer success stories enabled by search, big data, and analytics. In 2017, we expect to see a growing number of use cases where search and big data converge. As our Chief Architect put it, we’re “moving search engines into the age of enlightenment.”