Search and Big Data are now Mission-Critical for Business
“Google for the enterprise.”
This has been our catch-phrase for something we call “Corporate-Wide Search” ever since 2008 when we started seriously focusing on search engines for corporations. It means providing a “google-like search” to all corporate resources.
And it makes sense, right? After all, Google is the gold standard for search and who wouldn’t want that level of search for all employees inside the corporation?
But in recent engagements working with our customers, I have realized: We can do better.
The truth is that “Google for the enterprise” is really not even an appropriate analogy for what corporations really need. Will it drive revenue? Will it identify waste and reduce expenses? Yes, of course, but it is a soft ROI and the connection to the bottom line is indirect.
What is Google.com anyway? It is a search engine designed for public usage by naïve users to search web pages on the Internet.
- Do companies have naïve users? – NO
- They are sophisticated, highly-skilled employees trying to execute the business.
- Do companies need to search web pages? – YES, BUT THE NEED TO SEARCH FOR BUSINESS DATA IS GREATER
- Web pages typically make up a small fraction of a percent of the data within a company.
- Where is most of the data?
- Large content management systems, and
- The data warehouse.
- Do companies want occasional usage? – NO
- Not if search is part of a critical business function and can help make money and reduce expenses.
FROM DOCUMENTS TO BUSINESS DATA
From the very beginning, text search engines have always been about “searching for documents.” The first applications were for lawyers (early eDiscovery) and publishing (academic papers and news).
And so naturally, search engine people go to companies and ask them: “what documents do you have to search?”
Of course, companies do have a lot of documents to search, such as marketing research, product research, memos, reports, RFPs, SOWs, policies and procedures, corporate news, and on and on. And so the search people were happy. “We found documents to search”, they said to themselves. “Let’s go search those documents.”
But all along we were asking the wrong question. The question we should have been asking is this:
“Can we help you search business data to improve the bottom line?”
Aha! Now we’re getting somewhere. When you start asking this question, you get much more interesting answers:
- “If we could search manufacturing data to identify deviations we could fix problems faster, waste less product, and save a lot of money!”
- “If we could search sales data for trends to exploit we could push a lot more product!”
- “If we could search insurance claims for fraud we could reduce rates for everyone!”
- “If we could search our lab notebooks for drug indicators, this would identify new products that we could sell!”
- “If we could search work tickets for common problems (or common equipment failures) we could fix problems before they occur, and before they become an expensive crisis!”
This is powerful stuff. We’re now talking about search having a substantial and direct impact on the bottom line: “catch fraud”, “eliminate waste”, “sell more product”, “identify new markets”, “make new products”, etc.
DATA SURFING - THE EXPERIENCE OF A LIFETIME
I’ve seen this work first-hand at a large workers’ compensation insurance company. When we first turned on search for their health-care claims bill-lines data, the experience was unexpectedly powerful.
We talk about “surfing the web” all the time, but using Google and clicking around web sites has never really felt like true “surfing” to me. I’ve done the real thing a few times in Hawaii, and browsing web sites feels more like poking around a vast urban wilderness than true surfing.
But once we put together our first search engine application for analyzing business transactions, I got to experience true digital surfing for the very first time -- and the experience was thrilling. Masses of data, at my fingertips, driving graphs, charts, maps, filters, facets and results, and all dynamically updated with a few clicks or keywords. It was amazing. It was like being the sorcerer in “The Sorcerer’s Apprentice”.
And it really felt like surfing, like getting up on a board with ocean swells thrusting you forward.
Not the sort of experience you expect to have at a workers’ compensation insurance company!
BIG DATA AND SEARCH: TWO PEAS IN A POD
Big Data is the driver that enables us to ask these new questions. Big Data is all about scale: scaling to very large databases, beyond what SQL and traditional relational database tables can handle. This is why NoSQL exists. Not because it’s more powerful than SQL (it’s not), but because it can scale.
And it all started with log files. Big Data allows us to process log files to extract useful business insights from the billions of clicks that users make on web sites. We used to ask “should we put the diapers next to the beer?” Now we ask “do people click on the picture of diapers after buying beer on our eCommerce site?”
When Big Data needs ad-hoc query, what to do? It can’t use SQL because the database is much too large, and SQL queries take forever (all those joins kill scalability). And besides, SQL is not end-user friendly.
Enter search to save the day.
Search can scale to any size database with a shared-nothing architecture. We once indexed and searched a billion tweets over a weekend (no kidding). We are searching 100+ million bill lines (each with over 200 fields) in a health care claims system in a few seconds and returning all sorts of statistics. We’re building data visualization for multiple customers, showing business transactions across graphs, maps and timelines, using search.
Like NoSQL, search is not as powerful as SQL, but search is much easier to use, and it’s getting more powerful every day with increasingly sophisticated query operators, data analytics, and visualization tools.
What does this sound like? Sophisticated query + data analytics + visualization @ enormous scale? It sounds like business analytics and business intelligence over the data warehouse. Big Data has already turned its focus to taking over the Data Warehouse. More Big Data vendors have off-the-shelf connectors to standard Data Warehouse systems than ever before.
And now search is moving into the data warehouse market as well (thank you, Big Data!), providing fast, ad-hoc query and analysis over raw business data.
Search is now ready to answer the important business questions that have bottom-line impact.
SEARCH FOR BUSINESS ANALYTICS AND BUSINESS INTELLIGENCE
The following diagram shows how search is being used in business analytics and business intelligence:
The major steps in this process are:
ACQUIRE DATA – Acquire data from business systems. This includes:
- Set up connectors to pull or receive data from business systems, including:
- Data warehouse data
- Business transactions from logs
- Work orders and work tickets
- Scanned documentation (invoices, bills, etc.)
- Call center logs and customer transactions
- Sales CRM transactions, communications, and customer lists
- Public information (watch lists, industry news, competitor news and transactions)
- Understand and set up the security model for the data (document ACLs, user group membership, other security rules)
- Set up incremental updates to maintain data synchronicity
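The incremental-update step above can be sketched as a simple timestamp watermark. This is a minimal illustration and not a description of any particular connector: the row layout ("id", "modified"), the in-memory index dictionary, and the function name incremental_sync are all assumptions made for the example.

```python
def incremental_sync(source_rows, index, last_sync):
    """Upsert rows changed since last_sync into index; return the new watermark.

    Assumes each row carries a unique "id" and a comparable "modified" stamp
    (hypothetical field names).
    """
    newest = last_sync
    for row in source_rows:
        if row["modified"] > last_sync:
            index[row["id"]] = dict(row)  # insert or overwrite the indexed copy
            newest = max(newest, row["modified"])
    return newest


# Hypothetical source table and an empty search index.
source = [
    {"id": "T-1", "status": "open", "modified": 10},
    {"id": "T-2", "status": "open", "modified": 12},
]
index = {}

# First run: everything is newer than the initial watermark, so all rows load.
watermark = incremental_sync(source, index, last_sync=0)
first_watermark = watermark

# Later, one ticket changes in the business system.
source[0] = {"id": "T-1", "status": "closed", "modified": 20}

# Second run: only the changed ticket is re-indexed.
watermark = incremental_sync(source, index, watermark)
```

A watermark like this only catches inserts and updates. Deleted records, and the ACL changes mentioned in the security bullet, would need their own handling in a real connector.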
PROCESS – Process data as necessary to make it searchable. This includes:
- Denormalize RDBMS tables as necessary for search convenience
- For example, to convert JOINs to multi-valued fields
- Combine records from multiple systems into “entity views” (360° view)
- Note that security restrictions must flow with the data, potentially converting document ACLs into field ACLs for larger, combined records.
- Enhance data as necessary to improve search. This can include:
- Simple analytics: summary counts, totals, distance calculations, joins and comparisons
- Complex analytics: predictive analytics (more below), machine learning, outlier analysis, red flag analysis, entity extraction, sentiment tagging, latent semantics, categorization (e.g. industry sector tagging), topic analysis, collaborative filtering (recommendations analysis), etc.
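As a rough sketch of the denormalization step, the snippet below folds the rows of a child table into multi-valued fields on the parent record, so that a search engine can index one flat record instead of performing a JOIN at query time. The table contents, field names, and the denormalize helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for two RDBMS tables.
claims = [
    {"claim_id": 1, "provider": "Clinic A"},
    {"claim_id": 2, "provider": "Clinic B"},
]
bill_lines = [
    {"claim_id": 1, "procedure_code": "P100", "amount": 120.0},
    {"claim_id": 1, "procedure_code": "P200", "amount": 80.0},
    {"claim_id": 2, "procedure_code": "P100", "amount": 95.0},
]


def denormalize(parents, children, key, child_fields):
    """Fold child rows into each parent as multi-valued (list) fields."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row)

    docs = []
    for parent in parents:
        doc = dict(parent)
        rows = grouped.get(parent[key], [])
        for field in child_fields:
            doc[field] = [r[field] for r in rows]
        docs.append(doc)
    return docs


docs = denormalize(claims, bill_lines, "claim_id", ["procedure_code", "amount"])

# A "simple analytics" enhancement: store a per-record total alongside the data.
for doc in docs:
    doc["total_amount"] = sum(doc["amount"])
```

Each resulting record carries its bill lines as parallel lists plus a precomputed total, which is the kind of flat, self-contained record that search indexes handle well.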
EXPLORE & ANALYZE – The search engine with a data analytics user interface is used to explore the data. The goal is to identify immediate actionable insights (markets to pursue, products to make, equipment to inspect, divisions or groups to audit, etc.). This includes:
- Locate and examine individual records of interest to get a deep understanding of the underlying data.
- Identify clusters and data trends using facets and other search-based analytics (summary statistics, pivot tables, heat maps, graphs, trend lines, etc.)
- Such analytics may require adding custom operators to the search engine, to implement business-specific analysis within a search framework.
- Use multiple cascading searches for complex analysis and comparison of large sets of data
- Compare the results of multiple searches to find data correlations.
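The exploration steps above (facets, cascading searches, comparing result sets) can be illustrated with a toy in-memory version. A real search engine does this over an index at far larger scale; the records, field names, and the search and facet helpers here are assumptions made purely for the sketch.

```python
from collections import Counter

# Hypothetical work-ticket records; field names and values are illustrative.
tickets = [
    {"site": "Plant 1", "equipment": "pump", "problem": "leak"},
    {"site": "Plant 1", "equipment": "pump", "problem": "leak"},
    {"site": "Plant 1", "equipment": "valve", "problem": "stuck"},
    {"site": "Plant 2", "equipment": "pump", "problem": "noise"},
    {"site": "Plant 2", "equipment": "valve", "problem": "stuck"},
]


def search(records, **filters):
    """Return the records whose fields equal every given filter value."""
    return [r for r in records if all(r.get(f) == v for f, v in filters.items())]


def facet(results, field):
    """Count how many results fall under each value of a field."""
    return Counter(r[field] for r in results)


# Facet each site's tickets by problem type.
plant1 = facet(search(tickets, site="Plant 1"), "problem")
plant2 = facet(search(tickets, site="Plant 2"), "problem")

# Cascading search: narrow an earlier result set, then facet on another field.
leaky = facet(search(search(tickets, site="Plant 1"), problem="leak"), "equipment")

# Compare two searches: problem types that show up at both sites.
shared = set(plant1) & set(plant2)
```

Here the cascade points at pumps as the source of the leaks at one site, and the comparison shows which problem type is common to both sites, the sort of pattern an analyst would then drill into record by record.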
PREDICT – Large-scale predictive analytics can be run in batch on the Big Data framework.
- Validated insights from the explore step can be incorporated into a predictive model using machine learning algorithms.
- This will be run by Big Data to automatically produce predictions across all data.
- Predictive models can also be applied on incoming data, to immediately identify business actions as soon as they occur.
- For example, to identify potentially fraudulent actions for which immediate action should be taken.
- Predictions will be added to the entity data and stored in the search engine, to be used for further exploration and analysis.
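To make the predict step concrete, here is a deliberately simple stand-in for a predictive model: a statistical threshold rule rather than the machine learning algorithms mentioned above. It learns a per-procedure threshold from historical amounts, scores incoming records as they arrive, and writes the prediction back onto the record so it can be indexed and searched. All names (build_model, predict, red_flag) and all data values are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical historical bill amounts, grouped by procedure code.
history = {"P100": [100.0, 110.0, 90.0, 105.0, 95.0]}


def build_model(history, k=2.0):
    """Return a per-code threshold: mean + k * population standard deviation."""
    return {code: mean(values) + k * pstdev(values) for code, values in history.items()}


def predict(record, model):
    """Return a copy of the record with a 'red_flag' prediction field added."""
    threshold = model.get(record["procedure_code"])
    flagged = threshold is not None and record["amount"] > threshold
    return {**record, "red_flag": flagged}


model = build_model(history)

# Incoming records are scored immediately; the prediction travels with the data.
incoming = [
    {"bill_id": "B-1", "procedure_code": "P100", "amount": 102.0},
    {"bill_id": "B-2", "procedure_code": "P100", "amount": 300.0},
]
scored = [predict(r, model) for r in incoming]
```

Because the prediction is stored as an ordinary field, it can be used as a filter or facet in later exploration, for example to search for all flagged bill lines.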
REPORT – Reports and graphs are produced directly by the search engine
- These can be incorporated into formal documents for distribution to business owners as needed.
INCORPORATING BUSINESS ANALYTICS INTO KNOWLEDGE MANAGEMENT
In addition to simple navigation, Corporate-Wide Search has also been a conduit for “knowledge management” – managing and distributing the knowledge which exists within the corporation.
We are now starting to see that traditional knowledge management – through search – will expand to include the creation of new knowledge through analysis of business transactional data:
In this view, “knowledge management” becomes the framework in which business employees leverage and expand past work as well as add new insights into corporate knowledge base(s).
With the inclusion of deep analytics on raw business data (transactions, sales, manufacturing data, etc.) we can now identify very clearly the business impact of a knowledge management system and how it fits into the larger goal of continuous improvement, growth, and efficiency for the business as a whole.
And search provides the connection point. It acts as a common interface to both business analysis and knowledge search and re-use. It can encourage good behavior on both fronts.
In this way, we believe that corporate Knowledge Management will graduate from a helpful set of tools into a mission critical business process.
BUT I STILL NEED "GOOGLE FOR THE ENTERPRISE"!
Of course you do! Everyone needs a google-like interface to navigate through a corporation to identify corporate resources and destinations. Thanks to the consumerization of IT, it has become expected by employees and executive management.
We are not saying that simple, navigational search goes away. Just that it now becomes part of a larger business process. Search Technologies envisions an expansion of search to include all data (structured, unstructured, transactional, written, log files, manufacturing instrumentation data, documentation, procedures, images, video, etc.) which exists within the corporation:
In this role, the traditional “single box, google-like search interface” now has multiple goals:
- Goal 1: Get to “things I need to do my job”
- Goal 2: Get to “things I know exist which I need right now”
- Goal 3: Get to "the data which can solve my business need"
It is this last point (goal #3) which extends the traditional search box into a mission-critical tool. We see the “google-like” search box as not just getting you to a destination, but also identifying data sets which meet business needs which have direct impact on the bottom line.
- Finding past research which solves my business need
- Identifying data sets which contain the data that can be analyzed to solve my business need
- Providing search and analytics over business data to solve my need
In this way, the google-like search becomes both a distribution channel for business data to the employees who need it to solve business problems, and the entry-point interface for the analysis of that data to extract new insights that help drive the business forward.
The purpose of this blog entry has been to lay out a vision of how search will drive business decisions. Previously, "full text search engines" were only used to search for documents and web sites. Now, search engines are searching everything and are replacing SQL as the de facto standard for all business access, including business transactions, data warehouse data, equipment instrument data, etc.
This is happening right now. Search Technologies is involved with multiple companies that are exporting their business transactions from data warehouses into Big Data and Search Engines. When this happens, the end result is nothing short of astonishing. The data becomes so much more malleable, tangible and alive that it truly takes your breath away. It was this gut reaction which made me realize:
"This is it - this is the killer application for Big Data. Search + Big Data will become the mission critical application for business."