A Perspective for Responsible Data Science and Analytics
Five Essential Tasks in a Search and Analytics Project
The recent UK bank holiday was an amazing one! I had a productive weekend with lots of fun while managing to get some gardening work done. From this hobby and my experience in data analytics, I could draw some parallels between responsible data science and gardening. Before diving into the discussion, let’s first define what “responsible data science” means. As we increasingly rely on big data to discover better insights, appropriate techniques and methodologies are key to overcoming biases and ensuring responsibility and transparency in our data projects. The Responsible Data Science consortium asserted:
“To future-proof responsible data science methods, foundational research is needed focusing [sic] on FACT, i.e., questions related to Fairness, Accuracy, Confidentiality, and Transparency.”
Consequently, “You reap what you sow” applies just as well in data science practice as it does in gardening. In the world of data science, specifically in our search and analytics space, there is a lot of preparation required before we can derive meaningful outcomes from massive data. Similarly, in the gardeners’ world, good returns only come from good planning and execution: planning a new landscape, preparing the ground, planting new seeds, predicting peak blooms or harvest time, and finally, experimenting with new varieties. Applying this analogy to business, it’s important that we consider five key tasks when embarking on a search and analytics project: Plan, Prepare, Plant, Predict, and Experiment.
Planning is about your strategy for getting from where you are today to where you would like to be in weeks, months, or even years to come. In the business world, we have fancy terms such as midterm planning and Vision 2020, but the gist is the same: where you are today and where you would like to be. Before embarking on a search or analytics journey, bring your stakeholders on board, document your goals, and consider using a thorough assessment framework to evaluate your system infrastructure and business objectives. This helps point your strategy in the right direction.
Once the plan is clear (in other words, when you have a clear vision), it’s time to put thoughts into action. But don’t rush; take a moment to consider what you can leverage. For instance, what is the current search application you’ve set up in-house capable of? Is there a toolset you procured in the last procurement cycle (maybe before the budget was frozen)? It’s important to reuse existing assets and build on what’s working rather than create additional work that doesn’t add much value. Again, the findings and recommendations from the initial assessment can be a practical blueprint for your next actions. Read about how an assessment can provide tremendous value and help map out your search and analytics journey. Remember our gardening analogy? Preparation is about getting the ground ready for a successful harvest.
Once you’ve done the hard work of getting the ground prepared and the stakeholders engaged, it’s time to get to the actual planting: implementing your plan. So, what does this mean in the search and analytics space?
- In many cases, this could be a simple use case of acquiring and configuring your search engine, whether it is an open-source engine (Elasticsearch or Solr) or a commercial platform (Endeca, Sinequa, SharePoint Search, etc.).
- This could also be a more mature engagement, such as moving to another search platform (e.g., migrating from the Google Search Appliance) or transitioning to a cloud search platform like Azure.
- This could be an even more sophisticated initiative, such as applying AI to improve the existing search application, implementing chatbots for automation, or enhancing the customer experience with Natural Language Processing (NLP).
Whichever path you take, the result is about making a difference, a positive one, in the way we put the data we have to good use.
4. PREDICT (= Analytics)
Analytics refers to the systematic computational analysis of data or statistics. There are various forms of analytics, such as statistical analytics, predictive analytics, and prescriptive analytics. Once the implementation is done, it’s time to predict the results and any additional value as your return on investment (ROI). In data science, making predictions is not crystal ball gazing; it is grounded in the actions you take and the variables you change along your journey. For example, in a search-based analytics implementation, content processing, relevance ranking, and engine scoring are some of the tools that can help predict the quality of search and its impact on your business.
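To make "quality of search" concrete, here is a minimal sketch of two common offline relevance measures, precision at k and normalized discounted cumulative gain (NDCG). It is an illustration only, not the method used in any particular engagement: the document IDs, the graded judgments, and the function names are all hypothetical, and it assumes you already have human relevance judgments for a query.

```python
import math


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are judged relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def ndcg_at_k(ranked_ids, gains, k):
    """NDCG for graded judgments; unjudged documents count as gain 0."""

    def dcg(values):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc_id, 0) for doc_id in ranked_ids[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical data: what the engine returned for one query, and how
# human judges graded documents for that query (0 = irrelevant, 3 = ideal).
ranked = ["d3", "d1", "d7", "d2"]
judgments = {"d1": 3, "d2": 2, "d3": 0, "d5": 1}
relevant = {doc_id for doc_id, gain in judgments.items() if gain > 0}

p_at_3 = precision_at_k(ranked, relevant, 3)
ndcg_at_3 = ndcg_at_k(ranked, judgments, 3)
print(f"precision@3 = {p_at_3:.3f}, NDCG@3 = {ndcg_at_3:.3f}")
```

Tracking numbers like these across a representative set of queries gives you a baseline, so later changes to content processing or ranking can be judged against evidence rather than impressions.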
Analytics as a discipline was still in its infancy about two decades ago. But with the growth of unstructured and natural language content, coupled with increasing demand for new insights, analytics has come a long way since then. Today, analytics has influence in every sector. Below are just a few examples from our work and from the public domain.
- Helping clients operate better and increase value with search-based analytics applications:
- Recent news coverage involving AI, Machine Learning, Robotics, and NLP:
Gone are the days when we had to wait for months to study the impact that new variables had on the market. Now, massive data on new interest rates, new promotional products, changes in currency exchange rates, and stock market fluctuations can all be processed and made available for search and analytics in real time. And, even better, all of this can be achieved cost-effectively thanks to technologies like cloud computing, big data, and AI.
In gardening, we’ve seen experiments leading to seedless grapes, seedless watermelons, and tomatoes that are grown without soil! Similarly, data science requires consistent, ongoing testing of models and algorithms to incrementally improve your application’s performance. Continuously experimenting to optimize search and analytics algorithms is key, a process our architect discussed in detail here.
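As one way to picture such an experiment, the sketch below compares a baseline ranking against a candidate ranking on the same set of judged queries, using mean reciprocal rank (MRR) and a per-query win/loss count. Everything here is assumed for illustration: the queries, the result lists, the judgments, and the `compare_variants` helper are hypothetical, and a real experiment would use far more queries and a proper significance test.

```python
from statistics import mean


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / position of the first relevant result, or 0.0 if none is found."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0


def compare_variants(baseline_runs, candidate_runs, judgments):
    """Score both variants on every judged query and summarize the outcome."""
    base_scores, cand_scores = [], []
    wins = losses = 0
    for query, relevant in judgments.items():
        base = reciprocal_rank(baseline_runs[query], relevant)
        cand = reciprocal_rank(candidate_runs[query], relevant)
        base_scores.append(base)
        cand_scores.append(cand)
        if cand > base:
            wins += 1
        elif cand < base:
            losses += 1
    return {
        "baseline_mrr": mean(base_scores),
        "candidate_mrr": mean(cand_scores),
        "wins": wins,
        "losses": losses,
    }


# Hypothetical judged queries and the results each variant returned.
judgments = {"q1": {"a"}, "q2": {"x"}, "q3": {"m"}}
baseline = {"q1": ["b", "a", "c"], "q2": ["x", "y"], "q3": ["n", "o", "m"]}
candidate = {"q1": ["a", "b", "c"], "q2": ["y", "x"], "q3": ["m", "n", "o"]}

summary = compare_variants(baseline, candidate, judgments)
print(summary)
```

Reporting wins and losses alongside the average matters: a candidate can raise the mean while making some queries worse, and those regressions are often where stakeholders notice a change first.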
I would like to conclude with this saying: “Don’t be afraid to experiment.” We are the gardeners of the digital era. Our ability to identify the right skills, data sources, and methodologies will influence how well we achieve the desired outcomes. Responsible data science involves diligent planning and very close collaboration with all stakeholders. Let’s plant the right seeds (people, vision, frameworks) in the right conditions (infrastructure, technologies) so that we can harvest sustainable outcomes that bring greater benefits to businesses and individuals.