Open Source eDiscovery Solutions


Search Technologies maintains a services practice that builds custom eDiscovery applications on major search engines, including the Apache Solr open source platform.

Search Technologies provides complete, project-managed eDiscovery solutions, from requirements review and architectural design through implementation to ongoing system maintenance and support under a commercial-grade service level agreement.

With professional services added to customize the application to the client's specific needs, open source now provides a highly cost-effective option for serious eDiscovery solutions. Where client needs are quite specific, open source also affords great flexibility, and because the solution is built on open source, the client is not subject to vendor lock-in.

Clients can freely choose to take ongoing support in-house or to contract with competent third parties as necessary. eDiscovery solutions can be built using the following Apache Lucene open source components:
  • Lucene: A core indexing and search API
  • Solr: A server packaging of Lucene providing a wide range of enterprise search functions and a convenient RESTful/XML interface
  • Tika: A document filtering / content extraction utility
  • Mahout: A machine learning add-on for Lucene supporting clustering, collaborative filtering, frequent pattern mining, random decision forests, and support vector machines
  • Nutch: A crawler for use inside or outside the firewall
For some eDiscovery applications it may be pragmatic to use a mixture of Apache open source components and one or more licensed products. Search Technologies provides independent advice on this matter, on a case-by-case basis.
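
As a rough illustration of the RESTful/XML interface that Solr exposes, the following minimal Python sketch builds a query URL for Solr's standard select handler. The core name "ediscovery", the "custodian" field, and the host and port are assumptions made for the example only; they do not describe any particular Search Technologies deployment. The sketch only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, custodian=None, rows=10):
    """Build a URL for Solr's /select handler that requests XML results.

    `core` and the `custodian` field are hypothetical names used for
    illustration; a real index would define its own schema.
    """
    params = [
        ("q", text),          # main query string
        ("rows", str(rows)),  # maximum number of results to return
        ("wt", "xml"),        # ask Solr for an XML response
    ]
    if custodian is not None:
        # A filter query narrows results without affecting relevance scoring.
        params.append(("fq", f'custodian:"{custodian}"'))
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local Solr address; adjust for a real installation.
    print(build_solr_query_url(
        "http://localhost:8983/solr", "ediscovery",
        "merger agreement", custodian="jsmith",
    ))
```

Because the interface is plain HTTP, the same request could be issued from a browser, a script, or a custom review application, which is part of what makes Solr convenient to integrate.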