Improved Government Document Search with an Open Source Search Engine
Solr Search Engine Enhances Search across Millions of Public Records
The Challenge with Government Document Search
Government documents are a critical resource for research, studies, and many other information discovery purposes. But searching them can be challenging given their volume, the variety of document structures, linked citations, and more. Making government documents easily accessible to the public requires great care in designing the underlying search infrastructure.
Search engine architectures built for government document search must be highly functional and scalable. They may require heavy-duty data preparation, and they must be flexible enough to support numerous deviations from standard metadata extraction algorithms.
In working through these challenges over the years, we have developed methods and best practices that we apply to our public sector work.
Making Public Access to Government Information Easy
Helping the public gain faster, easier access to government information is a key objective of many government agencies and public sector organizations.
Our team at Search Technologies recently worked with a government agency to refresh its information portal. The agency's legacy search user interface (UI) had served its purpose efficiently, but as demand for advanced yet intuitive search grew, the agency looked to transform the portal into a next-generation information discovery platform. Combining a powerful search engine with extensive metadata creation for optimal search relevancy, the new search platform aimed to give public users a quick and easy search experience across dozens of government information databases.
From a Legacy Search Solution to the Open Source Solr Search Engine
Among the main goals for the new search platform were a cleaner, simpler interface and a faster, more user-friendly search. The new interface would keep all the existing advanced features for experienced users while making search more accessible to the general public.
After a thorough assessment of its search infrastructure, the agency decided to migrate from its commercial solution to an open source search engine for greater agility, flexibility, faster deployment, and cost savings.
We established that the open source Solr search engine would closely match the agency's needs for the following reasons:
- Solr would support the current index size, growth rate, and query performance on the existing hardware.
- The content feeding and search API components could be upgraded in a backwards-compatible manner, so that switching to Solr would be nearly transparent to all the other components of the agency's system.
- Solr supports all the existing front-end features (such as browse and faceted search), and its easier customization would allow more UI features to be added, giving users an improved search experience.
- As an open source search engine, Solr allows the agency to cut licensing costs while still scaling to millions of documents.
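To make the faceted search point above concrete, here is a minimal Python sketch of how a front end might build a faceted query against Solr's standard select handler. The `q`, `rows`, `facet`, `facet.field`, and `facet.mincount` request parameters are standard Solr parameters. The host, the collection name `govdocs`, and the field names `category` and `publication_year` are hypothetical and are not taken from the agency's actual schema.

```python
from urllib.parse import urlencode


def build_facet_query_url(base_url, collection, user_query, facet_fields, rows=10):
    """Build a Solr select URL that requests facet counts for the given fields.

    The collection and field names passed in are illustrative assumptions.
    """
    params = [
        ("q", user_query),        # the user's search terms
        ("rows", rows),           # number of results to return
        ("facet", "true"),        # turn faceting on
        ("facet.mincount", 1),    # hide facet values that match no documents
    ]
    # facet.field may be repeated, once per field to facet on
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection, and fields, for illustration only
url = build_facet_query_url(
    "http://localhost:8983/solr",
    "govdocs",
    "clean water act",
    ["category", "publication_year"],
)
print(url)
```

The sketch only builds the request URL and does not contact a server, so it can be adapted to whatever HTTP client and schema a given deployment uses.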
The Next-Generation Platform for Government Document Search
The seamless switch from the legacy commercial search engine to Solr preserved the existing search experience in the portal's responsive front-end search UI and added improved search features, such as:
- New options for browsing government documents, such as alphabetically, by category, or by date.
- Expandable and collapsible search widget for quick access to different types of searches.
- “Related Documents” feature for easy browsing of documents in the same category.
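One possible way to back a "Related Documents" feature like the one listed above is a filter query that keeps documents in the same category and excludes the one being viewed. The Python sketch below is only an illustration of that idea, not a description of the portal's actual implementation. `q` and `fq` are standard Solr parameters, while the field names `category` and `id` and the sample values are hypothetical.

```python
def quote_phrase(value):
    """Wrap a value in double quotes for Solr's standard query parser,
    escaping backslashes and embedded double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def related_documents_params(doc_id, category, rows=5):
    """Return Solr request parameters for documents in the same category.

    The field names 'category' and 'id' are assumed for illustration.
    """
    return {
        "q": "*:*",  # match everything, then narrow with filter queries
        "fq": [
            f"category:{quote_phrase(category)}",  # same category only
            f"-id:{quote_phrase(doc_id)}",         # exclude the current document
        ],
        "rows": rows,
    }


# Hypothetical document id and category
params = related_documents_params("doc-42", "Federal Register")
print(params)
```

Using filter queries here keeps the category restriction out of relevance scoring and lets Solr cache the filters independently of the main query.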
Want to learn more about best practices for archiving and preparing government documents for search? Read the white paper A Search Architecture for Archived Government Documents.