Library of Congress: Cataloger’s Desktop 4.0
Serving 10,000 Librarians and 1,000 Institutions Worldwide
The Library of Congress (“LC”) is the United States’ oldest federal cultural institution and the largest library in the world, with millions of books, recordings, photographs, maps and manuscripts in its collections. LC also provides leadership to libraries throughout the world. As part of its mission to provide services to the library community, the Library of Congress developed the Cataloger’s Desktop (“Desktop”) application, a searchable information delivery system consisting of 300+ pre-selected information sources.
Desktop’s objective is to help library professionals quickly find the information they need to create bibliographic metadata for incoming library resources. It is used by over 10,000 librarians at approximately 1,000 subscribing institutions worldwide. Since its initial release in 1994, Cataloger’s Desktop has evolved into a widely used and authoritative service on the web that allows professional catalogers to work more efficiently with the most up-to-date cataloging information at their fingertips.
Search Technologies has been supporting the Cataloger’s Desktop since 2009 and helping improve the application through the years. This case study describes projects and enhancements we’ve implemented, starting with the most recent updates.
Log Analytics and System Auditing with the Elastic Stack
System logs capture anonymized usage data of the Cataloger’s Desktop system. Those logs are indexed using the Elastic Stack components: Beats, Logstash, and Elasticsearch. Custom reporting components deliver monthly and on-demand reports of system usage to LC.
Further system auditing and reporting deliver notifications of aggregate usage of the system and anomalous events to LC and Search Technologies' Managed Services team, helping them ensure performance and security.
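As a rough illustration of the kind of aggregation a monthly usage report involves, the sketch below counts events per month from anonymized log records in plain Python. It is not the Elastic Stack pipeline described above; the record layout, the event names, and the `monthly_usage` function are hypothetical and chosen only for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical anonymized log records: (ISO timestamp, event type).
# Real Desktop logs will have a different layout.
LOG_RECORDS = [
    ("2019-01-03T09:15:00", "search"),
    ("2019-01-17T14:02:00", "document_view"),
    ("2019-02-01T08:30:00", "search"),
    ("2019-02-01T08:31:00", "search"),
]


def monthly_usage(records):
    """Count events per (YYYY-MM, event type) pair."""
    counts = Counter()
    for timestamp, event in records:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        counts[(month, event)] += 1
    return counts


report = monthly_usage(LOG_RECORDS)
for (month, event), total in sorted(report.items()):
    print(month, event, total)
```

In a search-engine-backed deployment, the same grouping would normally be expressed as an aggregation query against the indexed logs rather than computed in application code.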
Suggestions and Recommendations
We also implemented a Suggestions and Recommendations Service for Desktop, which provides users with type-ahead suggestions of query terms and recommendations of resources that meet their needs. Type-ahead suggestions are generated by matching keystrokes entered during searches. Recommendations are generated by analyzing past usage patterns of users on the system. Apache Spark is used to process application log entries to generate related search terms and related documents that met the needs of previous users.
- Fast type-ahead search as users enter query terms
- NGram, EdgeNGram, and fuzzy matching - match portions of terms and support spell correction
- Customizable relevancy - the relevancy model is tuned for Desktop users, weighting frequent query terms and metadata from the content over less frequently applied terms
- Personalized query suggestions
- Section 508 compliant user experience using jQuery
- Saved searches and scheduled alerts
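To make the edge n-gram and fuzzy matching ideas in the list above concrete, here is a minimal sketch in plain Python. It is not the Desktop implementation: the function names, the sample vocabulary, and the 0.8 similarity cutoff are assumptions made for the example, and the standard library's `difflib` stands in for whatever fuzzy matcher the production service uses.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical vocabulary of query terms (all lowercase).
TERMS = ["cataloging", "cartography", "classification", "metadata", "manuscripts"]


def edge_ngrams(term, min_len=2):
    """Return every leading substring of term with at least min_len characters."""
    term = term.lower()
    return [term[:i] for i in range(min_len, len(term) + 1)]


def build_index(terms, min_len=2):
    """Map each edge n-gram to the set of terms that start with it."""
    index = defaultdict(set)
    for term in terms:
        for gram in edge_ngrams(term, min_len):
            index[gram].add(term)
    return index


def suggest(typed, index, terms, limit=5):
    """Suggest terms for the text typed so far.

    Exact edge n-gram matches are preferred; if there are none,
    fall back to fuzzy matching against whole terms so that a
    misspelled query can still produce a suggestion.
    """
    typed = typed.lower()
    matches = sorted(index.get(typed, ()))
    if not matches:
        matches = get_close_matches(typed, terms, n=limit, cutoff=0.8)
    return matches[:limit]


INDEX = build_index(TERMS)
print(suggest("ca", INDEX, TERMS))         # prefix match
print(suggest("catalogng", INDEX, TERMS))  # misspelling, fuzzy fallback
```

Precomputing the n-grams moves the cost to index time, so each keystroke needs only a single dictionary lookup. That trade of extra index size for lookup speed is the usual reason edge n-grams are used for type-ahead.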
Aspire Text Analytics Components were used to enrich metadata in Desktop to help users find cataloging documentation related to various types of materials such as books, maps, electronic resources, music, or artworks.
Read more about how these new enhancements were implemented and the project outcomes in our blog.
CATALOGER'S DESKTOP 4.0 - 2014 ENHANCEMENTS AND MIGRATION TO SOLR
Search Technologies has supported and hosted Cataloger’s Desktop since 2009, when the solution was powered by FAST ESP and ProPublish. In 2014, Search Technologies worked with the Library of Congress to complete a migration to a new platform based on the open-source search engine Solr and a completely new user interface built from the ground up.
Migration objectives included leveraging newer technologies, migrating to a new hosting platform, and making noticeable improvements to the user experience. Several of the enhancements are based on findings from interviews, focus groups, and surveys with Desktop customers.
Summary of Version 4.0 Improvements
Deployed in 2014, Cataloger's Desktop Version 4.0 introduced the following performance and usability improvements over earlier versions:
- The system was migrated from the legacy FAST ESP search engine to the open-source Solr search engine.
- Solr was deployed and enhanced with Search Technologies' Query Processing Language (QPL).
- Search Technologies’ Aspire Content Processing Framework is used to crawl and index over 300 web and static resources in Desktop.
- An entirely new user interface was deployed, based on requirements gathered from focus group sessions with Desktop customers.
- The user interface was redesigned using responsive design principles to support Desktop on a variety of screen sizes and tablet computers.
- Cataloger’s Desktop 4.0 was also migrated to a cloud-based hosting environment (Amazon EC2), which provides LC flexibility and reliability as the computing needs of the application expand.
- Search Technologies’ Managed Services team provides hosting and support for Desktop 24 hours a day, 7 days per week.
Cataloger’s Desktop subscribers now have a user-friendly interface and experience extremely fast response times. The Library of Congress has a flexible, reliable, and versatile foundation for delivering Cataloger’s Desktop for years to come.
Cataloger's Desktop - Search, Browse and Presentation Services
Search features for Cataloger's Desktop include keyword, wildcard, and phrase search, Boolean operators and nested expressions, spell checking, multi-term synonym support, and hit-highlighting.
Navigable facets for search results include source document, cataloging task, material type, publisher, resource type, language, country and new resources.
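The idea behind navigable facets can be sketched in a few lines of Python: count how many results carry each value of a field, then narrow the result set when a user selects a value. The sample records, field names, and helper functions below are hypothetical; in Desktop, faceting is handled by the search engine and not by application code like this.

```python
from collections import Counter

# Hypothetical search results with two facet fields.
RESULTS = [
    {"title": "Sample resource A", "material_type": "maps", "language": "English"},
    {"title": "Sample resource B", "material_type": "music", "language": "English"},
    {"title": "Sample resource C", "material_type": "maps", "language": "French"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given facet field."""
    return Counter(doc[field] for doc in results if field in doc)


def drill_down(results, field, value):
    """Keep only the results whose facet field equals the selected value."""
    return [doc for doc in results if doc.get(field) == value]


print(facet_counts(RESULTS, "material_type"))
maps_only = drill_down(RESULTS, "material_type", "maps")
print(facet_counts(maps_only, "language"))
```

Recomputing the counts after each drill-down is what lets the facet list stay consistent with the narrowed result set.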
Personalization through the use of bookmarks, saved searches, saved sessions, and shortcuts also makes for an improved user experience.
Browsing the collection through a table of contents allows users to navigate quickly to specific resources and documentation.
Presentation services allow users to view the full text of documents within a web browser or tablet pane. The responsive interface allows users to navigate the contents of a resource, focus a search within its contents, and follow hyperlinks between documents.
CATALOGER'S DESKTOP 3.0: GOING ONLINE WITH IMPROVED SEARCH
The original Cataloger's Desktop consisted of ten LC print publications, which were eventually incorporated onto a CD-ROM. In 2008, the Library of Congress determined that an upgrade to its Cataloger’s Desktop 2.0 service was needed. Objectives of the upgrade were:
- To move the system to a newer Web-based delivery model
- To expand the number of available cataloging resources to include many that were only accessible on the Internet
- To modernize and improve the performance of the basic search and browsing features
LC also recognized that available search and discovery capabilities such as faceted navigation, “more like this,” fuzzy matching, personalization, and advanced relevance ranking techniques would greatly improve the service’s effectiveness for the widest range of users. In addition, the library envisioned many enhancements to the Cataloger’s Desktop user interface to increase productivity for professional catalogers, and planned to incorporate additional data sources such as RSS feeds from the Library of Congress and other appropriate sources. In 2009, the results of this initiative were delivered in version 3.0 of Cataloger's Desktop.
Summary of Version 3.0 Improvements
Version 3.0 was deployed in November 2009 with the following search and discovery enhancements:
- Implementation of FAST ESP search platform
- Fuzzy matching (matching spelling variations and misspellings)
- Finding/excluding similar resources based on an example
- Faceted navigation and dynamic drill-downs, including hierarchical organization of some resources
- Hierarchical table of contents navigation within larger documents
- Contextual analysis
- Advanced relevance ranking techniques
- Search histories
- A search experience that adapts to a user’s search behavior
Search Technologies continues to work with the Library of Congress to ensure that the Cataloger's Desktop application delivers a good user experience in terms of usability and performance. The success of the program is often noted in the many compliments that LC receives from librarians around the world on how invaluable the application is to their jobs.