Federated Search 101

Alexis Linoski and Tine Walczyk give a nuts-and-bolts overview of exactly what to look for when considering a federated search tool

Federated search is a starting point. It’s a launching pad for users, a tool to help them identify the databases that are best suited to the subjects they are researching. It has other names as well—metasearch, parallel search, broadcast searching. It allows users to search across multiple resources: subscription databases, library catalogs, and websites. In fact, chances are you’ve already used something just like it.

Google Scholar is a federated search-like tool that many librarians are now embracing rather than fearing. WorldCat.org is a more library-oriented example that can search all resources (articles, books, journals, DVDs, and CDs) from one box. While Google Scholar doesn’t offer much beyond a results list, WorldCat’s “Refine Search” column uses faceting to guide users in further narrowing their search. WorldCat even offers plug-ins for Facebook and Firefox, allowing users to connect to resources through a friend’s profile, or search for items directly from a web browser search bar.

Many vendors also offer a similar service for searching across their subscription databases. Some examples are EBSCO, CSA, and ProQuest. In fact, many of the subscription vendor products, such as ProQuest Research Library, can have a default setting of “search all databases,” which many librarians prefer. However, these subscription vendors frequently have an assortment of disparate databases on a variety of topics, everything from the most general of academic databases to ultra-specific topical databases in business and the sciences. Searching dozens of substantially different databases is beneficial in many contexts, but librarians need to keep in mind both the strengths and the weaknesses of the federated search approach, and many fail to do so (see Melissa Rethlefsen’s “Easy ≠ Right,” p. 12, for more on these concerns). Their assumption seems to be that the more you can search at one time, the better the results for the user, which is not always the case.

On the market

Today’s marketplace has a number of highly visible products. You’ve probably heard of several of the more common federated search vendors and products, such as 360 Search by Serials Solutions, which recently acquired WebFeat as well, and MetaLib by Ex Libris. But what you probably don’t know is that there are more than a dozen federated search products on the market. Furthermore, many vendors also license their products or connectors to other vendors or resell another product. An example of this is EBSCO’s licensing of WebFeat technology. (See the Link List, p. 4, for vendors.)

Last year at the American Library Association annual conference in Washington, DC, we visited the majority of these vendors. We had not heard of many of them, but after visiting with them, we realized why. Many are strong in either academic or public libraries. Since, until recently, we both worked in academic libraries, our knowledge of vendors popular in the public library sector was limited. Nonetheless, we jumped right in and got up to speed. We couldn’t find a product we didn’t like. They all have their strengths, and the right choice will boil down to the features best suited for your library and users.

Apples to apples

When considering a federated search product, it is extremely important to evaluate the “must-haves” before purchasing. All too often people get sidetracked by a flashy interface or a feature that seems “unique,” or they just go with an existing vendor because it’s easier. In fall 2007, we conducted a survey of librarians familiar with federated searching to discover what product features were most important to them. The top five features reported were limiting results by source, date range, full text, or peer review; the speed of results retrieval; the appearance of the interface; sorting of results; and deduplication.

Also, when starting the product comparison process, make sure to compare apples to apples. Some vendors offer both a federated search product and a discovery tool. Discovery tools, such as Encore from Innovative Interfaces and Primo from Ex Libris, incorporate Web 2.0 tools, federated searching, and portal elements. As a result, they are typically more full-featured and more expensive. Although these are excellent tools, they are not really comparable point by point with single-purpose federated search engines.

All of the current generations of federated search products basically do the same thing but do it in different ways. Before selecting a product, consider how it will be used. Take into account the library staff, faculty (if an educational library), and your users. Different libraries will have different needs, and there are some basic features that should be examined when evaluating any federated search product. These features should be looked at from two different perspectives—that of the librarian-administrator and that of the end-user.

Managing the tool

Generally, federated search products include an administration module, which is where the library can customize the product, add or delete connectors, brand the tool, and more. Bear in mind the ease of use of this module. Some products offer the option to have the vendor make such changes and may not feature the administration module. Think about how you envision your library using federated search and the benefit or cost of not being able to do your own customization.

Many products allow you to create predefined subject categories, so the library can group resources by subject. Users can then select the category and search all resources rather than searching everything or having to take the additional step to determine exactly which individual databases and resources fit their search needs. Many products also offer the information icon, which users can click on to get descriptions of a database.
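To make the subject-category idea concrete, here is a minimal sketch of how such a grouping might be represented and used. It is an illustration only, not any vendor's configuration format: the category names, the groupings, "Business Database A," and the `resources_for` helper are all hypothetical.

```python
# Hypothetical subject categories an administrator might define.
# The category names and groupings are illustrative assumptions.
SUBJECT_CATEGORIES = {
    "General": ["Academic Search Elite", "ProQuest Research Library", "Library Catalog"],
    "Business": ["ProQuest Research Library", "Business Database A"],
}


def resources_for(category, categories=SUBJECT_CATEGORIES):
    """Return the resources to search for a category.

    An unknown category falls back to every distinct resource,
    which mirrors a "search everything" default.
    """
    if category in categories:
        return list(categories[category])
    seen = []
    for names in categories.values():
        for name in names:
            if name not in seen:  # keep first-seen order, skip repeats
                seen.append(name)
    return seen
```

A user who picks "Business" would search only the two resources in that group, while an unrecognized choice falls back to the full list.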

Consider the visual customizations available through the administration interface. Can you make the tool appear as a very simple one-box “Googleized” search, or does it look like a database? Find out what your branding options are, to connect the product with your library and your other resources. How easy is it to do the branding, and how intuitive?

Usage statistics give concrete answers about what your users are doing with a federated search product once it’s installed. What kind of usage statistics does the product offer, such as sessions, search terms, etc., and how are they provided? Can you access them from the administration module on demand, or must they be obtained from the vendor? There are a number of standards available for reporting and working with these statistics, so make sure to ask if the product is compliant with protocols like COUNTER (Counting Online Usage of Networked Electronic Resources) and SUSHI (Standardized Usage Statistics Harvesting Initiative).

The user experience

With any federated search product, it’s important that a user is able to log on and immediately know what to do. Consider the interface from the user’s perspective, and determine whether the basic functions of the tool are well designed or hidden and clumsy. Make sure to figure out how the user can narrow a search. What fields can be searched? Commonly sought search fields include subject/keyword/descriptor, title, abstract, and author. What limiters can be set? Commonly sought limiters include date range, peer-reviewed, and full text only.

Is there a Search Progress indicator? This can be anything that lets the user know the search is in progress. Federated search tools can return results from dozens of databases that all respond at different speeds, from mere milliseconds up to many seconds, so be sure to think about what the user is doing while this happens—must users wait to view all matches while the search is in progress, or can they start viewing and working with results immediately?
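The difference between waiting for everything and viewing results as they arrive can be sketched in a few lines. The example below is a simulation under stated assumptions: the two sources, their delays, and the `search_source` and `federated_search` functions are invented for illustration, and a real connector would query a remote database rather than sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Simulated sources: name -> (response delay in seconds, canned hits).
# These names and delays are assumptions, not real products.
SOURCES = {
    "Fast Database": (0.01, ["fast hit 1", "fast hit 2"]),
    "Slow Database": (0.3, ["slow hit 1"]),
}


def search_source(name, query):
    """Stand-in for a real connector: wait, then return canned results."""
    delay, hits = SOURCES[name]
    time.sleep(delay)
    return name, [f"{hit} for '{query}'" for hit in hits]


def federated_search(query):
    """Send the query to every source at once and yield results as each finishes.

    Yields (source name, results, sources finished, total sources), which is
    enough to drive a simple "2 of 5 databases searched" progress indicator.
    """
    total = len(SOURCES)
    with ThreadPoolExecutor(max_workers=total) as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        for done, future in enumerate(as_completed(futures), start=1):
            name, results = future.result()
            yield name, results, done, total
```

Looping over `federated_search("copyright")` hands back the fast source's hits first, so a user could begin reading them while the slower source is still responding.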

The literal listing and display of the results are also extremely important. Does the product indicate, either in the results list itself or in the side toolbar, in which resource the matches reside? For example, “Academic Search Elite (29)” would indicate 29 results from Academic Search Elite. This nice feature helps users learn which resources commonly have materials in their subject area.
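A per-source tally like this is simple to produce once each result carries the name of the resource it came from. The sketch below assumes each result is a dictionary with a `source` key; that record shape and the `source_counts` helper are hypothetical.

```python
from collections import Counter


def source_counts(results):
    """Tally matches per resource and format them like 'Academic Search Elite (29)'."""
    counts = Counter(record["source"] for record in results)
    # most_common() lists the resources with the most matches first.
    return [f"{source} ({n})" for source, n in counts.most_common()]
```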

Also, think about how the results are returned to the user, and how they can be sorted and reordered to suit the user’s needs. Talk to the product vendor about how relevancy ranking is applied to the results from disparate databases. Along similar lines, be sure to figure out exactly how the product deals with duplicate hits from different databases. Are duplicate hits indicated, and how are they removed from the final results list? Once the deduplicated results are delivered, consider how much control the user has over them. Can they be sorted according to any criteria or be manipulated in other ways?
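For a sense of what deduplication and sorting involve, here is one simple approach, offered as a sketch rather than a description of how any product works. It assumes each record is a dictionary with `title`, `year`, and `source` keys, and it treats two records as duplicates when their normalized title and year match; real products may use other matching rules, which is exactly what to ask the vendor about.

```python
def normalize(text):
    """Lowercase and drop punctuation and extra spaces so near-identical titles match."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def deduplicate(records):
    """Merge records sharing a normalized title and year, remembering every source."""
    merged = {}
    for record in records:
        key = (normalize(record["title"]), record["year"])
        if key in merged:
            if record["source"] not in merged[key]["sources"]:
                merged[key]["sources"].append(record["source"])
        else:
            merged[key] = {
                "title": record["title"],
                "year": record["year"],
                "sources": [record["source"]],
            }
    return list(merged.values())


def sort_results(records, by="year", descending=True):
    """Reorder results by a chosen field, such as 'year' or 'title'."""
    return sorted(records, key=lambda record: record[by], reverse=descending)
```

Keeping the list of sources on each merged record means a duplicate can still be shown as available from more than one database instead of silently disappearing.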

Finally, weigh any bells and whistles. Common additional features offered by vendors vary but may include search alerts, RSS feeds, and the ability to save searches and create citations in a user’s profile.

Where will the tool live?

There are generally two ways to approach licensing a federated search product. One is to go with a hosted solution whereby the vendor hosts the product on its servers, and access is accomplished via IP authentication. If you’re in a streamlined organization and have no in-house technical support, then a hosted application is probably best. The library still has the ability to make changes via the administration module (if available), but maintenance is handled by the vendor.

The second solution is to host the software locally; almost all vendors offer the option to install their software on your library’s server. However, this will require that you have the IT support and knowledge to maintain the product, including installing upgrades. Make sure to calculate these efforts in terms of both time and money as you decide whether to host the product on your own machines or not.

Pricing

Most federated search products are priced by the number of connectors, usually in ranges, e.g., 1–39, 40–60, etc. A connector is any resource you feature in your product, including free resources such as library OPACs, Google Scholar, and MSN Live. We recommend reviewing the resources your library plans to incorporate in a federated search tool before deciding on a product, including any you might plan to buy. Count each resource as a connector and take that number to discussions with the vendor. Also, think about whether you want to include free resources. You may, depending on your user population and how you plan to implement your federated search tool. We are also advocates of opening up other library OPACs, especially if they are local and the library can be accessed by your users, such as other colleges, universities, or public libraries with which your institution reciprocates.
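The connector count is easy to work out ahead of a vendor conversation. The sketch below counts distinct resources and reports which range the total falls into. The first two ranges come from the example above; the open-ended third range and the `connector_tier` helper are assumptions, since actual ranges vary by vendor.

```python
# Connector ranges. The first two match the example in the text;
# the open-ended third range is a placeholder assumption.
TIERS = [(1, 39), (40, 60), (61, None)]


def connector_tier(resources):
    """Count distinct resources as connectors and return (count, range label)."""
    count = len(set(resources))  # a resource listed twice is still one connector
    for low, high in TIERS:
        if count >= low and (high is None or count <= high):
            label = f"{low}+" if high is None else f"{low}-{high}"
            return count, label
    return count, "no connectors"
```

Running it with and without the free resources you are considering shows quickly whether they would push you into a higher range.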

Open-source options

If you have strong IT support, going with an open-source product may be a feasible option. Open-source products provide comparable features and zero purchase costs. But without the necessary IT support to install the software and get it up and running, you will most likely have to outsource that aspect of it, which is not free. Yet, in many cases, the cost of the up-front work for installation and setup and then for regular maintenance or upgrades could be cheaper than a subscription-based product. It certainly wouldn’t hurt to run the numbers. To date, the products most widely recognized in this arena are LibraryFind from Oregon State University Libraries, dbWiz from Simon Fraser University, and Pazpar2 from Index Data.
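Running the numbers can be as simple as comparing totals over the period you expect to keep the tool. The sketch below is one way to frame that arithmetic; the `total_cost` and `compare` functions are hypothetical, and any figures you feed in should come from your own vendor quotes and IT estimates.

```python
def total_cost(upfront, annual, years):
    """Total cost over a period: one-time costs plus recurring costs per year."""
    return upfront + annual * years


def compare(subscription_annual, oss_setup, oss_annual_maintenance, years):
    """Return (subscription total, open-source total, cheaper option)."""
    subscription = total_cost(0, subscription_annual, years)
    open_source = total_cost(oss_setup, oss_annual_maintenance, years)
    if open_source < subscription:
        cheaper = "open source"
    elif subscription < open_source:
        cheaper = "subscription"
    else:
        cheaper = "tie"
    return subscription, open_source, cheaper
```

Because setup is paid once and maintenance recurs, the answer can flip depending on the time span, so it is worth trying more than one value for `years`.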

When investigating an open-source solution, the key items to analyze are the IT administrative tasks involved. Are regular data dumps and reindexing necessary? What hardware and bandwidth capacities will you need? Will downloading and emailing of results be available, and do you have someone who can set up the needed security profiles on your server to protect your library adequately? All of these questions are part of the larger issue of what happens to the accuracy and timeliness of your search results if no one is available or knows how to keep your search engine running (for more on open source options, see Karen Coombs’s Last Byte column, p. 24).

Organizational considerations

The federated search product itself is only a small part of the equation, probably less than half when judging this offering. The biggest part of the equation is the mindset of your organization, which raises several questions to answer before shopping. Why do you want a federated search product? Is it because you think you should have one? If it’s because everyone else has one, consider whether that is the right reason. If it’s because you think it will help your users, then continue. The goal of federated search is to provide tools to aid the user in the discovery of resources.

A good federated search implementation requires staff support, so be sure to keep in mind what your librarians think of federated search. If they love it, great. If they hate it, try to address their concerns. Look at other library implementations and at web-based federated search products. Librarians and other staff must be behind the product; students and faculty may figure it out, but without librarians actively supporting and teaching it, it’s more likely to fail than succeed.

Determine exactly what your IT and technical capabilities are. With a federated search implementation, this is likely to be one of the biggest obstacles. How tech-savvy are your librarians? If you have one or two who don’t mind being hands-on, then go for a product that allows you to do the initial setup and tweaking if needed. If your librarians aren’t tech-savvy or don’t want to take this on, then shop for a product that will do the setup and tweaking for you. Remember, though, there will be a lag time for any changes in this type of arrangement.

Also, think about how a federated search tool will fit into your user instruction and information literacy sessions. The best way to promote the use of this tool and set expectations with users is to incorporate it into instructional sessions with users on a wide scale.

Different approaches

In academia, each database is taught with a special focus on what is appropriate for the discipline of the audience. In public libraries, the philosophy is a little different. The goal is to connect the user with the right information using tools or resources appropriate for the patron. The technological savvy of patrons in our public libraries varies greatly. Those already comfortable with the Internet will start their research with a web search and only resort to a librarian if they aren’t successful. Those uncomfortable with technology look to the librarian to make their task easier. Both categories can be best served with a comprehensive tool such as a federated search engine. Public libraries continually battle to increase their database usage. The largest barriers to this, as expressed by our colleagues, are the overabundance of different interfaces and the difficulty of knowing with which database to even begin.

The Texas State Library and Archives have put their efforts into building a federated search product. It’s available through the Library of Texas (www.libraryoftexas.org) and gives users access to all Z39.50-accessible libraries in Texas as well as the collection of databases made available to all public, academic, and K–12 libraries in the state (these are commonly known as TexShare and the K–12 Databases). In addition to the purpose of a federated search engine—to provide one interface to ILS and database information—the Library of Texas also addressed a key factor for any public library: cost. Although slow to progress with bells and whistles, version 2.0, which includes the long-awaited faceted or categorized results, has been released. Such local and/or consortial efforts, which address user needs while satisfying the cost factor, will bring the federated search to the masses.

Making the right choice

The most common doubts expressed about federated search seem to be that the tools are not as good as the native interfaces and that they can’t replace actually searching individual databases. To us, the federated search is not meant to replace, or even enhance, a database’s native interface.

Ultimately, it’s another research tool. It will work for some and not for others. While as librarians we may have doubts, if federated search works for the users, then it’s worth having. No, the products aren’t perfect, but then few products ever are.

Source: http://www.libraryjournal.com/article/CA6571320.html
