
Narrowing the search

Companies are looking for search products tailored to their needs

By Sue Hildreth
January 24, 2005 12:00 PM ET

Computerworld - To most end users and even many IT managers, the term enterprise search implies little more than a keyword text box on the company Web site, as opposed to one on the public Internet. Five years ago, that was basically correct. Today, however, search is a lot more than keywords.
"There's a whole spectrum of technologies that fall under the big umbrella called 'search,' " says Hadley Reynolds, a research analyst at Delphi Group. "They range from simple keyword search to taxonomy classification and categorization to text analytics."
For instance, researchers at the Stanford Linear Accelerator Center (SLAC) in Menlo Park, Calif., recently needed a search tool to help them index and navigate an internal newsgroup with 600-plus posts per day. They needed a tool that was customizable and capable of handling the large volume of posted messages. After evaluating a number of commercial and open-source search products, SLAC chose the open-source Swish-e tool for both its speed and its low cost.
"It doesn't do all that Google does ... but it turned out to be the fastest index engine," says Douglas Smith, an experimental support professional at SLAC. Smith notes that internal search requires different capabilities than public Internet search, where users don't know anything about the content they're searching.
"For indexing libraries, catalogs, help texts, source code repositories, newsgroups, etc., where the source is known, you want to rank things by the content rather than by comparison of the source links. And that's what Swish-e offers," Smith explains.
SLAC's use of Swish-e is a basic application of search technology in the enterprise. Higher-end search tools, however, offer a more diverse range of features and functions.
Enterprise search applications all start with the ability to search unstructured content, such as PDF files, Word documents, Web pages and other information not contained in a relational database. They include a search engine and are able to rank results by relevance. And most also provide a way to customize both the results ranking and the indexing process, enabling organizations to place greater weight on characteristics of importance to them, such as the source or type of content.
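To make the idea of customizable relevance ranking concrete, here is a minimal Python sketch. It is not how Swish-e, Ultraseek or any other product named in this article works; the document fields, the `type_weights` parameter and the weight values are all hypothetical, chosen only to illustrate how an organization might place greater weight on one type of content.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc, type_weights):
    """Count how often the query terms occur in the document text,
    then scale the count by a weight for the document's content type.
    Types missing from type_weights get a neutral weight of 1.0."""
    counts = Counter(tokenize(doc["text"]))
    term_hits = sum(counts[term] for term in tokenize(query))
    return term_hits * type_weights.get(doc["type"], 1.0)


def search(query, docs, type_weights=None):
    """Return the ids of matching documents, highest score first.
    Documents with no query terms at all are left out."""
    weights = type_weights or {}
    scored = [(score(query, doc, weights), doc["id"]) for doc in docs]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in matches]


# Hypothetical sample collection, for illustration only.
docs = [
    {"id": "a", "type": "html", "text": "beam line status beam"},
    {"id": "b", "type": "pdf", "text": "beam report"},
    {"id": "c", "type": "newsgroup", "text": "lunch menu"},
]

print(search("beam", docs))
print(search("beam", docs, {"pdf": 3.0}))
```

With no weights, the page that mentions the query term more often ranks first. Passing a higher weight for PDF content moves the PDF report ahead of it, which is the kind of tuning the customization features described above are meant to allow.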
SLAC also uses Verity Inc.'s Ultraseek search engine to index and provide access to SLAC's 500,000 pages of research, administrative information and other HTML and PDF content. The ability to customize the rules for indexing content sped up the process, says Web information manager Ruth McDunn. "It used to take a month to update the collection. Now we can update every day or, at most, every week," she says.
Ultraseek, like many enterprise search


