
FAQ: Why is enterprise search harder than Google Web search?

Where format complications meet inflated user expectations

By Eric Lai
January 11, 2008 12:00 PM ET

Computerworld - More than a few eyebrows were raised in early January when Microsoft Corp. said it would spend $1.2 billion in cash to buy enterprise search provider Fast Search & Transfer ASA. But Jeffrey Raikes, then Microsoft Business Division president, went further and claimed that Fast is better than Google Inc. when it comes to searching "behind the firewall."

Computerworld decided to investigate that bold claim and to answer all of the other questions that have popped up in our brains since the challenge.

What exactly does enterprise search do?

Enterprise search software helps company employees find information stored in their corporate networks and PCs in whatever form it's in -- documents, e-mails, spreadsheets, internal Web pages and so forth. Imagine something like Google Desktop or Windows Desktop Search, but indexing an entire company's worth of content.
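
To make the idea of "indexing" concrete, here is a minimal Python sketch of an inverted index, the general technique of mapping each word to the items that contain it. It is an illustration only, not a description of how Fast, Google or any other product works; the item names and the `build_index` and `lookup` helpers are hypothetical.

```python
from collections import defaultdict


def build_index(items):
    """Map each lowercase word to the set of item ids that contain it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for word in text.lower().split():
            index[word].add(item_id)
    return index


def lookup(index, word):
    """Return the ids of items containing the word, in sorted order."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical content drawn from different sources: e-mail, a spreadsheet, a wiki page.
CONTENT = {
    "mail/1042": "Budget review moved to Friday",
    "docs/budget.xls": "budget totals by region",
    "wiki/holidays": "Office holidays for 2008",
}
INDEX = build_index(CONTENT)

print(lookup(INDEX, "Budget"))
```

The index is built once and each query is then a dictionary lookup, so a search does not need to reopen every file.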

Large relational database vendors have long argued that stuffing as many of your documents as possible into a database is the way to go. Hence, the ongoing war of words between Oracle Corp. and IBM over whose database software provides faster storage and retrieval of XML data.

But enterprise search software from vendors such as Fast, Autonomy Corp. or Endeca Technologies Inc. lets you go the other way and search for information inside a database, whether it is stored in unstructured binary large object ("Blob") form or, in the case of numbers, in individual cells.

Search software is actually faster than running a SQL query to find data in a database, though it can't manipulate or numerically analyze the data, according to Yves Schabes, co-founder and president of Teragram Corp., a Cambridge, Mass.-based enterprise search vendor.

If I can use Google, can I easily learn to use enterprise search software?

Probably. Most enterprise search software today displays a single search box into which a user can enter keywords joined by Boolean operators such as AND and OR. After getting a set of results, users can then look to the side for drop-down menus that narrow the search by what Schabes calls "facets," such as information source, country or date.
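
The keyword-then-facet flow described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any vendor's implementation: the `Doc` fields, the sample documents and the `search` function are all hypothetical, and facets are treated as simple exact-match filters.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    title: str
    text: str
    source: str    # hypothetical facet: where the document lives
    country: str   # hypothetical facet
    created: date  # hypothetical facet


def matches(doc, all_of=(), any_of=()):
    """True if doc contains every AND term and, if any are given, at least one OR term."""
    words = set(doc.text.lower().split())
    has_all = all(term.lower() in words for term in all_of)
    has_any = not any_of or any(term.lower() in words for term in any_of)
    return has_all and has_any


def search(docs, all_of=(), any_of=(), **facets):
    """Run the keyword match first, then narrow the hits by exact facet values."""
    hits = [d for d in docs if matches(d, all_of, any_of)]
    return [d for d in hits if all(getattr(d, k) == v for k, v in facets.items())]


DOCS = [
    Doc("Q3 forecast", "sales forecast for europe", "email", "DE", date(2007, 9, 1)),
    Doc("Q3 budget", "budget and sales plan", "intranet", "US", date(2007, 8, 15)),
    Doc("Travel policy", "travel expenses policy", "intranet", "US", date(2006, 1, 10)),
]

# Keyword search, then the same search narrowed by the "source" facet.
print([d.title for d in search(DOCS, all_of=["sales"])])
print([d.title for d in search(DOCS, all_of=["sales"], source="intranet")])
```

Keeping the facet step separate from the keyword step mirrors the user experience: the query stays the same and each menu selection only shrinks the result set.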

What kinds of information does search software have difficulty finding?

Enterprise search software tends to be bad at searching information that has already been offloaded to tape archives, according to Schabes. For that, companies still tend to rely on specialized e-discovery and storage management tools.

Enterprise search also has problems handling multimedia such as podcasts, pictures and video files. Metadata is usually scarce or not useful. Those files still need to be transcribed or processed by speech-to-text software to be indexable by enterprise search software.

In addition, enterprise search software isn't good at filtering out multiple versions of the same document, Schabes says. This data cleansing, data de-duplication or master data management is already an established field in the structured relational database realm. But tools are slow to emerge in the unstructured enterprise search arena, he says.
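
To show why multiple versions are awkward to filter, here is a rough Python sketch that flags pairs of documents whose wording overlaps heavily. It uses Jaccard similarity over word sets, a common textbook measure chosen here for illustration; it is not a technique attributed to Schabes or to any product, and the file names, sample text and 0.8 threshold are assumptions.

```python
def word_set(text):
    """Lowercase the text and return its distinct words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    sa, sb = word_set(a), word_set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def likely_versions(docs, threshold=0.8):
    """Return pairs of document names whose word overlap meets the threshold."""
    names = sorted(docs)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if jaccard(docs[first], docs[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical files: two drafts of one plan and an unrelated document.
FILES = {
    "plan_v1.doc": "the launch plan covers pricing support and training",
    "plan_v2.doc": "the launch plan covers pricing support training and partners",
    "menu.doc": "cafeteria menu for the week",
}

print(likely_versions(FILES))
```

Comparing every pair grows quadratically with the number of documents, and a fixed threshold will both miss heavily revised drafts and flag unrelated boilerplate, which hints at why this remains a hard problem for unstructured content.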
