Open source enables high-volume searches
As data volume explodes, open-source search applications make a move toward the enterprise.
Computerworld - Twitter, Facebook, the Library of Congress -- all of these institutions have mind-numbing amounts of structured and unstructured data that must be indexed and searched quickly. In Twitter's case, that's about 300 million new pieces of information to index every day.
So it's not surprising that such institutions would venture into the seemingly untamed world of open-source search applications, not just for the cost savings, but also for the ability to customize and modify applications quickly. Plus, open source has an active community that can help solve related problems.
But what about other enterprise users? Some 80% of the information in the typical enterprise is now unstructured, including texts, emails, blogs and videos, and that percentage is rising, according to Gartner. All of this data potentially holds value, and today every website is expected to query and produce relevant results as fast as the best Internet search engines. "People need search technology [in] virtually everything they do today. Everybody thinks search [capability] is going to be embedded in everything," says Whit Andrews, an analyst at Gartner.
Right now, most organizations have very constrained search capabilities, which are usually based on SQL queries or specific forms or reports. "That paradigm is soon going to break because the amount of data is just too big, and it's happening much too quickly in a 24/7 environment," he adds.
Enterprises of all types are starting to explore open-source search applications to get a glimpse into their collections of structured and unstructured data. One such product is Lucene Solr, an open-source search platform from the Apache Software Foundation for which Lucid Imagination, a San Mateo, Calif.-based software company, provides commercial support.
Interest in open-source search applications began to take off three years ago. "That's when we saw creation of Lucid Imagination, which formed as a commercial support resource" for open-source software, says Greg Olson, senior director at Olliance Group, an open-source consulting firm and a unit of Black Duck Software. "That's a good indicator of mainstream demand for services or a solution around a raw technology like Lucene."
Make no mistake -- Lucene is for heavy hitters of search, Andrews says. "Lucene matters for people who need a very sophisticated search offering or product. Its typical [user] is a vendor that needs enormous scale in its application of technologies. It's a great place to use Lucene -- you need to be able to search a bazillion things. Where you don't see Lucene used is when an intranet needs a search by next Thursday."
A few other players offer lighter-weight search tools based on the same Lucene open-source technology. For instance, online retailer Zappos.com uses Lucene Solr to handle 63 million customer inquiries each month. But internally, the company deploys the open-source search engine Elasticsearch for "non-website-critical systems or non-performance-bound types of services," says Aye Thu, search team lead.