Open source enables high-volume searches
As data volume explodes, open-source search applications make a move toward the enterprise.
Computerworld - Twitter, Facebook, the Library of Congress -- all of these institutions have mind-numbing amounts of structured and unstructured data that must be indexed and searched quickly. In Twitter's case, that's about 300 million new pieces of information to index every day.
So it's not surprising that such institutions would venture into the seemingly untamed world of open-source search applications, not just for the cost savings, but also for the ability to customize and modify applications quickly. Plus, open source has an active community that can help solve related problems.
But what about other enterprise users? Some 80% of the information in the typical enterprise is now unstructured, including texts, emails, blogs and videos, and that percentage is rising, according to Gartner. All of this data potentially holds value, and today every website is expected to query and produce relevant results as fast as the best Internet search engines. "People need search technology [in] virtually everything they do today. Everybody thinks search [capability] is going to be embedded in everything," says Whit Andrews, an analyst at Gartner.
Right now, most organizations have very constrained search capabilities, which are usually based on SQL queries or specific forms or reports. "That paradigm is soon going to break because the amount of data is just too big, and it's happening much too quickly in a 24/7 environment," he adds.
Big-time Search
Enterprises of all types are starting to explore open-source search applications to get a glimpse into their collections of structured and unstructured data. One such product is Lucene Solr, an open-source search platform from the Apache Software Foundation that is commercially supported by Lucid Imagination, a San Mateo, Calif.-based software company.
Interest in open-source search applications began to take off three years ago. "That's when we saw creation of Lucid Imagination, which formed as a commercial support resource" for open-source software, says Greg Olson, senior director at Olliance Group, an open-source consulting firm and a unit of Black Duck Software. "That's a good indicator of mainstream demand for services or a solution around a raw technology like Lucene."
Make no mistake -- Lucene is for heavy hitters of search, Andrews says. "Lucene matters for people who need a very sophisticated search offering or product. Its typical [user] is a vendor that needs enormous scale in its application of technologies. It's a great place to use Lucene -- you need to be able to search a bazillion things. Where you don't see Lucene used is when an intranet needs a search by next Thursday."
A few other players offer lighter-weight search tools based on the same Lucene open-source technology. For instance, online retailer Zappos.com uses Lucene Solr to handle the 63 million customer inquiries it receives each month. But internally, the company deploys the open-source search engine Elasticsearch for "non-website-critical systems or non-performance-bound types of services," says Aye Thu, search team lead.