Hadoop updates search with MapR, Cloudera releases
MapR integrates LucidWorks Search while Cloudera releases its SQL-compliant Impala query engine
IDG News Service - Users of the Hadoop data processing platform now have two more search engines to help them sort through their mountains of information.
Hadoop distributor MapR has integrated LucidWorks Search into its own distribution, and Cloudera has launched the first full release of its open source Impala SQL query engine for Hadoop.
"Using search as the user interface for big data is very interesting. Search is well suited to leveraging a lot of different types of information, especially unstructured information," said Jack Norris, chief marketing officer for MapR. "We're seeing some really interesting applications with search engines at their core, even if a typical user would not think of them as search engine driven."
LucidWorks Search is the commercial version of the open source Apache Lucene/Solr full-text search engine. With the new MapR integration, LucidWorks Search can search through data stored either on the Hadoop Distributed File System (HDFS) or on other file systems.
LucidWorks Search offers snapshots and mirrors for high availability, and eliminates much of the work required to install Lucene/Solr from scratch. It also offers native support for more data sources, a graphical user interface and a security framework.
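For readers unfamiliar with how a Solr-based engine is queried, the following is a minimal sketch of building a full-text query URL against the standard select handler of open source Apache Solr. It is not taken from MapR or LucidWorks documentation; the host, port, and the `products` collection name are assumptions made for illustration, and LucidWorks Search may expose its own endpoints on top of this.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, collection, text, rows=10):
    """Build a URL for Solr's standard /select request handler.

    base_url and collection are hypothetical values chosen for this sketch.
    """
    params = {
        "q": text,     # the full-text query string
        "rows": rows,  # maximum number of results to return
        "wt": "json",  # ask Solr to format the response as JSON
    }
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


# Assumed local Solr instance and collection name, for illustration only.
url = build_solr_query_url("http://localhost:8983", "products", "red shoes")
print(url)
```

The sketch only constructs the request; an application would send it over HTTP and read the matching documents from the JSON response.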
The search engine could be used in a dynamic Web application to quickly retrieve photos, advertising, product recommendations, and other information that can be used to populate Web sites on the fly. "This isn't a lower cost substitute for data warehouses. This is about leveraging new data sources and doing some things that have a dramatic impact on the business," Norris said.
MapR and LucidWorks have been working together on pairing their technologies since 2011, when they formed a joint marketing agreement. Earlier this year, they released a connector that makes it easy to use Lucene/Solr with the MapR Hadoop distribution.
LucidWorks Search works with MapR's newly released M7 distribution, in beta form. In addition to supporting LucidWorks Search, the M7 edition has been re-architected to eliminate compactions and background consistency checks, speeding performance.
Also this week, Cloudera released version 1.0 of Cloudera Impala, an open source SQL-compliant query engine for Hadoop. SQL is the query language used in relational database management systems (RDBMS) and is well known to database administrators.
Impala was designed to execute queries faster than Hadoop's Hive because it doesn't use the MapReduce framework, which requires intermediate results to be written to disk. Instead, users can query data stored in HDFS and HBase directly, either interactively or through batch processes.
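To give a sense of the kind of interactive SQL query described here, the following sketch runs an aggregate query from Python. It uses Python's built-in sqlite3 module purely as a local stand-in, not Impala itself, and the `page_views` table and its rows are hypothetical sample data invented for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for a SQL query engine here;
# this is NOT Impala, only an illustration of an ad hoc SQL query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("US", 120), ("DE", 45), ("US", 30), ("FR", 10)],  # hypothetical rows
)

# A typical interactive question: total views per country, largest first.
rows = conn.execute(
    "SELECT country, SUM(views) AS total "
    "FROM page_views GROUP BY country ORDER BY total DESC"
).fetchall()
print(rows)
conn.close()
```

The same style of SELECT statement is what a database administrator would expect to issue against a SQL-compliant engine, which is the familiarity the article points to.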
Cloudera first released a version of this engine last October as a beta. Since then, the software has been tested by companies such as 37signals and Expedia.
Impala is the core component of the Cloudera Enterprise RTQ (Real-Time Query) supplemental package for the Cloudera Hadoop platform. Impala can be downloaded at no cost.