Start-up Cloudera is introducing a set of applications on Friday for working with Hadoop, the open-source framework for large-scale data processing and analysis.
Cloudera, which provides Hadoop support to enterprises, developed the new browser-based application suite to simplify the process of using Hadoop, according to CEO Mike Olson.
"It's an easy-to-use GUI suitable for people who don't have a lot of Hadoop expertise," Olson said. "The big Web properties with sophisticated and talented PhDs have been successful [with it], but ordinary IT shops ... have had a harder time."
Hadoop is known for its behind-the-scenes role crunching oceans of information for Web operations like Facebook and Yahoo. It allows an application workload to be spread over clusters of commodity hardware, and also includes a distributed file system.
But although the technology is "at its best" when data volumes get into multiple terabytes, Hadoop has relevance for a wide variety of companies, according to Olson. "It's increasingly easy to get your hands on that much data these days," especially from machine-generated information like Web logs, he said.
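As a rough illustration of spreading a workload across a cluster, the sketch below counts HTTP status codes in a handful of Web-log lines by splitting the input into chunks, counting each chunk independently, and merging the partial results. It is a single-process Python sketch of the general map-and-reduce idea and does not use Hadoop's own API. The log format, the sample data, and every function name are assumptions made up for this example.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample of Web-log lines (illustrative data, not from the article).
LOG_LINES = [
    '10.0.0.1 - - "GET /index.html" 200',
    '10.0.0.2 - - "GET /missing" 404',
    '10.0.0.1 - - "GET /about.html" 200',
    '10.0.0.3 - - "POST /login" 500',
    '10.0.0.2 - - "GET /index.html" 200',
]


def split_into_chunks(lines, n_chunks):
    """Partition the input so each chunk could be handed to a different node."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_chunk(lines):
    """Map step: count status codes in one chunk, independently of the others.

    Assumes the status code is the last space-separated field on each line.
    """
    return Counter(line.rsplit(" ", 1)[-1] for line in lines)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_status_codes(lines, n_chunks=3):
    """Run the map step per chunk, then fold the partial counts together."""
    partials = [map_chunk(chunk) for chunk in split_into_chunks(lines, n_chunks)]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(dict(count_status_codes(LOG_LINES)))
```

Because each chunk is counted on its own, the map step could in principle run on separate machines. In a real cluster the chunks would be blocks of a file in the distributed file system, not slices of an in-memory list.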
The browser-based application set is supported on Windows, Mac and Linux, and includes four modules: a file browser; a tool for creating, executing and archiving jobs; a tool for monitoring the status of jobs; and a "cluster health dashboard" for keeping tabs on a cluster's performance.
Cloudera and its partners are fine-tuning the suite, which is now in beta, before issuing a general release.
Hadoop needs many more tools like it, according to analyst Curt Monash of Monash Research.
"If Hadoop is to consistently handle workloads as diverse and demanding as those of [massively parallel processing] relational DBMSes, it needs a lot of tools and infrastructure," Monash said via e-mail. "The three leaders in developing those are Yahoo, Cloudera and Facebook. There's a long way to go."