Big data has potential but requires care

Both data and the tools to manage it are growing, but taking advantage of them requires planning

By Stephen Lawson
December 9, 2011 08:03 PM ET

IDG News Service - The proliferation of large-scale data sets is just beginning to change business and science around the world, but enterprises need to prepare in order to gain the most advantage from their information, panelists said at a Silicon Valley event this week.

So-called "big data" is both a challenge to manage and a tool for competitive advantage, according to speakers at a Churchill Club event on Wednesday night in Mountain View, California. The discussion at the Computer History Museum followed the launch of EMC Greenplum's Unified Analytics Platform, which lets business and IT staffs analyze both structured and unstructured data.

New networked devices and applications are collecting more data than ever and more organizations are holding on to it, creating huge demands for storage. In the second quarter of this year, storage companies shipped 5,429 petabytes of disk capacity, up 30.7 percent from last year's second quarter, IDC reported last week.

 

"Data growth is already faster than both Moore's Law and ... network growth," said Anand Rajaraman, senior vice president of Walmart Global E-Commerce and head of @WalmartLabs. His lab has developed tools for Walmart to take advantage of the new types of data being generated, including applications that collect and analyze information from sources such as Twitter and Facebook to gauge trends and individual consumer preferences.

The benefits of big data stretch beyond business to earth sciences, biology, psychology and other fields, Rajaraman said.

"Science has become more and more about collecting large amounts of data and doing analysis," he said.

Big data can be any volume of data that requires new tools to analyze, said Luke Lonergan, chief technology officer and co-founder of Greenplum, which EMC acquired last year. For example, it would take 27 hours to run a logistic regression algorithm, which can be used to predict the probability of an event, on 30G bytes of data, Lonergan said. If run on 32 computers, the process takes 60 seconds, he said.
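Lonergan's figures can be checked with a little arithmetic, and the idea of spreading a logistic regression across machines can be sketched in a few lines. The Python below is only an illustration under stated assumptions, not Greenplum's implementation: the synthetic data, the function names (`partial_gradient`, `fit`) and the plain gradient-descent training loop are all hypothetical, and the 32 "machines" are simulated as 32 list partitions in a single process.

```python
import math
import random

def sigmoid(z):
    """Logistic function: maps a score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(rows, w, b):
    """Log-loss gradient contribution from one partition of (x, y) rows."""
    gw, gb = 0.0, 0.0
    for x, y in rows:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb

def fit(partitions, steps=500, lr=0.5):
    """Gradient descent where each step sums per-partition gradients.

    In a real cluster each partial_gradient call would run on a separate
    machine; here they run one after another (an assumption of this sketch).
    """
    n = sum(len(p) for p in partitions)
    w, b = 0.0, 0.0
    for _ in range(steps):
        grads = [partial_gradient(p, w, b) for p in partitions]
        w -= lr * sum(g[0] for g in grads) / n
        b -= lr * sum(g[1] for g in grads) / n
    return w, b

# Synthetic data (hypothetical): an event whose true log-odds are 2x - 1.
random.seed(0)
data = []
for _ in range(3200):
    x = random.uniform(-3, 3)
    data.append((x, 1 if random.random() < sigmoid(2.0 * x - 1.0) else 0))

# Split the rows across 32 simulated machines, as in Lonergan's example.
partitions = [data[i::32] for i in range(32)]
w, b = fit(partitions)

# Speedup implied by the quoted figures: 27 hours versus 60 seconds.
serial_seconds = 27 * 3600
parallel_seconds = 60
speedup = serial_seconds / parallel_seconds
print(f"fitted w={w:.2f}, b={b:.2f}, implied speedup={speedup:.0f}x")
```

The quoted figures imply a speedup of 1,620 times on 32 computers. That is far more than the 32-fold gain that adding machines alone would give, so other factors must have differed between the two runs; the article does not say which.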

"'Bigger than previous-generation, non-parallel infrastructure could handle' might be a useful definition. Anything that blows you out of the old way of doing things," Lonergan said.

Analyzing data has also gotten harder, not only because there is more of it but because it comes from new sources, panelists said. Blogs, Web comments and other information come in the form of unstructured data, which can't be crunched the way data in relational databases is. The need to mine different types of content has led to new data analysis platforms, most notably the open-source Hadoop framework, which grew out of techniques published by Google and has been adopted by companies such as Facebook.

The market for new tools to manage and exploit big data is still growing, said Ping Li, who heads the Big Data Fund at venture capital company Accel Partners.

Reprinted with permission from IDG.net. Story copyright 2012 International Data Group. All rights reserved.