Opinion: Grid an option for data management challenges
Computerworld - With EMC Corp.'s acquisition of Acxiom Corp.'s grid computing software for $30 million last month (see "EMC Partners With Acxiom to Build Grid-based BI Systems"), enterprise customers started opening their eyes to the fact that grid is not just about raw horsepower and CPU utilization for high-performance computing environments.
So what was it that Acxiom did so well with its grid environment that caught EMC's attention? To put it simply: data management.
Acxiom has a very popular data-integration application called AbiliTec. The company took the "scale out" commodity hardware route to support a growing number of transactions (as Google Inc. and Amazon.com Inc. have done) and then built its own grid software to manage this new environment. In an article on Acxiom's environment last year, Computerworld reported that its grid had grown to 6,000 Linux nodes, processing more than 50 billion AbiliTec transactions per month (see "Case Study: Acxiom Corp.'s Homegrown Grid").
Performance and reliability have been at the heart of Acxiom's data management grid story, but there are some other very specific enterprise data challenges where grid has already been used in research and science. Today, enterprises are increasingly evaluating the capabilities of grid infrastructure to resolve data management issues that go above and beyond data processing horsepower.
Transporting Massive Amounts of Data
Your typical enterprise is probably not going to be dealing with data at the petabyte (1 quadrillion bytes) level any time soon, as particle physicists in the online science realm do today.
However, many commercial entities do transport enormous files on a daily basis. Consider cases like the British Broadcasting Corp., where one hour of preprocessed high-definition broadcast averages about 280 gigabits. These organizations are working with grid technologies today to make their data assets accessible to field reporters and users across a distributed network.
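To put the 280-gigabit figure in perspective, here is a rough back-of-the-envelope sketch of how long one hour of such footage would take to move across a network. The link speeds and the `efficiency` parameter are illustrative assumptions, not figures from the BBC or any vendor, and real transfers would also be affected by latency, protocol overhead and contention.

```python
# Rough transfer-time estimate for one hour of preprocessed HD footage.
# The 280-gigabit size comes from the article; the link speeds and the
# efficiency factor are hypothetical values chosen for illustration.

GIGABIT = 10**9  # decimal gigabit, expressed in bits


def transfer_seconds(size_bits, link_bps, efficiency=1.0):
    """Seconds needed to move size_bits over a link rated at link_bps.

    efficiency is the assumed fraction of the rated speed actually
    achieved (0 < efficiency <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return size_bits / (link_bps * efficiency)


hour_of_hd = 280 * GIGABIT  # 280 gigabits, i.e. 35 gigabytes

# Hypothetical link speeds, in bits per second.
links = [
    ("100 Mbit/s", 100 * 10**6),
    ("1 Gbit/s", 10**9),
    ("10 Gbit/s", 10 * 10**9),
]

for label, bps in links:
    seconds = transfer_seconds(hour_of_hd, bps)
    print(f"{label}: {seconds:.0f} seconds ({seconds / 60:.1f} minutes)")
```

Even under these idealized assumptions, a single hour of footage ties up a fully dedicated 1 Gbit/s link for 280 seconds, and passing a lower `efficiency` value lengthens that proportionally.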
Moving large data sets at high speeds between distributed sites is a common challenge in many industries. Oil and gas companies are perhaps the poster children for moving large data sets, which they accumulate through seismic analysis and reservoir analysis. Getting the "whole picture" to make sound business decisions requires pulling large quantities of data from many different locations.
Other markets with massive data-transport requirements include the automotive industry (for computer-aided analysis and simulations), semiconductor companies (for mask layout based on instruction sets) and pharmaceutical firms (for molecular matching and chiral synthesis), to name just a few.
Getting Data Out of Complex Storage Systems
Grid pros have popularized the expression that "access to the data is as important as access to compute resources." Sometimes in enterprises, the challenge with data access -- beyond the size of data sets -- is the complexity of the protocols associated with storage systems.