IT Looks for New Tools to Exploit 'Big Data'

Companies want to mine terabytes of user- and machine-generated data to glean details about their customers' habits, activities and whereabouts.

As tools for real-time and batch analysis of so-called big data emerge, IT operations are gaining the ability to track the activities, habits and movements of customers with great precision.

Experts say many businesses want to better analyze data stored in emerging massively parallel data stores, such as those built on the open-source Apache Hadoop framework, to learn where their customers work, what they do in their off-hours and with whom they spend time.

The information could help companies tailor Web-based advertising and marketing materials to specific customers.

"[The trend] will change our existing notions of privacy. A surveillance society is not only inevitable, it's irresistible," Jeff Jonas, a distinguished engineer at IBM, said at the Structure Big Data 2011 conference in New York late last month.

The term "big data" refers to the massive amounts of information collected from machine- and human-generated computer system log files, electronic financial transactions, Web search streams, email metadata, search engine queries and social networking activity.

In 2010 alone, 1.5 zettabytes (1.5 billion TB) of such data, mostly machine-generated, were created. Companies collectively filled their data center storage systems with about 16 exabytes (16 million TB) of that data last year, said Jason Hoffman, founder and chief scientist at cloud software provider Joyent.
