A white paper released today from IDC revised the research firm's earlier estimates to show that by 2011, the amount of electronic data created and stored will grow to 10 times the 180 exabytes that existed in 2006, reflecting a compound annual growth rate of almost 60%.
By 2011, there will be 1,800 exabytes of electronic data in existence, or 1.8 zettabytes (an exabyte is equal to 1 billion gigabytes). In fact, the number of bits stored already exceeds the estimated number of stars in the universe, IDC stated. And because data is growing by a factor of 10 every five years, by 2023 the number of stored bits will surpass Avogadro's number, which is the number of atoms in 12 grams of carbon-12, or 602,200,000,000,000,000,000,000 (6.022 x 10^23).
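The growth figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming decimal (SI) units and a five-year span from 2006 to 2011; the variable names are illustrative and do not come from the IDC study.

```python
# Figures quoted in the article.
DATA_2006_EB = 180     # exabytes in existence in 2006
GROWTH_FACTOR = 10     # tenfold growth
YEARS = 5              # 2006 -> 2011 (assumed span)

# Compound annual growth rate: the constant yearly rate that
# produces GROWTH_FACTOR after YEARS years.
cagr = GROWTH_FACTOR ** (1 / YEARS) - 1

data_2011_eb = DATA_2006_EB * GROWTH_FACTOR
data_2011_zb = data_2011_eb / 1000      # assuming 1 ZB = 1,000 EB
data_2011_gb = data_2011_eb * 10**9     # 1 EB = 1 billion GB

print(f"CAGR: {cagr:.1%}")
print(f"2011 total: {data_2011_eb:,} EB = {data_2011_zb} ZB")
print(f"2011 total: {data_2011_gb:.1e} GB")
```

Tenfold growth over five years works out to roughly 58.5% a year, which matches the "almost 60%" rate cited by IDC.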
See also: "From bits to yottabytes. How much data is that?"
The study, "The Diverse and Exploding Digital Universe," also found that the number of electronic containers for that data (files, images, packets and tag contents) is growing 50% faster than the data itself. In 2011, data will be contained in more than 20 quadrillion (20 million billion) of those containers, creating a tremendous management challenge for both businesses and consumers.
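To give a sense of scale, the two 2011 figures can be combined into a rough average container size. This is an illustrative back-of-the-envelope sketch, not a figure reported by IDC; it assumes decimal units and treats "more than 20 quadrillion" as exactly 20 quadrillion, so the result is an upper bound on the average.

```python
# Figures quoted in the article for 2011.
TOTAL_BYTES_2011 = 1_800 * 10**18   # 1,800 EB, assuming 1 EB = 10**18 bytes
CONTAINERS_2011 = 20 * 10**15       # 20 quadrillion containers (lower bound)

# "20 quadrillion" and "20 million billion" are the same number.
same_number = CONTAINERS_2011 == 20 * 10**6 * 10**9

# Average bytes per container; an upper bound, because the container
# count is described as "more than" 20 quadrillion.
avg_bytes = TOTAL_BYTES_2011 / CONTAINERS_2011

print(f"Same number: {same_number}")
print(f"Average container size: at most about {avg_bytes / 1000:.0f} KB")
```

Under these assumptions the average container holds no more than about 90 KB, which suggests why the count of items to manage, rather than raw volume alone, is presented as the challenge.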
The study updates the forecast in a white paper released last March.
Along with the new white paper, which was sponsored by EMC Corp., IDC created a "Personal Digital Footprint Calculator." It resides on the same Web page as EMC's worldwide digital growth-tracking ticker, which counts the amount of data created second by second.
"Obviously, none of us will relate to a number with 21 zeros at the end. The way that the ticker is practically blurred at the far end we hope will give people an idea of the pace — of how fast this is happening. In a way, this is the message of the study," John F. Gantz, IDC's chief research officer, said in a recorded interview.
IDC also acknowledged that it had underestimated earlier digital data figures for 2007, saying the actual amount of data, 281 exabytes, is 10% greater than it had forecast in the first "Digital Universe" study (see "A zettabyte by 2010: Corporate data grows fiftyfold in three years"). IDC attributed the bigger numbers to faster growth in digital cameras and televisions, as well as a better understanding of data replication.
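The size of that revision can be backed out from the two numbers given. The sketch below is a rough calculation, assuming "10% greater" means the actual figure equals 1.10 times the earlier forecast; the implied forecast is derived here and is not a number quoted from IDC.

```python
# Figures quoted in the article.
ACTUAL_2007_EB = 281    # revised 2007 total, in exabytes
UNDERESTIMATE = 0.10    # actual is 10% greater than the earlier forecast

# Assumption: actual = forecast * (1 + UNDERESTIMATE).
implied_forecast_eb = ACTUAL_2007_EB / (1 + UNDERESTIMATE)
gap_eb = ACTUAL_2007_EB - implied_forecast_eb

print(f"Implied earlier forecast: about {implied_forecast_eb:.0f} EB")
print(f"Gap: about {gap_eb:.0f} EB")
```

On that reading, the earlier forecast would have been roughly 255 exabytes, about 26 exabytes short of the revised total.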
The digital data producing this massive growth runs the gamut from 6GB movies on DVD to 128-bit signals from RFID tags.
Less than half of the digital data being created by individuals can be accounted for by user activities, such as photos, phone calls and e-mails. The majority of the data is made up of what IDC calls digital "shadows," including surveillance photos, Web search histories, financial transaction journals and mailing lists.
Gantz said the tremendous data growth will directly affect CIOs, even if the data is not being created within the company's four walls. A CIO in a data center that has been crunching numbers in IBM format for 40 years "may be surprised what will happen" when all of a sudden all the voice calls in the enterprise are stored on disk and all the video cameras that used to be closed-circuit TVs are now digital surveillance cameras that force the CIO to deal with video files, Gantz said.
Also, a company will still be forced to protect much of the data that consumers create outside its four walls. "At some point in the life of every file, or bit or packet, 85% of that information somewhere goes through a corporate computer, Web site, network or asset," Gantz said. "The corporation at that point in time has responsibility for that information. This is something that can be lost on the CIOs of the enterprises."
"It's not until Viacom sues Google for a billion dollars for the information that its customers are putting on these basically rented electronic storage space[s], and all of a sudden Google has got to fight a lawsuit," Gantz said. "Whether they win or lose, it's hundreds of millions of dollars in lawyer fees. So the enterprise is picking up responsibility for information that's not actually created at the hub of the enterprise."