Cerf sees a problem: Today's digital data could be gone tomorrow
A disk with its data may survive, but the ability to understand it may be lost
Computerworld - WASHINGTON -- One of the computer scientists who turned on the Internet in 1983, Vinton Cerf, is concerned that much of the data created since then, and for years still to come, will be lost to time.
Cerf warned that digital things created today -- spreadsheets, documents, presentations as well as mountains of scientific data -- won't be readable in the years and centuries ahead.
Cerf illustrated the problem in a simple way. He runs Microsoft Office 2011 on Macintosh, but it cannot read a 1997 PowerPoint file. "It doesn't know what it is," he said.
"I'm not blaming Microsoft," said Cerf, who is Google's vice president and chief Internet evangelist. "What I'm saying is that backward compatibility is very hard to preserve over very long periods of time."
The data objects are only meaningful if the application software is available to interpret them, Cerf said. "We won't lose the disk, but we may lose the ability to understand the disk."
It's not just PowerPoint slides either, he said. The scientific community collects large amounts of data from simulations and instrument readings. But unless the metadata survives, which will tell under what conditions the data was collected, how the instruments were calibrated, and the correct interpretation of units, the information may be lost.
"If you don't preserve all the extra metadata, you won't know what the data means. So years from now, when you have a new theory, you won't be able to go back and look at the older data," said Cerf, who spoke about the issue at the CW Honors awards program Monday and in an interview afterward.
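As a rough illustration of the point about metadata, the sketch below stores instrument readings together with the information needed to interpret them, and refuses to read back a record whose metadata is incomplete. It is a minimal sketch, not a method Cerf described: the function names, the JSON layout and the three required fields (units, calibration, conditions) are assumptions chosen to mirror the items listed above.

```python
import json

# Hypothetical set of fields a reader needs before the numbers mean anything.
REQUIRED_METADATA = ("units", "calibration", "conditions")


def pack_readings(values, metadata):
    """Bundle raw readings with the metadata needed to interpret them."""
    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        raise ValueError(f"missing metadata: {missing}")
    # Plain-text JSON keeps the record readable without the original software.
    return json.dumps({"metadata": metadata, "values": values}, sort_keys=True)


def unpack_readings(text):
    """Return (values, metadata); reject records with incomplete metadata."""
    record = json.loads(text)
    metadata = record.get("metadata", {})
    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        raise ValueError(f"cannot interpret data, missing metadata: {missing}")
    return record["values"], metadata
```

A record written with pack_readings carries its units, calibration notes and collection conditions alongside the values, so a later reader does not depend on a separate file or application to know what the numbers mean.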
What's needed, Cerf said, is a "digital vellum," a means as durable and long-lasting as the material that has successfully preserved written content for more than 1,000 years.
Ensuring that people in future centuries have access to this data is "a hard problem," he said.
Cerf and fellow computer scientist Robert Kahn developed the TCP/IP protocol used for the Internet. The full switchover to that protocol occurred in 1983.
If a company goes out of business and there is no provision for its software to become accessible to others, all the products running that software may become inaccessible, Cerf said. "There are hard, complicated technical and legal problems that will have to be resolved."
The problem is recognized and there are efforts internationally to address it. Cerf said he's been in meetings about this issue attended by 400 people.
"It may be that the cloud computing environment will help a lot. It may be able to emulate older hardware on which we can run operating systems and applications," he said.
To give the issue context, Cerf talked about the effort it took for historian Doris Kearns Goodwin to produce her book about President Lincoln's administration, Team of Rivals. She went to more than 100 libraries and found the correspondence written by Lincoln's cabinet and used it to reproduce what its members would have been talking about.
For the future, "we need a digital vellum that will preserve not only the bits, but a way of interpreting them as well," Cerf said.
This article, Cerf sees a problem: Today's digital data could be gone tomorrow, was originally published at Computerworld.com.
Patrick Thibodeau covers cloud computing and enterprise applications, outsourcing, government IT policies, data centers and IT workforce issues for Computerworld. Follow Patrick on Twitter at @DCgov or subscribe to Patrick's RSS feed. His e-mail address is email@example.com.