Scott Hicar has a big-data problem. Every year, the DigitalGlobe CIO must find room for another petabyte or two of the imagery that streams from his company's three Earth-imaging satellites. And the scope of that task is going to grow: A fourth satellite is under construction. Processing that data is a massive challenge, as is the task of delivering ever larger volumes to customers, who use DigitalGlobe images for everything from assessing storm damage to providing location-based mobile services. But it's worth the effort: DigitalGlobe helped emergency teams take stock of the devastation in Japan after the March 2011 earthquake and tsunami -- an undertaking that earned the company recognition as a 2012 Computerworld Honors laureate. Here, Hicar talks about how big data is becoming an "immovable object" and why DigitalGlobe is betting on the cloud.
What challenges did you face in processing and distributing data to Japanese emergency responders after the earthquake and tsunami? We worked with Hitachi Defense and targeted our satellites. First responders could see an impact assessment within two hours of the event so they could start to plan. Two years ago, that capability was possible [only] for special circumstances like that event; otherwise, our average load times were in the 12-hour time frame. Now we can take an image from a satellite, downlink it to a ground station, get it back here to Colorado, process it and have it available online in our cloud in under two hours, on average.
What role does IT play in DigitalGlobe's business model? As the leading provider of Earth imagery content and analysis, we are a digital company creating digital products, so IT has a very broad role. We have a network of ground terminals around the world that take information from our satellites, and we provide the telecommunications infrastructure to get that massive amount of content here to our processing facility. We receive 2 to 2.5 petabytes of raw imagery a year from our satellites, we have about 8 petabytes of fresh content on spinning disk, and we have an archive of over 2 billion square kilometers of imagery that represents seven or eight years of active collections of the planet.
We decompress that, process it, and ship out 15 to 20 petabytes of imagery per year to our customers. Most recently, we're putting it into our geospatial cloud to make it available on a near-real-time basis.
What's involved in processing that data? We stretch the images across an elevation model of the Earth to make sure that they're accurate. We cancel out the angle of the camera that the image was taken from; we cancel out the bumpiness of the Earth; we correct for the wobble associated with the gravitational pull of the moon and what that pull means to the tides. All of those things need to be corrected. It's a massive computational problem to create accurate imagery.
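The dominant correction Hicar describes -- canceling out the camera angle against the bumpiness of the Earth -- can be illustrated with the first-order terrain-displacement formula used in orthorectification. This is a simplified sketch, not DigitalGlobe's actual pipeline, and the elevation and angle values below are hypothetical:

```python
import math

def terrain_displacement(elevation_m: float, off_nadir_deg: float) -> float:
    """Approximate horizontal displacement (in meters) of a ground point.

    A point sitting elevation_m above the reference surface, imaged by a
    camera tilted off_nadir_deg away from straight down, appears shifted
    horizontally by roughly elevation * tan(angle). Orthorectification
    uses an elevation model to shift each pixel back by this amount.
    """
    return elevation_m * math.tan(math.radians(off_nadir_deg))

# A 1,000 m peak imaged 20 degrees off nadir lands ~364 m out of place,
# while a point at the reference elevation needs no correction at all.
print(round(terrain_displacement(1000.0, 20.0)))  # 364
print(terrain_displacement(0.0, 20.0))            # 0.0
```

This is why the problem is so computationally heavy: every pixel in a multi-gigapixel scene gets its own correction, driven by its own elevation sample.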
What role does analytics play in your business and IT operations? We use analytics to keep our images current, to determine where we need to refresh next. Cities with a high rate of change are high on this list because our customers use our imagery to reset car navigation systems.
What unique challenges does IT face in meeting the needs of your industry and business? It's the big-data problem. We process massive amounts of imagery, and we need to meet very exact timelines. So the challenge is how to create a flexible computing model that lets us create the right product for the customer. We created our own high-performance compute cluster, and we are actively harnessing the power of graphics processing units to do this. By moving processing from CPUs to GPUs, we have seen anywhere from a 10- to 20-times improvement in speed.
Today we have somewhere around 20 petabytes of data between spinning disk and tape. In the future, customers will no longer be able to [receive] all of the data they need to answer a problem. The data is becoming an immovable object. So we're going to create in our cloud an opportunity for customers to bring us their computations, which we can then run on our high-performance computing environment. It will be easier to bring the computations to the data than to bring the data to the computations.
Can you discuss some of the mobile projects you're exploring? We have a customer that provides location-based services through an online navigation portal, and they want to make our images accessible to their mobile customers. If you're looking at imagery through a phone or online, it comes back in little squares, or tiles. To create this ability to tile out the entire globe and serve it to mobile customers requires somewhere in the neighborhood of 42 billion tiles. We put that into our high-performance computing cluster, and we are in the process of finishing up delivery of those 42 billion tiles.
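The 42 billion figure is easy to sanity-check if you assume the standard web-map tiling scheme, where the globe is one tile at zoom 0 and every zoom level splits each tile into four. The zoom depths below are illustrative assumptions, not details Hicar gives:

```python
def tiles_at_zoom(z: int) -> int:
    """Number of tiles covering the globe at zoom level z (4 per parent tile)."""
    return 4 ** z

def total_tiles(max_zoom: int) -> int:
    """Tiles in a full pyramid from zoom 0 through max_zoom.

    Geometric series: 1 + 4 + 16 + ... + 4^max_zoom = (4^(max_zoom+1) - 1) / 3.
    """
    return (4 ** (max_zoom + 1) - 1) // 3

print(tiles_at_zoom(18))  # 68,719,476,736 -- zoom 18 alone is ~69 billion tiles
print(total_tiles(17))    # 22,906,492,245 -- a full pyramid to zoom 17 is ~23 billion
```

Because counts quadruple at every level, a pyramid that goes one zoom level deeper over land than over ocean lands squarely in the tens of billions of tiles, which is consistent with the number Hicar cites.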
What do you find most frustrating about your industry? Our information, our content and our analytics can solve business problems that are very broad-based across many industries. But other corporations don't understand how much value can be generated from satellite imagery, how many problems can be solved in a way that's more cost-effective than a traditional solution for gathering information. We're watching and waiting for that to emerge. We see leading innovators using these solutions. But we're not yet in what I'd call the middle of the market in some of these areas.