The Grill: DigitalGlobe CIO Scott Hicar on the challenges and opportunities of Earth imagery
DigitalGlobe's CIO explains how Earth imagery can solve a multitude of business problems.
Scott Hicar has a big-data problem. Every year, the DigitalGlobe CIO must find room for another petabyte or two of the imagery that streams from his company's three Earth-imaging satellites. And the scope of that task is going to grow: A fourth satellite is under construction. Processing that data is a massive challenge, as is the task of delivering ever larger volumes to customers, who use DigitalGlobe images for everything from assessing storm damage to providing location-based mobile services. But it's worth the effort: DigitalGlobe helped emergency teams take stock of the devastation in Japan after the March 2011 earthquake and tsunami -- an undertaking that earned the company recognition as a 2012 Computerworld Honors laureate. Here, Hicar talks about how big data is becoming an "immovable object" and why DigitalGlobe is betting on the cloud.
What challenges did you face in processing and distributing data to Japanese emergency responders after the earthquake and tsunami? We worked with Hitachi Defense and targeted our satellites. First responders could see an impact assessment within two hours of the event so they could start to plan. Two years ago, that capability was possible [only] for special circumstances like that event; otherwise, our average load times were in the 12-hour time frame. Now we can take an image from a satellite, downlink it to a ground station, get it back here to Colorado, process it and have it available online in our cloud in under two hours, on average.
What role does IT play in DigitalGlobe's business model? As the leading provider of Earth imagery content and analysis, we are a digital company creating digital products, so IT has a very broad role. We have a network of ground terminals around the world that take information from our satellites, and we provide the telecommunications infrastructure to get that massive amount of content here to our processing facility. We receive 2 to 2.5 petabytes of raw imagery a year from our satellites, we have about 8 petabytes of fresh content on spinning disk, and we have an archive of over 2 billion square kilometers of imagery that represents seven or eight years of active collections of the planet.
We decompress that, process it, and ship out 15 to 20 petabytes of imagery per year to our customers. Most recently, we're putting it into our geospatial cloud to make it available in much closer to real time.
What's involved in processing that data? We stretch the images across an elevation model of the Earth to make sure that they're accurate. We cancel out the angle of the camera that the image was taken from; we cancel out the bumpiness of the Earth; we correct for the wobble associated with the gravitational pull of the moon and what that pull means to the tides. All of those things need to be corrected. It's a massive computational problem to create accurate imagery.
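The terrain correction Hicar describes is broadly known as orthorectification. A toy sketch of the core relief-displacement idea, under a simplified flat-Earth model (the real pipeline uses rigorous sensor and elevation models, and these function names are illustrative):

```python
import math

def relief_displacement_m(elevation_m: float, off_nadir_deg: float) -> float:
    """Approximate horizontal ground shift of a pixel caused by terrain
    elevation when imaging at an off-nadir camera angle (flat-Earth toy
    model): a point raised above the reference surface appears displaced
    away from nadir by roughly elevation * tan(angle)."""
    return elevation_m * math.tan(math.radians(off_nadir_deg))

def orthorectify_x(apparent_x_m: float, elevation_m: float,
                   off_nadir_deg: float) -> float:
    """Correct an apparent ground coordinate by removing the relief
    displacement -- 'canceling out' camera angle and terrain bumpiness."""
    return apparent_x_m - relief_displacement_m(elevation_m, off_nadir_deg)

# A 500 m hill imaged at 20 degrees off-nadir appears shifted ~182 m,
# which is why an elevation model is needed to make imagery accurate.
shift = relief_displacement_m(500.0, 20.0)
```

Done per-pixel across an entire scene, against a full elevation model, this is the "massive computational problem" Hicar refers to.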
What role does analytics play in your business and IT operations? We use analytics to keep our images current, to determine where we need to refresh next. Cities with a high rate of change are high on this list because our customers use our imagery to reset car navigation systems.
What unique challenges does IT face in meeting the needs of your industry and business? It's the big-data problem. We process massive amounts of imagery, and we need to meet very exact timelines. So the challenge is how to create a flexible computing model that lets us create the right product for the customer. We created our own high-performance compute cluster, and we are actively harnessing the power of graphics processing units to do this. By moving processing from CPUs to GPUs, we have seen anywhere from a 10- to 20-times improvement in speed.
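A back-of-the-envelope sketch of what the quoted 10x to 20x GPU speedup means for turnaround time (the 12-hour baseline comes from the earlier answer; the function name is illustrative):

```python
def gpu_latency_hours(cpu_hours: float, speedup: float) -> float:
    """Projected wall-clock time after moving a CPU-bound imaging
    workload onto GPUs with a given speedup factor."""
    return cpu_hours / speedup

# At 10x-20x, a 12-hour CPU processing job drops to roughly
# 0.6-1.2 hours -- in line with the sub-two-hour turnaround
# described for the Japan response.
low_end = gpu_latency_hours(12.0, 10.0)   # 1.2 hours
high_end = gpu_latency_hours(12.0, 20.0)  # 0.6 hours
```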
Today we have somewhere around 20 petabytes of data between spinning disk and tape. In the future, customers will no longer be able to [receive] all of the data they need to answer a problem. The data is becoming an immovable object. So we're going to create in our cloud an opportunity for customers to bring us their computations, which we can then run on our high-performance computing environment. It will be easier to bring the computations to the data than to bring the data to the computations.
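The "bring the computations to the data" model Hicar sketches can be illustrated with a minimal, hypothetical API. Everything here -- the catalog, field names, and `run_at_data` -- is an illustrative assumption, not DigitalGlobe's actual service:

```python
from typing import Callable, Iterable

# Hypothetical in-cloud scene catalog; in reality this stands in for
# petabytes of imagery that never leave the provider's storage.
SCENES = [
    {"id": "scene-001", "cloud_cover": 0.05, "area_km2": 270.0},
    {"id": "scene-002", "cloud_cover": 0.40, "area_km2": 270.0},
    {"id": "scene-003", "cloud_cover": 0.10, "area_km2": 270.0},
]

def run_at_data(computation: Callable[[Iterable[dict]], object]) -> object:
    """Execute a customer-supplied computation next to the archive and
    return only the small result, instead of shipping raw data out."""
    return computation(SCENES)

# The customer ships a function and gets back an answer, not pixels:
# here, total area of scenes with under 20% cloud cover.
clear_area = run_at_data(
    lambda scenes: sum(s["area_km2"] for s in scenes if s["cloud_cover"] < 0.2)
)
```

The design point is that the result crossing the network is a few bytes, while the 20 petabytes stay put.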
Can you discuss some of the mobile projects you're exploring? We have a customer that provides location-based services through an online navigation portal, and they want to make our images accessible to their mobile customers. If you're looking at imagery through a phone or online, it comes back in little squares, or tiles. To create this ability to tile out the entire globe and serve it to mobile customers requires somewhere in the neighborhood of 42 billion tiles. We put that into our high-performance computing cluster, and we are in the process of finishing up delivery of those 42 billion tiles.
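Tile counts like 42 billion follow from the standard web-map quadtree, where zoom level z holds 4**z tiles globally. A quick sketch of that arithmetic (the exact zoom depth DigitalGlobe delivered isn't stated, so the levels below are illustrative):

```python
def tiles_at_zoom(z: int) -> int:
    """A global quadtree tiling has 4**z tiles at zoom level z:
    each tile splits into four children at the next level."""
    return 4 ** z

def tiles_through_zoom(z: int) -> int:
    """Cumulative tiles for a full pyramid, levels 0..z, via the
    geometric series sum (4**(z+1) - 1) / 3."""
    return (4 ** (z + 1) - 1) // 3

# Full pyramids grow fast: ~23 billion tiles through zoom 17 and
# ~92 billion through zoom 18, so a 42-billion-tile delivery suggests
# partial coverage (e.g., land only) at the deepest levels.
through_17 = tiles_through_zoom(17)  # 22,906,492,245
through_18 = tiles_through_zoom(18)  # 91,625,968,981
```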
What do you find most frustrating about your industry? Our information, our content and our analytics can solve business problems that are very broad-based across many industries. But other corporations don't understand how much value can be generated from satellite imagery, how many problems can be solved in a way that's more cost-effective than a traditional solution for gathering information. We're watching and waiting for that to emerge. We see leading innovators using these solutions. But we're not yet in what I'd call the middle of the market in some of these areas.