Big data is nothing new to Quicken Loans. The nation's largest online retail mortgage lender is accustomed to storing and analyzing data on more than 1.5 million clients and on home loans valued at $70 billion in 2012.
But the big data landscape got a little more interesting for the Detroit-based company about three years ago.
"We were starting to focus on big data derived from social media -- Twitter, Facebook, Web tracking, Web chats" -- a massive amount of unstructured data, explains CIO Linglong He. "How to store that data is important because it has an impact on strategy -- not just in storage and architecture strategy, but how to synchronize [that with structured data] and make it more impactful for the company," she says.
Quicken Loans already had a scale-out strategy using a centralized storage area network to manage growth. But it needed more for big data storage -- not just scalable storage space, but compute power close to where the data resides. The solution: scale-out nodes on a Hadoop framework.
"We can leverage the individual nodes, servers, CPU, memory and RAM, so it's very fast for computations," He says, "and from cost, performance and growth standpoints, it is much more impactful for us."
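The idea of putting compute next to the data can be illustrated with a toy sketch. The Python below is not Hadoop's API, and the node names and records are hypothetical. It only assumes that each node summarizes the records it stores locally, and that only those small summaries are then combined.

```python
from collections import Counter
from functools import reduce

# Hypothetical shards: each list stands in for records held on one node's local drives.
node_shards = {
    "node-1": ["twitter", "web_chat", "twitter"],
    "node-2": ["facebook", "twitter"],
    "node-3": ["web_chat", "facebook", "facebook"],
}

def local_count(records):
    """Map step: runs where the records live and returns a small per-node summary."""
    return Counter(records)

def merge_counts(left, right):
    """Reduce step: combines two per-node summaries into one."""
    return left + right

# Each node processes only its own shard; only the summaries are merged.
partials = [local_count(records) for records in node_shards.values()]
totals = reduce(merge_counts, partials, Counter())
print(dict(totals))
```

In this sketch the raw records never leave their shard. Adding a node adds both storage and the processing capacity to work on it, which is the property He describes.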
Move over, storage giants, and make way for the new paradigm in enterprise big data storage -- where storage is cheaper and computing power and storage power go hand in hand.
Data at Warp Speed
When it comes to big data, "storage is no longer considered to be a monolithic silo that's proprietary and closed in nature," says Ashish Nadkarni, an analyst at IDC. "A lot of these storage systems are now being deployed using servers with internal drives. It's almost like Facebook or Google models where storage is deployed using internal drives in servers. Some servers have up to 48 drives in them, and the storage platform itself is all software-driven. It's all done using general-purpose operating systems with a software core written on top of it."
Indeed, in the era of big data, companies are gathering information at warp speed, and traditional storage strategies can't keep up.
Stored data is growing at 35% per year, according to Boston-based Aberdeen Group. That means IT departments have to double their storage capacity every 24 to 30 months. "Today, an average of 13% of [the money in] IT budgets [is] being spent on storage," says Aberdeen analyst Dick Csaplar. "Two and a half years from now, it will be 26%, and then 52%. Pretty soon, this ratchets out of control, so you can't keep doing the same things over and over." And while it's true that storage costs are declining, he contends that they're not decreasing quickly enough to offset the need to spend more on storage as the amount of data grows.
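The arithmetic behind those figures can be checked with a short calculation. The sketch below assumes constant 35% compound growth. For the budget projection it further assumes, as a simplification, that the total IT budget and the cost per unit of storage stay flat, so the storage share doubles each time capacity does. The function names are illustrative only.

```python
import math

def doubling_time_years(annual_growth_rate):
    """Years for a quantity to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)

def budget_share_after_doublings(initial_share_pct, doublings):
    """Storage share of the IT budget after a number of capacity doublings.

    Simplifying assumption: total budget and per-unit storage cost stay flat.
    """
    return initial_share_pct * 2 ** doublings

years = doubling_time_years(0.35)
months = years * 12
print(f"Doubling time: {years:.2f} years ({months:.1f} months)")

for n in range(3):
    share = budget_share_after_doublings(13, n)
    print(f"After {n} doubling(s): {share}% of IT budget")
```

At 35% annual growth the doubling time works out to roughly 28 months, inside the 24-to-30-month range cited. The 13%, 26%, 52% progression is what follows if nothing else changes. Falling storage prices would slow it, but by Csaplar's account not enough to offset the growth.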
The deluge of unstructured data continues to grow as well. "The tough challenge, which everyone is trying to solve, is unstructured data that's coming off documents that you wouldn't have expected to have to mine for information," says Vince Campisi, CIO at GE Software, a unit launched in 2011 that connects machines, big data and people to facilitate data analysis. "The traditional BI principles in concept and form still hold true, but the intensity of how much information is coming at you is much higher than the daily transactions in systems running your business."