Building a big data ready organization

As “big data” continues to rise, organizations face a task that they may not have anticipated and may not be prepared for. Years ago, the amount of data stored by the average non-tech company (banks, insurance firms, hospitals, etc.) was at manageable levels.

There was no real sense of urgency, and neither storing this data nor using it properly was an issue. On the opposite end of the spectrum, technology-driven companies such as Google, Yahoo, and Amazon have been dealing with large amounts of data for years and have made data science a staple of their organizations. They have even dedicated entire departments to it.

Today, it is not only those tech companies but also non-tech industries that are being forced to prepare for big data and to make data a major focus of the organization. This raises the question: how does an organization become big data ready?

Understand the value of big data

One reason some companies have lagged behind and aren’t equipped for big data is that they don’t view data science as a necessity. In other words, they fail to see the value in big data. Many know the importance of data science as it relates to marketing efforts such as segmenting customers, monitoring customers’ online behavior, product development, etc. However, companies must start realizing that every part of a business, and every top-level executive, has a stake in preparing a company for big data.

Big data analytics spans forecasting (in all departments), pricing, company website security, risk management, sales funnel analysis, and much more than you might expect. You can take a look at these MongoDB big data use cases to get an even better picture of why big data is so important for any industry and all company departments. Realizing the overall effect that big data has on a company is the first step towards becoming big data ready.

Establish a strong data platform

A company’s big data solutions will all rely on the data platform it implements, as explained in this article by Greenplum. A data platform does more than simply house data. When choosing or developing a data platform, there are several defining characteristics the platform should have, including:

• Processing data from multiple sources: Big data arrives from many environments, including but not limited to messaging systems and mainframes.
• Supporting quality data and governance of data: As stated earlier, companies are going to rely heavily upon big data, so it is very important that they can trust the information being stored.
• Prioritizing data: The practice of integrating a company’s master data management system with big data, in order to determine what data being processed is relevant to a company’s needs, is an emerging trend. As big data continues to grow, the need to separate the useful data from useless data will become an even bigger requirement of data platforms.
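The three characteristics above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Record` type, the `is_valid` and `is_relevant` checks, and the idea of representing master data as a set of known customer IDs. A real platform would apply far richer quality and governance rules.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Record:
    """Hypothetical record shape shared by every source."""
    source: str
    customer_id: Optional[str]
    payload: dict


def is_valid(record: Record) -> bool:
    # Assumed quality rule: a record needs an ID and a non-empty payload.
    return bool(record.customer_id) and bool(record.payload)


def is_relevant(record: Record, master_ids: Set[str]) -> bool:
    # Assumed prioritization rule: keep only customers known to master data.
    return record.customer_id in master_ids


def prepare(sources: Iterable[Iterable[Record]], master_ids: Set[str]) -> List[Record]:
    """Merge several sources, dropping untrusted or irrelevant records."""
    kept = []
    for source in sources:
        for record in source:
            if is_valid(record) and is_relevant(record, master_ids):
                kept.append(record)
    return kept


# Two illustrative sources: a messaging system and a mainframe export.
messaging = [
    Record("messaging", "c1", {"event": "login"}),
    Record("messaging", None, {"event": "click"}),  # fails the quality check
]
mainframe = [
    Record("mainframe", "c2", {"balance": 100}),
    Record("mainframe", "c9", {"balance": 5}),      # not in master data
]
master_ids = {"c1", "c2"}

kept = prepare([messaging, mainframe], master_ids)
print([(r.source, r.customer_id) for r in kept])
```

Keeping the quality check and the relevance check as separate functions mirrors the distinction the list draws between trusting data and prioritizing it.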

Make the switch to NoSQL databases

Setting aside the heated debate over relational vs. non-relational databases, many CIOs are starting to accept that NoSQL is the best way to handle big data. Big data does not simply refer to the volume of data being stored today, but also the velocity with which it becomes available. NoSQL databases are simply better equipped to handle this combination of volume and velocity than traditional relational databases.

NoSQL databases such as MongoDB are designed with dynamic schemas in order to adjust to the constant changes that big data creates for databases. In addition, NoSQL provides the scalability needed for companies getting ready for big data. Relational databases were not designed with scalability as a core focus, so expansion and growth require companies to invest in new infrastructure and resources. NoSQL, on the other hand, is designed to scale out (horizontally across more machines), not up (vertically on a bigger machine).
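To make the two ideas in this section concrete, here is a small, self-contained sketch in plain Python. It is not MongoDB's API or its actual sharding mechanism: the `ShardedCollection` class, its methods, and the hash-based routing are assumptions made up for illustration. It shows documents with different fields living side by side (a dynamic schema) and being spread across several partitions (scaling out).

```python
import hashlib


class ShardedCollection:
    """Toy in-memory store: schemaless documents spread over several shards."""

    def __init__(self, shard_count):
        # Each shard stands in for a separate server in a real deployment.
        self.shards = [[] for _ in range(shard_count)]

    def _shard_for(self, key):
        # Deterministic hash so the same key always routes to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def insert(self, document):
        # No schema check: any dict with an "_id" is accepted as-is.
        index = self._shard_for(document["_id"])
        self.shards[index].append(document)
        return index

    def find(self, **criteria):
        # Query every shard; documents lacking a field simply do not match.
        return [
            doc
            for shard in self.shards
            for doc in shard
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


collection = ShardedCollection(shard_count=3)

# Documents with different shapes coexist without a schema migration.
collection.insert({"_id": "order-1", "channel": "web", "total": 40})
collection.insert({"_id": "order-2", "channel": "store"})
collection.insert({"_id": "order-3", "channel": "web", "coupon": "SPRING"})

web_orders = collection.find(channel="web")
print(sorted(doc["_id"] for doc in web_orders))
```

In this sketch, adding capacity means raising `shard_count`, although a real system would also have to move existing documents when the number of shards changes, which this toy version ignores.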

Companies need to start getting ready for big data and structuring their business models to incorporate data science into every department. It is becoming clear that the companies that adopt these principles will flourish, and those that don’t will be at a significant disadvantage.

Matt Asay is vice-president of corporate strategy at 10gen, the company behind the MongoDB NoSQL database. With more than a decade spent in open source, Matt is a recognized open source advocate and board member emeritus of the Open Source Initiative (OSI).

Copyright © 2013 IDG Communications, Inc.
