Dark data: When should it come into the light?

Dark data, one of the latest trending discussions in data and analytics, is currently defined as data that a business creates and saves but does not use to run the business. I’d like to offer an addition to that definition: dark data should also include data that a business creates but does not currently save.

The question many people are asking is: What should be done with dark data? Some say data should never be thrown away, because storage is so cheap and the data may have a purpose in the future.

While analytics is what I do, I am an economist by training and at heart. So, here are my thoughts about dark data from a business perspective.

  1. Dark data is something businesses need to think about strategically, including what to do with it. I would not start by analyzing the data. Instead, I would begin by understanding what type of information in the data could serve a strategic purpose or offer competitive advantage.
  2. Once an understanding of the data is established, consider two issues. First, whether your organization would be likely to find any use for that data over the course of the next one to five years. Second, whether any other company, individual or entity might find a use for it. You then need to ask how useful they might find it, whether they would be willing to pay for it and how much it may be worth to them.
  3. If you identify a use for the data, either for internal purposes or for external opportunities, try to ascertain the value of that use, over a relevant time period.
  4. After understanding the value of the data’s potential uses, start to understand the structure of the data and what it would take, in terms of cost and effort, to get the data into usable form.
  5. Here is where my economics comes in. Once the expected value of the data and the costs of bringing the dark data to light are understood, a simple cost-benefit analysis enables a business to decide what to do with its dark data.
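
For readers who like to see the arithmetic, the cost-benefit step in the list above can be sketched in a few lines of Python. This is only one possible framing, not a prescribed method: the class name, the field names, the use of a discount rate and a probability of use, and every figure in the example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DarkDataCase:
    """Hypothetical inputs for one dark data source; all figures are illustrative."""
    annual_value: float        # estimated yearly value of the identified use
    probability_of_use: float  # assumed chance the use materializes (0 to 1)
    upfront_cost: float        # one-time cost to get the data into usable form
    annual_cost: float         # yearly storage and maintenance cost
    years: int                 # relevant time period, e.g. one to five years
    discount_rate: float       # assumed annual discount rate


def net_present_value(case: DarkDataCase) -> float:
    """Discounted expected net benefit over the period, minus the upfront cost."""
    expected_net = case.annual_value * case.probability_of_use - case.annual_cost
    npv = -case.upfront_cost
    for year in range(1, case.years + 1):
        npv += expected_net / (1 + case.discount_rate) ** year
    return npv


def decide(case: DarkDataCase) -> str:
    """Simple rule: bring the data to light only if the NPV is positive."""
    return "bring to light" if net_present_value(case) > 0 else "leave dark"


# Made-up example: a data source with a three-year horizon.
example = DarkDataCase(
    annual_value=200_000,
    probability_of_use=0.5,
    upfront_cost=150_000,
    annual_cost=20_000,
    years=3,
    discount_rate=0.10,
)
print(decide(example), round(net_present_value(example)))
```

The same calculation can be run once for internal uses and once for external buyers, with the two results compared or added, depending on whether the uses are exclusive.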

So, dark data, unlike dark matter, can be brought to light and so can its potential ROI. And what’s more, a simple way of thinking about what to do with the data, through a cost-benefit analysis, can remove the complexity surrounding the previously mysterious dark data.

It’s also very interesting to think about how to value dark data. 

Years ago, people collected data that they probably thought was not very useful. For instance, when you contact a call center, you routinely hear that your call “may be recorded for training purposes,” and it’s a training tool that has some value, I am sure. But what about taking this a step further and capturing a caller’s voice, not for training, but for analysis, to detect emotions? While a caller is waiting in a queue for someone to answer, and murmuring to themselves, it’s possible to detect their emotional state and to use that analysis to customize treatments and interactions for them. When the service representative finally gets on the call, the outputs from the voice analysis can help them serve the customer better and increase the probability of retaining them.

Items that used to be dark, like the voice patterns mentioned above, may have significant value when brought to light.  So, when thinking about what potentially valuable dark data may exist in your organization, or even in another organization, it pays to think as broadly as possible. Before conducting a cost-benefit analysis about dark data, think boldly about what can and cannot be done.
