Stop drowning your data scientists in drudgery

Many waste time explaining what's already known instead of discovering what's new


Can intelligent systems help data scientists? I recognize this is a massive, far-reaching question, but it’s one we need to consider, and it inspired last week’s gathering of executives from around the world. O’Reilly’s Next: Economy conference, an event dedicated to addressing the transformation of work and business in the digital age, was an opportunity to closely examine the specific business models and industries being rapidly upended.

I was invited to speak on how intelligent systems like natural-language generation will augment our work and which roles will evolve accordingly. My presentation garnered a range of reactions as I zeroed in on the data scientist and pointed out that there are many aspects of this job that can be automated.

The future of data science

I love the idea of the data scientist. There is no question that we need more of them in the private and public domains.  

Companies’ data repositories have been growing at an ever-increasing rate; government entities at every level are gathering performance data to better assess the state of their territories; and finally, the emerging world of the Internet of Things (IoT) promises a flow of data from every object. The day is coming when our cars, our toasters and even our toothbrushes will start producing data.

This is all good. While some may see this as “drowning in data,” I see it as building up a wealth of information that will serve us well in the future. But that leads to a question: Who is going to analyze the data and identify what it means for our businesses, our government and our lives?

Businesses and data

A common problem facing businesses is that the IT teams know how to deal with data but are often somewhat distanced from business goals. Likewise, people on the business side know the goals but often lack the technical skills to know what algorithmic approaches make sense for related but different problem sets. So the technologists know how to do it, but not what needs to be done; and the business owners know what needs to be done, but not how to do it.

Which brings us to the current solution: the data scientist. The data scientist, a person who understands both the goals and the technology, can translate business problems into algorithmic solutions. This is why everyone feels like they need to hire an army of them.

But let’s pause for a second and understand what a data scientist truly does.

In a great Quora answer to the question, “What is a Data Scientist?”, Jeffrey Wong, a data scientist at Netflix, said:

A data scientist is usually deeply involved in only a few projects at a time.  We aspire to be very precise and methodical because we typically work on deep problems that require a lot of investigation.

From Jeffrey’s point of view, data scientists are providing the thinking needed to recommend products for Netflix users based on core demographics, location and transactional history.  

Now, while this is a challenge for companies like Netflix, and the kind of problem I’d enjoy solving, it isn’t the kind of problem that most companies face on a day-to-day basis.

Most companies struggle to obtain basic information, such as data-driven reports on core business activities: what is happening with sales, operations, customers, HR, logistics and so on. They want to know if they are doing better or worse than last year, last month and yesterday. They want to know where there are issues and how well they are doing against competitors. The answers to these questions are in the data at hand and are crucial to the way we do business and the decisions we need to make. They are not, however, “deep problems” with regard to the science of data.

With that in mind, let’s consider another aspect of the data scientist's role noted by Jeffrey:

We also write lots of reports, and make lots of graphs, and we do our best to distill complex datasets to actionable insights.  

Reports and graphs.

Does the authoring of reports, charts and graphs sound like a good use of a data scientist's time? Probably not.

“Reporting” is probably the least interesting aspect of the data scientist’s job because it is the task of explaining what is known rather than discovering what is not. Unfortunately, it is a task that needs to be done for business owners to understand what the data scientists and analysts have uncovered.

The good news for data scientists is that these reporting responsibilities can be automated.  The results associated with known algorithmic solutions can be mapped onto readable reports simply because we already understand their meaning.  This means that data scientists can be freed from this aspect of their work. And the less time a data scientist has to spend on reporting, the more time he or she can spend on investigating and exploring.
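To make the idea of mapping known results onto readable reports concrete, here is a minimal sketch in Python. It is not how any particular natural-language generation product works; it is a simple template approach, and the function names (`describe_change`, `build_report`), the 1 percent "flat" threshold and the figures are all hypothetical, made up for illustration.

```python
def describe_change(metric, current, previous, period):
    """Turn one metric and its baseline into a plain-English sentence."""
    if previous == 0:
        # No meaningful percentage can be computed against a zero baseline.
        return f"{metric} was {current:,.0f}; there is no baseline for {period}."
    change = (current - previous) / previous * 100
    if abs(change) < 1:  # assumed threshold for calling a result "flat"
        return f"{metric} was roughly flat versus {period}, at {current:,.0f}."
    direction = "up" if change > 0 else "down"
    return f"{metric} was {direction} {abs(change):.1f}% versus {period}, at {current:,.0f}."


def build_report(rows):
    """Join one sentence per metric into a short report."""
    return "\n".join(describe_change(*row) for row in rows)


# Made-up figures: (metric, current value, previous value, comparison period)
rows = [
    ("Sales", 1150000, 1000000, "last year"),
    ("Support ticket volume", 480, 600, "last month"),
    ("Order count", 1003, 1000, "yesterday"),
]

print(build_report(rows))
```

Because the meaning of "up," "down" and "flat" is already understood, the sentences can be produced without a person writing them, which is the kind of rote reporting the data scientist can hand off.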

The takeaway here is simple. Figure out what you want from your data before you hire the person who is going to get it for you. Hire a data scientist to explore and investigate. But don’t make someone with extraordinary data and analytic skills do day-to-day reporting. Use the technology of today to do the rote tasks that you might have asked your data scientist to do.


Copyright © 2015 IDG Communications, Inc.
