It is little secret that machine learning, advanced algorithms, artificial intelligence and advances in robotics -- augmented by the proliferation of sensor-based data -- are remaking how businesses operate. As these technologies become more affordable, more ubiquitous and more accurate, companies will find it easier and easier to offload certain tasks -- including knowledge work.
Which is all well and good, assuming you adhere to the notion that these advances will result in a net increase in jobs and do not signal the start of a massive erosion of middle-class employment opportunities. That remains debatable.
One recent event, however, should give companies pause about relying too much on technology to make decisions, or even to proceed without detailed human oversight: the disgusting debacle that resulted from the rollout of Flickr's new image-recognition technology last month. Shortly after its debut it became clear that something was wrong with the automatic tagging feature because it began generating offensive tags, The Guardian reported, including slapping "animal" and "ape" labels on images of black people, and "jungle gym" and "sport" on images of concentration camps.
Clearly this was not deliberate on Flickr's part. Rather, bias -- on the part of the people using the self-learning app -- is the best guess as to how it happened. That, and the complexity of the program.
A biased algorithm?
Zeynep Tufekci, an assistant professor at the University of North Carolina at Chapel Hill's School of Information and Library Science, explained the issue to NPR's Arun Rath:
Bias "comes through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data -- and there always [are] -- that's going to be reflected as a bias in your system."
This led Rath to conclude that algorithms are "not like laws of physics or laws of nature. They're created by us. We should look into what they do and not let them do everything. We should make those decisions explicitly."
This is not a new observation, although it is one that people tend to forget in the excitement of new advances.
Kate Crawford, a principal researcher at Microsoft Research and a visiting professor at the MIT Center for Civic Media, was talking about this issue two years ago. She wrote in the Harvard Business Review:
Data and data sets are not objective; they are creations of human design. We give numbers their voice, draw inferences from them, and define their meaning through our interpretations. Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.
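The point Tufekci and Crawford are making can be reduced to a toy illustration. The data and labels below are entirely hypothetical, and the "model" is deliberately trivial, but it shows the mechanism: a system that learns from skewed data faithfully reproduces the skew, with no malice required anywhere in the pipeline.

```python
from collections import Counter

def train_majority(labels):
    """A trivial 'classifier' that learns nothing but the most
    common label in its training data, then predicts it for
    every future input."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical training set in which one category is heavily
# under-represented -- an imperfection in the data, not the code.
training_labels = ["cat"] * 90 + ["dog"] * 10

model = train_majority(training_labels)
print(model)  # prints "cat" -- every future image gets this tag,
              # regardless of what it actually shows
```

Real image-recognition systems are vastly more sophisticated, but the principle scales: if the training data under-represents or mislabels a group, the resulting model inherits that bias.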
As good as they get
Furthermore, it is unlikely that further algorithmic advances will simply tweak this issue away.
According to a paper recently presented by MIT researchers, the long-established edit distance algorithm -- celebrating its 40th birthday this year -- likely cannot be improved. Computer scientists have been trying almost from the beginning to speed up its basic formula, which measures how much two sequences of symbols have in common.
“This edit distance is something that I’ve been trying to get better algorithms for since I was a graduate student, in the mid-’90s,” says Piotr Indyk, a professor of computer science and engineering at MIT and a co-author of the paper presented at the ACM Symposium on Theory of Computing. “I certainly spent lots of late nights on that -- without any progress whatsoever. So at least now there’s a feeling of closure. The problem can be put to sleep.”
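For readers unfamiliar with the problem Indyk describes, here is a minimal sketch of the classic dynamic-programming (Wagner-Fischer) formulation of edit distance. The function name and strings are my own illustration; the algorithm itself is the standard textbook one, whose quadratic running time is the bound the MIT result suggests is essentially the best possible.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b.
    Runs in O(len(a) * len(b)) time via dynamic programming."""
    m, n = len(a), len(b)
    # prev[j] = distance between a[:i-1] and b[:j] (previous row)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete a[i-1]
                          curr[j - 1] + 1,     # insert b[j-1]
                          prev[j - 1] + cost)  # substitute (or match)
        prev = curr
    return prev[n]

print(edit_distance("kitten", "sitting"))  # 3
```

Comparing "kitten" and "sitting" takes three edits (two substitutions and one insertion), and decades of effort have not produced a substantially faster general method than filling in this table.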
Time to step back?
So back to my original thought, or some variation of it, rather: Should we trust the latest and greatest technology with serious decisions?
Perhaps the question is not "should we?" but instead "should we stop?" A new Gartner report with the eye-catching title "Smarter Machines Will Challenge the Human Desire for Control" urges CIOs to be conservative in using this technology, for some interesting reasons.
For starters, Stephen Prentice, vice president and Gartner fellow, appears to have reservations about the huge swath of jobs that will be eliminated by technology.
"As smart machines become more capable, and more affordable, they will be more widely deployed in multiple roles across many industries, replacing some human workers. This is nothing new," he said. "Organizations must balance the necessity to exploit the significant advances being made in the capabilities of various smart machines with the perceived negative impact of resulting job losses."
There is something else, though, that needs to be considered: Even though technology is gathering and computing the data, the company will be left holding the bag if something goes completely awry.
"In effect, smart machines are now collecting information about practically every facet of human activity on a continual, pervasive and uncontrollable basis, with no option to 'turn off' the activity," Prentice says. "The potential reputational damage arising from uncontrolled and inappropriate data collection is well-established and can be substantial."
In short, imagine a misstep like the one Flickr's technology made -- but on a far grander and more global scale.
This article is published as part of the IDG Contributor Network.