It’s been a month now since I’ve opened the economy-size bottle of fish oil capsules that sits gathering dust in a corner of my kitchen. I haven’t dared touch it since reading about a medical study purporting to show that fish oil supplements increase my risk of contracting a particularly lethal form of prostate cancer by 70%. Every year, it seems, researchers trot out new studies with new revelations about something that’s bad for you. Some studies, such as those condemning hydrogenated oils, stand the test of time, and this one may too. But many headline-grabbing revelations eventually fade away -- and for good reason.
“The great majority of epidemiological studies that say something is bad for you are actually wrong,” claims Dr. John Elder, CEO at data mining firm Elder Research.
After the sensational headlines die down, researchers discover that the results can’t be reproduced because of what Elder calls “the vast search effect,” a phenomenon better known as the multiple comparison problem. Unfortunately, many researchers fall into this trap, and your business could too.
Here’s how it works: Scientists look at a lot of different inputs, such as what you eat, and a lot of different outcomes, such as diseases, and cross them in a matrix. For example, 10 foods crossed against 10 diseases yields a matrix of 100 cells, each one a separate hypothesis to test. To publish in the medical community, the researcher needs to meet a 5% significance threshold -- there must be no more than a 5% chance that a result that strong would arise purely by chance. But with 100 cells, five of them on average will pass the 5% threshold even if none of the foods has any real effect on any of the diseases.
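The arithmetic behind that matrix can be checked with a short simulation. What follows is only a sketch under stated assumptions, not anything Elder supplied: it assumes the 100 tests are independent and that, when no real effect exists, each test's p-value is equally likely to fall anywhere between 0 and 1. The function names and the 10-by-10 grid size are illustrative.

```python
import random

ALPHA = 0.05       # the 5% significance threshold described in the article
N_FOODS = 10       # hypothetical number of foods
N_DISEASES = 10    # hypothetical number of diseases


def count_false_positives(rng):
    """Count cells that pass the threshold when NO food affects ANY disease.

    Assumption: with no real effect, a well-behaved test's p-value is
    uniformly distributed between 0 and 1, so it is drawn directly here.
    """
    n_cells = N_FOODS * N_DISEASES
    return sum(1 for _ in range(n_cells) if rng.random() < ALPHA)


def average_false_positives(n_studies=10_000, seed=0):
    """Average number of spurious 'findings' per simulated 100-cell study."""
    rng = random.Random(seed)
    total = sum(count_false_positives(rng) for _ in range(n_studies))
    return total / n_studies


# Chance that at least one of 100 independent no-effect tests passes at 5%.
p_at_least_one = 1 - (1 - ALPHA) ** (N_FOODS * N_DISEASES)

if __name__ == "__main__":
    print(f"average spurious findings per study: {average_false_positives():.2f}")
    print(f"chance of at least one spurious finding: {p_at_least_one:.3f}")
```

Under these assumptions the average comes out close to five spurious findings per study, and the chance that a 100-cell search turns up at least one is about 99.4%.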
“So five papers get published, some to fanfare. But months later duplicate studies by others most often do not confirm the results. I’ve seen studies that say 90% of the pre-clinical research in cancer findings were not reproducible for this reason,” Elder says.
“There’s a lot of junk science being done because of this lack of understanding of how a statistical test meant to evaluate one hypothesis is poor protection against fooling yourself when you compare against many, many hypotheses and cherry pick the best result,” he says.
“The central question in statistics is, how likely is it that I was just fooled?” Elder says. In other words, how likely is it that the result arose by chance? In this case, any single test has a 5% chance of passing the threshold when nothing real is going on. But the problem is that the researchers didn’t just test one idea. They might have tested 100 ideas, of which five yielded publishable results. “Those won’t hold up with new data because they aren’t real. The problem is that people cast their net wide and try a lot of things.”
Elder offers up another example that shows the dangers of doing the same thing with ratios. You can measure that the highest liver cancer rates are in rural Republican counties, and researchers can come up with all sorts of reasons why that is. But you can also measure that the lowest rates are in rural Republican counties.
“The Republican hypothesis is a red herring. The reason has to do with the low population in rural areas,” Elder says. Rural counties tend to be more Republican, but what matters is their size: a few cases more or fewer in a small population can make rates look very high or very low. “The denominator is low so that a few random changes make a big difference in the percentages. Don’t get fooled by proportions,” Elder says.
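The small-denominator effect is easy to reproduce. The sketch below is an illustration only, with made-up numbers: it assumes every resident of every county faces exactly the same risk (the hypothetical TRUE_RATE), and that counties differ only in population. The county counts and sizes are invented for the example.

```python
import random

TRUE_RATE = 0.001   # hypothetical: identical underlying risk in every county


def simulate_counties(n_small=200, n_large=100,
                      small_pop=1_000, large_pop=20_000, seed=0):
    """Return (population, observed_rate) pairs; every resident has the same risk."""
    rng = random.Random(seed)
    counties = []
    for pop in [small_pop] * n_small + [large_pop] * n_large:
        # Each resident independently becomes a case with probability TRUE_RATE.
        cases = sum(1 for _ in range(pop) if rng.random() < TRUE_RATE)
        counties.append((pop, cases / pop))
    return counties


def extremes(counties):
    """Return the counties with the highest and the lowest observed rate."""
    highest = max(counties, key=lambda c: c[1])
    lowest = min(counties, key=lambda c: c[1])
    return highest, lowest


if __name__ == "__main__":
    highest, lowest = extremes(simulate_counties())
    print(f"highest rate: {highest[1]:.4f} in a county of {highest[0]:,}")
    print(f"lowest rate:  {lowest[1]:.4f} in a county of {lowest[0]:,}")
```

Even though nothing distinguishes the counties except size, both the highest and the lowest observed rates should land in the small counties, where a handful of cases either way swings the percentage.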
The vast search effect can also creep into business problems. To which customers do you have the greatest success selling? Analysts may slice up customer data into subgroups, create a lot of little cells and run a standard significance test on each one, leaving themselves open to the vast search effect. “It’s designed to work for one comparison, but if you do hundreds of them the test breaks down. It’s not the right test to use,” he says.
The solution is to use more sophisticated methods. If you want to do it right, Elder says, get up to speed on a technique called resampling.
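Resampling covers a family of techniques, and the article does not say which one Elder has in mind. One common form is a permutation test, sketched here for the customer-subgroup example as an assumption-laden illustration: shuffle the sales outcomes across customers many times, repeat the same "find the best subgroup" search on each shuffle, and see how often chance alone produces a best subgroup as good as the one actually observed. The function names and the sample data are hypothetical.

```python
import random


def best_segment_rate(segments, outcomes):
    """Highest success rate found in any segment (the cherry-picked result)."""
    totals, wins = {}, {}
    for seg, won in zip(segments, outcomes):
        totals[seg] = totals.get(seg, 0) + 1
        wins[seg] = wins.get(seg, 0) + won
    return max(wins[s] / totals[s] for s in totals)


def permutation_p_value(segments, outcomes, n_shuffles=500, seed=0):
    """Estimate how often chance alone yields a best segment at least this good."""
    rng = random.Random(seed)
    observed = best_segment_rate(segments, outcomes)
    shuffled = list(outcomes)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between segment and outcome
        if best_segment_rate(segments, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Hypothetical data: 20 segments of 40 customers, all with the same 20%
    # chance of buying, so any "best segment" here is a product of the search.
    rng = random.Random(1)
    segments = [s for s in range(20) for _ in range(40)]
    outcomes = [1 if rng.random() < 0.2 else 0 for _ in segments]
    print(f"best segment rate: {best_segment_rate(segments, outcomes):.2f}")
    print(f"search-adjusted p-value: {permutation_p_value(segments, outcomes):.3f}")
```

Because every shuffle repeats the full search across all subgroups, the resulting p-value accounts for the number of comparisons made, which a single-comparison test applied to the winning cell does not.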