Durkheim Project Leverages Big Data to Prevent Veteran Suicides
CIO - Suicide has grown to epidemic proportions among U.S. veterans of Iraq and Afghanistan, and the Pentagon and U.S. Department of Veterans Affairs are hoping that social media and big data can help them identify at-risk veterans and get them the care they need.
Last year, more active-duty servicemen and servicewomen took their own lives than were killed in combat. In February 2013, the Iraq and Afghanistan Veterans of America (IAVA) conducted its 2013 Member Survey of 4,104 veterans of Iraq and Afghanistan.
IAVA reported that 30 percent of respondents had considered taking their own lives, 45 percent said they knew an Iraq or Afghanistan veteran who had attempted suicide and 37 percent knew an Iraq or Afghanistan veteran who had committed suicide. And 50 percent of respondents said someone close to them had suggested they seek care for a mental health injury.
Identifying people who are at risk of committing suicide is a tricky thing. Often, the people who are the most in need of help are the least likely to seek it. But Chris Poulin, principal partner of predictive analytics specialist Patterns and Predictions, started wondering: What if he took the tools for event-driven risk analytics used by Wall Street financial firms and applied them to the problem?
Can Facebook and Twitter Activity Predict Suicide Attempts?
The idea was deceptively simple. A veteran in distress may not be able to communicate that distress verbally. But they do frequently reach out via social media: Facebook posts, tweets and so on. If you can model key textual indicators of suicidality and analyze social media streams in real-time, you can potentially identify at-risk individuals and intervene before they harm themselves.
In 2011, with funding from the Defense Advanced Research Projects Agency (DARPA), Poulin set out to determine whether his idea had legs by forming the non-profit Durkheim Project. He brought together a multidisciplinary team of artificial intelligence (machine learning) and medical experts (psychiatrists) from Dartmouth Engineering, Dartmouth Medical School and the U.S. Veterans Administration dedicated to applied research on predictive suicide risk.
The project was named in honor of sociologist Emile Durkheim, who in 1897 published the book Suicide, which defined early text analysis for suicide risk and provided a framework of theoretical explanations relating to societal disconnection.
Phase 1 of the Durkheim Project's research consisted of building a predictive model. Poulin (who at the time was co-director of the Dartmouth College Metalearning Working Group at Dartmouth Thayer School of Engineering) collaborated with researchers Paul Thompson, Thomas McAllister, MD and Laura Flashman, PhD, from the Geisel School of Medicine at Dartmouth and Brian Shiner, MD and Vince Watts, MD from the U.S. Department of Veterans Affairs.
Using a control group of veterans, the researchers focused on proving that text-mining methods could provide statistically significant predictions of suicidality.
"We needed to show that we have a medically efficacious classifier," Poulin says. "We achieved 65 percent accuracy. We're convinced that's a decent signal. It's not great, but it's consistent and we're going to build on that."
"The study we've begun with our research partners will build a rich knowledge base that eventually could enable timely interventions by mental health professionals," he adds.
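To make the idea of a text classifier and an accuracy figure concrete, here is a minimal sketch in Python. It is not the Durkheim Project's model, whose algorithm and features are not described here; the choice of a naive Bayes classifier over word counts is an assumption made for illustration, and the handful of labeled sentences is invented toy data, not study data. Accuracy is simply the share of held-out examples the classifier labels correctly.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into words on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label in sorted(model["label_counts"]):
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size
        score = math.log(model["label_counts"][label] / total)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # words never seen in training are skipped
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(1 for text, label in examples if predict(model, text) == label)
    return correct / len(examples)


# Invented toy data, for illustration only.
training = [
    ("cannot sleep feel alone tonight", "distress"),
    ("nobody would notice if i left", "distress"),
    ("feel like a burden to everyone", "distress"),
    ("great day fishing with my kids", "neutral"),
    ("started a new job this week", "neutral"),
    ("watching the game with friends tonight", "neutral"),
]
held_out = [
    ("feel so alone", "distress"),
    ("new job with friends", "neutral"),
]

model = train(training)
print(accuracy(model, held_out))
```

A real study would need far more data, a carefully chosen control group and clinical review of the labels; the sketch only shows what "accuracy" measures for a classifier of this kind.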
Attivio and Cloudera Help Unify and Store Veterans' Social Media Data
"There's plenty of opportunity to use big data to make money," says Attivio CTO Sid Probstein. "Having the opportunity to use it for something like this is just fantastic."
"Say a veteran who returns from a theater is having trouble dealing with things that happened, things they saw," he adds. "As they're in that state, we expect them to voice frustration, and to do so primarily in social media. On Twitter they might quote song lyrics or a poem. There are some common threads to this kind of expression."
"The system is not really trying to understand what the person is saying," he notes. "It really only is looking for patterns and to apply logic to that. It's not understanding that there's some negative expression. It's detecting the likelihood that negative expression is an indicator of someone that's at-risk."
The Durkheim Project has forged partnerships with social media titans like Facebook, Twitter and LinkedIn. Using a suite of applications (available through the social media networks and on iPhone and Android devices), the project is creating a voluntary, opt-in database of participants' social media and mobile phone data that the researchers hope will eventually be used to provide clinicians with real-time assessments of psychological risk factors for suicide and other destructive behaviors.
The applications automatically upload relevant content (from the online activity of veterans who have volunteered to be part of the study) into an integrated medical database. The resulting text repository will be continuously updated and analyzed by machine learning systems to enable real-time monitoring of text content and behavioral patterns statistically correlated with suicidality.
"As we build upon the promising findings of our Phase 1 investigation, the Durkheim team is pleased to have Facebook's partnership in helping us connect with the community of veterans, as Facebook's capability for outreach is unparalleled," Poulin says.
"At Facebook, we have a unique opportunity to provide the right resources to our users in distress, when and where they need them most," adds Joel Kaplan, Facebook's U.S. vice president of Public Policy and a veteran himself.
"We are proud to be partnering with the Department of Veterans Affairs research on the Durkheim Project, so we can bring a better understanding to this important issue and equip those that use our service with even better tools to keep them safe," Kaplan says. "Through a concerted and coordinated effort on the part of private industry, government and concerned family and friends, we believe we can make a real difference in preventing suicide and saving lives."
The database will also incorporate internal and external risk factors, including concussions, post-traumatic stress, deployments served, family stresses and other variables. For instance, Poulin says there is a high correlation between use of the drug Demerol and suicidality among veterans, which he suggests could be a downstream factor of dealing with chronic pain.
For now, the Durkheim Project will not include an intervention component, though Poulin hopes it will come once the project satisfies clinicians that it can be used as an effective predictive tool.
Data Privacy a Concern
Poulin acknowledges that people will have privacy concerns about the project's work. But he notes that the program is entirely opt-in (and out). The data will be stored in an onsite database at the Geisel School of Medicine at Dartmouth. Additionally, sharing personally identifiable information with external third parties is strictly forbidden by the study's medical protocol, which is safeguarded by HIPAA standards of medical privacy.
"We have created a secure data-storage environment behind the medical school's IT firewall to ensure participant privacy, both during this study phase and for any future interventions that may be indicated by the insights generated here," says Paul Thompson, study co-investigator and an instructor at the Geisel School of Medicine at Dartmouth.
"Suicide prediction and intervention is really tough, mostly for social reasons, not technical reasons," Poulin says. "We need to get past that stigma. We need to be a combination of caring and tough: caring enough to do it and tough enough to take the criticism you're going to get."
"It's much more privacy-invading than your financial statement," he acknowledges of the project's data collection. "Suicide is a very private choice that you can't stop without being able to peel back the layers of the onion on a person's psyche."
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.