Can Twitter help predict epidemics?
First Published Mar 06 2013 11:06 am • Last Updated Mar 06 2013 11:08 am

Twitter users send around 500 million tweets a day, an endless fire hose of information about how people feel, what they’re doing, what they know and where they are.

For epidemiologists and public health officials, it’s a potential gold mine of data, a possible way to track where disease is breaking out and how it spreads, as well as how best to help - but only if they can figure out how to find the useful signal amid all that noise.

"The question is: How do you take these billions of messages, find the useful information and get it to people who can respond?" says Mark Dredze, an assistant professor of computer science at Johns Hopkins University, who studies computational linguistics.

That’s a very big question, one whose difficulty has pushed many researchers away from the idea of using Twitter data, which they say is too messy and too uncontrolled compared with traditional methods of collecting health data, such as surveys and analyses of hospital visits. Others argue that, once we learn to effectively harness the data, Twitter’s very messiness (including the impulse to tweet what you had for breakfast or how annoying your runny nose is) will be what makes it an invaluable resource.

"It’s like a pulse on the world, because people will just tweet whatever, whenever," explains Christophe Giraud-Carrier, an associate professor of computer science at Brigham Young University, who studies what he and his colleagues have dubbed "computational health science." "Poll answers are filtered by perception or memory; on Twitter, we’re actually observing real behavior" in real time.

Using Twitter data has other advantages, Dredze says. For starters, it’s faster: It can take the Centers for Disease Control and Prevention about two weeks to publish findings. Those numbers can be delayed further because a sickness doesn’t show up in statistics until someone goes to the hospital or does something else that causes the ailment to be reported.

Twitter, on the other hand, might reflect it the first morning someone wakes up with a sore throat. Speed can be a big advantage when tracking epidemics and emerging diseases, says Taha Kass-Hout, director of the CDC’s Division of Informatics Solutions and Operations. "An emerging disease from Southeast Asia can be in your backyard in 12 to 14, maybe 24 hours. So you have to respect that."

Twitter can also provide a more detailed picture of where disease is breaking out, since many tweets are tagged with their locations. That, coupled with faster data, could help keep hospitals and clinics from getting overwhelmed in the middle of an outbreak: Even a few days’ notice that disease occurrences are spiking can mean being prepared with extra beds, staff or medicine. Detailed, location-specific data can also identify clumps of noncommunicable diseases - cardiovascular disease or Type II diabetes, for example - allowing health officials to focus education efforts in the areas that need it most.

Twitter is also in increasingly wide use, including in countries that don’t have effective public health tracking agencies. "In that case, anything Twitter can provide - whether it’s fast, slow, whatever - is really valuable," Dredze says.

Those advantages, coupled with the fact that researchers are getting better at tracking and analyzing useful information, mean that "consensus is forming in the public health and health-care communities that we really need to pay attention to social media," Kass-Hout says. However, he stresses that social media information is "a complementary tool, rather than a replacement" for more traditional methods of gathering information. It also depends on validation, the ability to prove that data collected through Twitter have real-world accuracy. That was one goal of Dredze’s research: to confirm the utility of Twitter data by studying if tweets about the flu could be filtered in such a way that they tracked with official flu rates.

Central to that effort is the signal-in-the-noise question, the effort to find and isolate useful information amid the barrage of tweets. In May 2011, Dredze and his colleagues were using a computer program to monitor mentions of the flu on Twitter. Suddenly, there was a massive spike in chatter. "It didn’t make any sense to us," Dredze said. "The flu season was pretty much over." They drilled down and discovered that people were discussing the fact that Kobe Bryant of the Los Angeles Lakers had played a game while sick.

That information may be interesting to basketball fans, but it’s not the kind of news that health researchers are looking for.

Dredze and his colleagues decided they needed a better algorithm, one that would allow the program to filter out tweets that aren’t actually about people having the flu. Their system starts by searching for some key words (such as "flu," "fever" and certain brands of medicine) and screening out tweets that carry telltale signs of irrelevance: "Bieber" appearing with "fever" is a good sign that someone’s not talking about having the flu, and so is a URL, since it probably means the user is simply sharing an article. The system then applies grammatical analysis to figure out whether someone actually has the flu or is just talking about it. (Is "flu" the subject or the object of the verb? Which verbs are used? Which pronouns?)
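For readers curious what such a filter might look like, here is a minimal Python sketch of the three stages described above. It is not the Johns Hopkins system: the keyword lists are hypothetical, and the final step swaps real grammatical analysis for a crude first-person heuristic (a word like "I" or "my" appearing before a flu word).

```python
import re

# Hypothetical keyword lists, chosen for illustration only.
INCLUDE = {"flu", "fever", "influenza"}
EXCLUDE = {"bieber"}

# A URL usually means the user is sharing an article, not reporting illness.
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)

# Assumption: a crude stand-in for grammatical analysis. It only checks
# that a first-person word appears somewhere before a flu-related word.
SELF_REPORT_RE = re.compile(
    r"\b(i|i'm|im|i've|my|me)\b.*\b(flu|fever|influenza)\b", re.IGNORECASE
)


def tokens(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def looks_like_flu_report(tweet):
    """Return True if the tweet passes all three illustrative stages."""
    text = tweet.replace("\u2019", "'")  # normalize curly apostrophes
    words = tokens(text)
    if not words & INCLUDE:      # stage 1: must mention a key word
        return False
    if words & EXCLUDE:          # stage 2a: screen out "Bieber fever"
        return False
    if URL_RE.search(text):      # stage 2b: screen out shared links
        return False
    return bool(SELF_REPORT_RE.search(text))  # stage 3: first-person cue
```

Under these assumptions, a tweet such as "I think I have the flu" passes, while "Kobe played with the flu last night" is rejected because it carries no first-person cue. A real system would need far richer language analysis to handle cases such as "my sister has the flu."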

They tested the system when reports of the latest flu epidemic hit the media in January. The number of tweets mentioning the flu shot up, though most of them didn’t reflect actual cases. But when Dredze and his team filtered tweets through their algorithm, they matched the CDC’s findings about actual flu rates.
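The article does not say how the match with CDC figures was measured. One common way to check whether two weekly series move together is the Pearson correlation coefficient; the sketch below assumes that measure, and the function and argument names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series.

    Returns a value between -1 and 1; values near 1 mean the two
    series rise and fall together.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

In this framing, `xs` would hold weekly counts of filtered tweets and `ys` the official flu rates for the same weeks. A filter that works should yield a noticeably higher coefficient than the raw, unfiltered counts do.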

Meanwhile, another key problem - underrepresentation of certain demographic groups, including the very young and the elderly - is diminishing rapidly as Twitter use expands, Kass-Hout says. Likewise, research is beginning to show that location data is indeed accurate enough to be of statistical use.

That leaves researchers and public health officials pondering the possible applications of Twitter research - for example, using tweets to map urgent needs in the wake of natural disasters or to determine where vaccines are most needed following an outbreak.

Another possibility is using Twitter to better understand and respond to health-related behavior. For instance, Dredze says, the Johns Hopkins study turned up evidence that "a significant percentage of people who had the flu mentioned antibiotics," a troubling finding because antibiotics don’t cure the flu, which is caused by a virus, and their misuse can increase drug resistance. Knowing just what misinformation they’re combating can help officials better target educational efforts.

Giraud-Carrier’s work revealed details of prescription drug abuse; he and his colleagues also studied whether algorithms can be created that flag cases of potential suicide or domestic violence before they happen.

"I don’t want to just be listening and finding out all these bad things that are happening," he says. "In the long run, our vision is to do something more than listen."

Copyright 2014 The Salt Lake Tribune. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.
