How your tweets are used to measure happiness, health and other human conditions


Since its launch in 2006, Twitter has exploded in popularity, growing to its current 310 million users, who tweet their thoughts daily in the classic 140-character form. Most people have at least heard of Twitter by now, but many may not realize how many tweets are routinely scooped up as data for scientific studies on nearly every aspect of human interaction.

How Are Tweets Gathered for Scientific Study?

Twitter’s Application Programming Interface (API) lets researchers search by keyword, user information and other specific criteria across an enormous amount of open data, which makes it especially attractive to researchers of many kinds. Since 2008, for example, the University of Vermont’s Computational Story Lab has gathered data through Twitter’s “Gardenhose” API, which randomly samples ten percent of public tweets and streams them in real time.
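To make the idea of a ten percent random sample concrete, here is a minimal sketch in Python. It is not how Twitter implements the Gardenhose feed and it does not call any Twitter API; the function name `sample_stream` and its parameters are assumptions made up for illustration. It simply keeps each incoming tweet with a fixed probability as the tweets arrive.

```python
import random
from typing import Iterable, Iterator, Optional


def sample_stream(tweets: Iterable[str], rate: float = 0.10,
                  seed: Optional[int] = None) -> Iterator[str]:
    """Yield each tweet independently with probability `rate`.

    Hypothetical helper: a stand-in for a sampled feed, not Twitter's own method.
    """
    rng = random.Random(seed)  # a seed makes the sample reproducible
    for tweet in tweets:
        if rng.random() < rate:
            yield tweet


if __name__ == "__main__":
    # Made-up tweets standing in for a live public stream.
    fake_stream = (f"tweet number {i}" for i in range(1000))
    kept = list(sample_stream(fake_stream, rate=0.10, seed=1))
    print(f"kept {len(kept)} of 1000 tweets")
```

Because each tweet is kept or dropped on its own, the sample never needs the whole stream in memory, which is what a real-time feed would require.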

What Kinds of Studies Are Being Conducted with Tweets?

Through this interface, the Computational Story Lab (CSL) has created several useful tools for measuring human activity by region. The hedonometer, for instance, gathers geolocated tweets sent from smartphones and determines the ratio of positive to negative words in content from different geographic areas. One hedonometer study found that Hawaii was America’s happiest state in 2013 (a result the researchers said surprised no one). Other studies have examined how the amount users travel relates to their overall contentedness, compared with users more inclined to stay home. In addition, researchers at Microsoft analyzed the tweets of pregnant women for words and phrases linked to emotion, in the hope of understanding more about predictors of postpartum depression.
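The positive-to-negative word ratio described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the hedonometer’s actual method: the word lists, the region labels and the function names `word_counts_by_region` and `positivity_ratio` are all invented for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative word lists (assumptions, not the lab's real lexicon).
POSITIVE = {"happy", "love", "beach", "great", "sunshine"}
NEGATIVE = {"sad", "hate", "traffic", "terrible", "tired"}


def word_counts_by_region(tweets):
    """Count positive and negative words per region.

    `tweets` is an iterable of (region, text) pairs.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for region, text in tweets:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in POSITIVE:
                counts[region]["positive"] += 1
            elif word in NEGATIVE:
                counts[region]["negative"] += 1
    return dict(counts)


def positivity_ratio(count):
    """Return positive/negative word ratio, or None if no negative words were seen."""
    if count["negative"] == 0:
        return None
    return count["positive"] / count["negative"]


if __name__ == "__main__":
    # Made-up geolocated tweets.
    sample = [
        ("HI", "Love this beach, so happy"),
        ("HI", "Traffic was terrible but great sunshine"),
        ("VT", "So tired, I hate traffic"),
    ]
    for region, count in word_counts_by_region(sample).items():
        print(region, count, positivity_ratio(count))
```

A higher ratio suggests happier language in that region, though a real study would need a far larger word list and many more tweets than this toy sample.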

Happiness is not the only thing CSL’s tools measure; numerous other human conditions are on the lab’s agenda. The lexicocalorimeter estimates fitness and overall health in different regions based on the content of tweets. It weighs the high-calorie foods mentioned in tweets against the physical exercise mentioned in other tweets from the same city to arrive at an overall health snapshot of the area at a given time. Other tools are being built to work with mathematical models to predict disease outbreaks using tweets, such as the FluOutlook platform from Northeastern University and the Institute for Scientific Interchange. Finally, the U.S. Geological Survey has found ways to use tweets to detect and locate earthquakes based on users’ reports of tremors in their area.
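As a rough illustration of weighing food mentions against exercise mentions, the Python sketch below totals a “calories in” and a “calories out” figure for each city. Everything in it is an assumption made for the example: the calorie values are invented, and the function names `calorie_totals_by_city` and `balance` are hypothetical rather than part of the lexicocalorimeter itself.

```python
import re

# Invented, illustrative values (not real nutritional or exercise data).
FOOD_CALORIES = {"pizza": 285, "donut": 250, "salad": 100}
ACTIVITY_CALORIES = {"running": 600, "yoga": 250, "walking": 200}


def calorie_totals_by_city(tweets):
    """Sum calories for foods and activities mentioned per city.

    `tweets` is an iterable of (city, text) pairs.
    """
    totals = {}
    for city, text in tweets:
        entry = totals.setdefault(city, {"calories_in": 0, "calories_out": 0})
        for word in re.findall(r"[a-z]+", text.lower()):
            entry["calories_in"] += FOOD_CALORIES.get(word, 0)
            entry["calories_out"] += ACTIVITY_CALORIES.get(word, 0)
    return totals


def balance(entry):
    """Return calories out divided by calories in, or None if no food was mentioned."""
    if entry["calories_in"] == 0:
        return None
    return entry["calories_out"] / entry["calories_in"]


if __name__ == "__main__":
    # Made-up tweets tagged with a city.
    sample = [
        ("Burlington", "Went running then had salad"),
        ("Burlington", "Yoga class tonight"),
        ("Houston", "Pizza and a donut for lunch"),
    ]
    for city, entry in calorie_totals_by_city(sample).items():
        print(city, entry, balance(entry))
```

A balance above 1 would mean a city’s tweets mention more exercise than food, by this toy measure; the snapshot is only as good as the word lists behind it.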

The Future is Bright and Tweet-Worthy

Improvements are, of course, still needed to keep this work free of bias, and proper ethics are essential for researchers working with such a prolific source of public data. With more than five hundred million tweets shared with the world every day on Twitter, there appears to be no end in sight for the scientific community’s data gathering from the social media giant. There’s no need to worry if you use Twitter, though: ethical research uses the data only while keeping users’ identifying information private. So tweet freely and often. You never know when your tweets might be contributing to important research!