Heading Into Election Night, Coakley Still Leads Baker In Governor's Race (On Twitter)

Since the gubernatorial debate between Democrat Martha Coakley and Republican Charlie Baker on Oct. 21, a number of polls and media outlets have reported that Baker has gained an advantage in the campaign, and some project him as very likely to win. While the outcome will be decided later tonight by tallying actual votes, the race on Twitter thus far has been won by Coakley, which provides a real-world experiment in Twitter’s capacity to predict offline events like elections.

In the two weeks from the Oct. 21 debate through about 10 a.m. this morning, Coakley has been mentioned on Twitter 36,155 times. Over the same period, Baker has been mentioned in 24,443 tweets. This is a difference of 11,712 Twitter mentions, or 47.92 percent more mentions for Coakley than for Baker, which is nearly identical to the advantage Coakley had shown in an earlier analysis reported on WBUR's Poll Vault. During this period, there was only one day, Oct. 29, when Baker was mentioned more often than Coakley on Twitter, and the margin that day was 164 tweets. The number of mentions per candidate per day for the latest time period is graphed in Figure 1.
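For readers who want to check the arithmetic, the gap and percentage above can be reproduced in a few lines. This is only a minimal sketch: the two totals come from the paragraph above, while the function name `mention_gap` is a hypothetical helper introduced here for illustration.

```python
# Mention totals as reported in the article (Oct. 21 through Election Day morning).
coakley_mentions = 36_155
baker_mentions = 24_443


def mention_gap(leader: int, trailer: int) -> tuple[int, float]:
    """Return the raw difference and the leader's percent advantage over the trailer."""
    difference = leader - trailer
    # Percent advantage is measured relative to the trailing candidate's total.
    percent = difference / trailer * 100
    return difference, round(percent, 2)


gap, percent_more = mention_gap(coakley_mentions, baker_mentions)
print(f"Coakley leads by {gap:,} mentions ({percent_more} percent more than Baker)")
```

Note that the percentage uses Baker's total as the base, which is why it describes how many more mentions Coakley received rather than her share of all mentions.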


Of course, raw figures do not capture what was being said, or whether either candidate was being mentioned in a positive or negative manner. One critique often leveled at this sort of analysis is that it does not take into account sentiment, which is one of the more difficult concepts for humans or software to judge accurately, particularly in political texts. Nonetheless, British researchers working in this area developed a program known as SentiStrength for the explicit purpose of detecting sentiment strength in short texts. This program ranks the overall sentiment of short social web texts (in this case, tweets) on two dimensions: negativity and positivity. Based on classifications and ordering of words, SentiStrength produces a measure of negativity from -1 (not negative) to -5 (extremely negative). It also constructs a similar measure of positivity from +1 (not positive) to +5 (extremely positive).

When applied to the body of tweets from Oct. 21 leading up to Election Day, the average negativity score of tweets mentioning Baker was -1.48, which was nearly identical to the average negativity of -1.49 in tweets that mentioned Coakley. Positivity measures were likewise nearly inseparable, with Baker averaging +1.43 and Coakley slightly behind with an average positivity rating of +1.37 in tweets where she was mentioned. In short, the sentiment of tweets about either Baker or Coakley is not, by and large, highly negative or extremely positive.

Given that both candidates are roughly equivalent in terms of negative and positive content, insofar as it can be detected by software, it is worthwhile to return to the idea of influence on Twitter. Previously, only Twitter users who used the hashtag #mapoli to mark their tweets as relevant to Massachusetts politics were included. Since that artificially excluded other, likely important content from analysis, the following search terms were added: charlie baker, charlieforgov, coakley, magov, magov14, mapoli, marthacoakley, mass politics, mass. politics and masspolitics. Any tweet that mentioned any of these keywords from Oct. 24 onwards was collected. In that time, 63,843 tweets were gathered using the Boston University Twitter Collection and Analysis Toolkit.
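The keyword-based collection described above can be illustrated with a small filter. This is a sketch under stated assumptions, not the toolkit's actual matching logic: it assumes a simple case-insensitive substring match, and the function `matches_keywords` and the sample tweets are hypothetical.

```python
# Search terms as listed in the article.
KEYWORDS = [
    "charlie baker", "charlieforgov", "coakley", "magov", "magov14",
    "mapoli", "marthacoakley", "mass politics", "mass. politics",
    "masspolitics",
]


def matches_keywords(text: str, keywords: list[str] = KEYWORDS) -> bool:
    """Return True if the tweet text contains any keyword (assumed: case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


# Hypothetical tweets, invented for illustration only.
sample_tweets = [
    "Watching the debate tonight #mapoli",
    "Great weather in Boston today",
    "Martha Coakley rallies in Worcester",
    "Baker ahead in new poll",
]

relevant = [tweet for tweet in sample_tweets if matches_keywords(tweet)]
print(relevant)
```

Under this assumption, the last sample tweet is not collected, because "baker" alone is not among the search terms; only the full phrase "charlie baker" is.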

Much like the previous work already reported on Poll Vault, network analysis and algorithmic sorting were applied to this body of tweets to identify the most influential users among the 1,000 most active people who are using these specific keywords when tweeting. Just as found earlier, the Coakley account (@marthacoakley) is the most influential user account on Twitter in this broader, but still relatively narrowly defined, group of Twitter users who are talking about Massachusetts politics, as shown in Figure 2.


This graph not only identifies a broader spectrum of Twitter users talking about Massachusetts politics, it also signals some of Baker’s influence on Twitter, where he now ranks third behind Coakley and ‘RightWingWatch Fan’ despite tweeting only 39 times using the keywords most closely related to the campaign for governor. In other words, unlike Coakley, who was much more active in mentioning others in her 431 tweets in this dataset, Baker’s influence is almost entirely the result of other users mentioning him 11,251 times. That is still fewer than Coakley’s 12,711 mentions, but a substantial number nonetheless.
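The idea that an account can be influential mainly because others mention it can be sketched by counting mentions received per account. This is a deliberately simple stand-in, not the network analysis or sorting algorithm used for Figure 2, which the article does not specify. The function `mentions_received` and all handles and tweets below are hypothetical.

```python
import re
from collections import Counter

# Matches @handles made of letters, digits and underscores.
MENTION_RE = re.compile(r"@(\w+)")


def mentions_received(tweets: list[tuple[str, str]]) -> Counter:
    """Count how often each handle is mentioned by other users.

    `tweets` is a list of (author_handle, tweet_text) pairs.
    """
    counts: Counter = Counter()
    for author, text in tweets:
        for handle in MENTION_RE.findall(text):
            handle = handle.lower()
            # Ignore self-mentions so an account cannot inflate its own count.
            if handle != author.lower():
                counts[handle] += 1
    return counts


# Hypothetical data, invented for illustration only.
sample = [
    ("voter_one", "Go vote! @candidate_a @candidate_b"),
    ("voter_two", "Great debate answer from @Candidate_A"),
    ("candidate_a", "Thanks @voter_one for the support"),
]

counts = mentions_received(sample)
print(counts.most_common())
```

In this toy example, `candidate_a` ranks first on mentions received even though that account posted only once, which mirrors the pattern described for Baker above.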

These different approaches to social media strategy aside, this graph also indicates a strong and noticeable polarization of users on Twitter talking about Massachusetts politics. In the run-up to Election Day, users posting in these spaces on Twitter have decidedly clustered into camps based largely on ideology in terms of who is mentioning whom. In other words, the discussion online is increasingly insular, and those on the right and the left are talking mostly among themselves, not to each other, which suggests there is a certain level of real-world offline validity to these analyses.

Moving forward in the coming hours, the extent to which Twitter can predict political outcomes will be laid bare, at least in this specific race. Data can only lead so far, but based on the race for governor as it has taken place on Twitter — and despite the prevailing wisdom of pollsters — Coakley is positioned to come out on top.

Jacob Groshek, PhD, specializes in technology and political communication as an Assistant Professor in the Emerging Media Studies Division at Boston University. He also directs Betweetness Labs, a platform that makes the TCAT system available to off-campus users.

