Supporting Evidence-Informed Policymaking: A Case Study in Visualizing Twitter Networks

Claire Reinelt and I worked recently with Sarah Lucas at the Hewlett Foundation to help them better understand and support the emerging field of evidence-informed policymaking. We are excited to share this public report summarizing our findings and methods.

We hope this 20-minute video presentation will be interesting to you, especially if

  • you fund networks for social change,
  • you design communication strategies,
  • you conduct social network analyses, or
  • you work to increase evidence-informed policymaking.

The presentation is also available as this 11-page PDF, which has a complete transcript of our narration along with every slide.

If you have any comments or questions about this work, we would love to hear from you. You can comment on this post or email bruce@connectiveassociates.com.

These two supporting documents help you get the most out of the video presentation:

Conversations in Networks

In early September I started collecting #MeToo tweets and stumbled into a big-data look at #MeToo and the Kavanaugh confirmation. As I posted previously, the Twitter network is radically split into red and blue factions. (FiveThirtyEight subsequently wrote about this divide, based not on Twitter but on polls, and arrived at the same conclusion.) Now I want to post an update on this project, which is more than ever a work in progress. I especially want input from #MeToo movement organizers, who I hope have real questions that can guide where this research goes next.

I am still collecting tweets. Here is an updated map, showing the same left-right split. The network appears to have a couple of significant bridges and a “super-left” tail:

#MeToo Twitter network of Sept 25-27

We know it’s been a momentous few weeks for #MeToo. Let’s look at the data. See how #MeToo tweeting spiked to over 100,000 tweets per hour on Sept 27, the day of Ford’s testimony.

Five million #MeToo tweets in one week.
There was a huge spike of #MeToo tweets Sept 27.
The grey bar indicates missing data from Sept 28. 
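
For readers who want to build a chart like this from their own collection, here is a minimal sketch of the hourly counting step. It assumes each tweet carries an ISO-formatted timestamp string; the function name and the sample timestamps are made up for illustration and are not the collection pipeline used for this post.

```python
from collections import Counter
from datetime import datetime

def tweets_per_hour(timestamps):
    """Count tweets in each clock hour.

    timestamps: iterable of ISO strings such as "2018-09-27T14:05:00"
    (an assumed format, not necessarily what the Twitter API returns).
    """
    counts = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return counts

# Made-up sample timestamps, for illustration only.
sample = ["2018-09-27T14:05:00", "2018-09-27T14:59:59", "2018-09-27T15:00:00"]
hourly = tweets_per_hour(sample)
for hour in sorted(hourly):
    print(hour.isoformat(), hourly[hour])
```

An hour with no tweets simply has no entry in the counter, so a gap in the data shows up as missing hours rather than as zeros.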

What tweets exactly are being counted? Over Sept 15-27 I curated a list of hashtags to track, aiming to capture #MeToo spirit without taking unnecessary Kavanaugh crossfire. During that time, #BelieveSurvivors grew from zero to the number one trending hashtag on Twitter on Sept 24. My final list of 20 “#MeToo hashtags” also includes #WhyIDidntReport, #BelieveWomen, #MenToo, #MeTooMvmt, #SurvivorCulture, and #HimToo. Below is a word cloud showing all the top hashtags from the 5 million tweets charted above.

Word cloud for all #MeToo tweets Sept 25 – Oct 2
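
A word cloud like this one is, underneath, just a table of hashtag counts. Here is a minimal sketch of that counting step. It assumes hashtags are matched case-insensitively and that each hashtag counts once per tweet; the function name and the sample tweets are made up for illustration.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")

def hashtag_counts(tweets):
    """Count how many tweets mention each hashtag (case-insensitive)."""
    counts = Counter()
    for text in tweets:
        # A set ensures a hashtag repeated within one tweet counts once.
        counts.update({tag.lower() for tag in HASHTAG.findall(text)})
    return counts

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "#MeToo #BelieveSurvivors",
    "i believe her #metoo #MeToo",
    "#HimToo",
]
counts = hashtag_counts(sample_tweets)
print(counts.most_common(3))
```

The same function can be run separately on the tweets from each network cluster to produce one table per side.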

All these maps and charts are a nice start, but how can we better understand what’s happening on the two sides of this “conversation”? One way is to make separate word clouds, one for each side:

Separate word clouds for left and right network clusters

We can see some important differences based on these word clouds, like #HimToo on the right and #WhyIDidntReport on the left. But the differences are obscured by the overwhelming similarities. For example, barely a day after #BelieveSurvivors exploded on the left, it became just as huge on the right, and so both word clouds feature this hashtag prominently, which does not help us understand the differences between left and right.

Let’s look at this problem another way. We’ve got 5 million tweets from 1.5 million users. Based on network clusters, we can categorize many (maybe most) of those users as “left” or “right.” What happens if we make one bucket of tweets from known “left” users, another bucket of tweets from known “right” users, and then teach a computer program to recognize the difference between a “left” tweet and a “right” tweet? If we succeed, then we can use that computer program to score any #MeToo tweet on left-vs-right partisanship, including tweets from unknown users and without even drawing a network map.

We have formulated a classic problem of machine learning. Skipping some technical detail, we train a classifier to recognize our two categories of #MeToo tweets with roughly 87% accuracy. Not bad. If we crack open the resulting classifier, we find model coefficients that tell us exactly which words are most strongly associated with each side of the #MeToo divide. The bigger the bar, the more influence that word has on our “prediction”:

The words listed above do not have any extraneous hashtags that are popular on both sides. We are looking at the most significant single-word indicators that a tweet is either “left” or “right.” The top two and bottom two make perfect sense. The left champions #SurvivorCulture and #StopKavanaugh. The right champions #HimToo (a cry to protect men from false accusations) and #ConfirmKavanaughNow. Some words included in the list are not obviously partisan (#world) and we’d want to do more model-training if we were really serious about classifying lots of future tweets very accurately.
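
For the curious, here is a rough sketch of what “train a classifier and crack it open” can look like. The technical detail skipped above includes which model was used, so this sketch makes its own assumption: a bag-of-words logistic regression, written in plain Python so it runs anywhere. The four toy tweets are made up from hashtags mentioned in this post and are not the real training data.

```python
import math
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase a tweet and split it into a set of words, keeping # and @ prefixes."""
    return set(re.findall(r"[#@]?\w+", text.lower()))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(tweets, labels, epochs=200, learning_rate=0.5):
    """Fit one coefficient per word by stochastic gradient descent.

    labels: 0 for a tweet from a known "left" user, 1 for a known "right" user.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(tweets, labels):
            words = tokenize(text)
            error = label - sigmoid(bias + sum(weights[w] for w in words))
            bias += learning_rate * error
            for w in words:
                weights[w] += learning_rate * error
    return dict(weights), bias

def score(text, weights, bias):
    """Score between 0 and 1: near 0 reads as left, near 1 as right."""
    return sigmoid(bias + sum(weights.get(w, 0.0) for w in tokenize(text)))

# Made-up toy tweets (NOT the real data set). Note the shared hashtag on both sides.
tweets = [
    "#stopkavanaugh #believesurvivors",
    "#survivorculture #believesurvivors",
    "#himtoo #believesurvivors",
    "#confirmkavanaughnow #believesurvivors",
]
labels = [0, 0, 1, 1]
weights, bias = train(tweets, labels)

# "Crack open" the model: most left-leaning words first, most right-leaning last.
for word, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{weight:+.2f}  {word}")
```

In this toy setup the hashtag that appears on both sides cannot separate the two buckets by itself, so the model has to rely on the one-sided hashtags. That is the behavior we want from the coefficients in the chart.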

Let’s run with our first-draft model for now. With it, we can actually compute, for any #MeToo tweet, the probability that it’s left or right. If a tweet scores 0.0001, then it’s almost certainly left, and if it scores 0.9999 then it’s almost certainly right. If we can score tweets this way, then we can aggregate tweet scores user by user and estimate how far each individual leans left or right (on a zero-to-one scale), based on what they’re literally saying and without having to bother with a map. Below we see a curve of tweet scores based on Sept 25-27.

The rainbow in the chart above shows how we assign a color to each score value from zero to one. This will be handy when we start assigning scores to nodes and edges in network maps.
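
Here is a minimal sketch of the two steps just described: rolling tweet scores up to a per-user score, and turning a score into a color. Both details are assumptions on my part for this sketch: the per-user score is a simple mean of that user's tweet scores, and the color scale runs through the hue spectrum from blue at 0.0 to red at 1.0. The function names and sample data are made up.

```python
import colorsys
from collections import defaultdict
from statistics import mean

def user_scores(scored_tweets):
    """Average tweet scores per user.

    scored_tweets: iterable of (user, score) pairs, scores between 0 and 1.
    Using the mean is an assumption; a median would work too.
    """
    by_user = defaultdict(list)
    for user, tweet_score in scored_tweets:
        by_user[user].append(tweet_score)
    return {user: mean(scores) for user, scores in by_user.items()}

def score_to_hex(score):
    """Map a 0-to-1 score onto a rainbow: 0.0 is blue, 1.0 is red (assumed scale)."""
    hue = (1.0 - score) * 2.0 / 3.0  # in HSV, hue 2/3 is blue and hue 0 is red
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

# Made-up sample data, for illustration only.
sample = [("user_a", 0.25), ("user_a", 0.75), ("user_b", 1.0)]
leanings = user_scores(sample)
for user, leaning in leanings.items():
    print(user, leaning, score_to_hex(leaning))
```

The hex strings can be used as node or edge colors in most graph-drawing tools.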

Based on the distribution above, let’s consider a more nuanced classification than the binary “left” vs “right.” I’ve proposed four categories, and selected 2-3 of the most-retweeted examples within each category. It looks good at the far ends, with a miss or two in the mid-left and mid-right.

Far Left: ~500K tweets scoring 0.0-0.15

i was raped at Yale. i was groped at parties in dke’s house—#kavanaugh’s fraternity at yale—and was told as a freshman to avoid their “rape basement.” multiple dear friends were raped by yale dke brothers & by boys from elite prep schools. i believe ramirez. #believesurvivors

by scheduling a vote on judge kavanaugh before dr. ford has even testified, senate republican leaders are saying loud and clear: they don’t care what she says. #believesurvivors

mr. president, enough. a supreme court nomination is not worth more than the lives of survivors. there must be a full investigation of these allegations of criminal behavior, and judge kavanaugh’s nomination must be withdrawn.

Mid Left: ~300K tweets scoring 0.15-0.35

ladies, a question for you: “what would you do if all men had a 9pm curfew?” dudes: read the replies and pay attention. #metoo #kavanaugh #cosby #feminism #maleprivilege #privilege

tune in as democrats show our support for dr. christine blasey ford. #believesurvivors

so, the same party that wants to force teenage boys and girls to shower together in the name of transgender rights is also leading #metoo against sexual predators?

Mid Right: ~200K tweets scoring 0.35-0.75

modern feminism has never been about equality with men.
it has always been about special treatment and exemption from all responsibility. many condemned me for being one of the first to speaking out against #metoo. now it’s toxicity is on full display. #defendourboys

you can like or not like @michaelavenatti but what he just put out is a sworn affidavit alleging that kavanaugh and mark judge regularly gang raped women including once his client julie swetnick. i believe survivors.

it’s all about #metoo & #webelievesurvivors unless the survivors support @realdonaldtrump or the sexual predator is a democrat. ain’t that right @keithellison @maziehirono @senfeinstein & @billclinton @dnc the party of hypocrisy 

Far Right: 100K tweets scoring 0.75-1.0

i’m loving the hashtag #himtoo. it appears to be a movement built of men who have had their lives and families destroyed by false allegations and a lack of due process. radical feminism has become problematic and needs to be addressed. dr. luke, brett kavanaugh… #himtoo

serious question. are keith ellison, sen. sherrod brown, sen. booker and sen. tom carper signing on? i know they’re democrats but thought it’s only fair to ask given their history’s on this subject.
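
The four score ranges above can be written as a small lookup. This is a minimal sketch; the one thing I am assuming is how to treat a score that lands exactly on a cutoff, which here goes to the category on the right.

```python
from bisect import bisect_right
from collections import Counter

CUTOFFS = [0.15, 0.35, 0.75]
LABELS = ["Far Left", "Mid Left", "Mid Right", "Far Right"]

def category(score):
    """Return the category for a score between 0 and 1.

    A score exactly on a cutoff falls into the higher category (an assumption).
    """
    return LABELS[bisect_right(CUTOFFS, score)]

# Made-up sample scores, for illustration only.
sample_scores = [0.0001, 0.2, 0.5, 0.9999, 0.05]
tally = Counter(category(s) for s in sample_scores)
print(tally)
```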

Coming Soon…

With a good scoring and coloring system, along the lines described above, we can apply those colors to every node and edge of a Twitter map, and see “exactly” where left- and right-leaning discussions are happening, along with some shades in between the extremes. Something like this:

Prototype #MeToo Twitter network map,
With color spectrum to indicate extent of left vs right expression.

Let me know what you think. I am especially interested in hearing from movement organizers who have suggestions for making this method more relevant and useful as a source of actionable information.

MeToo in the news

I just finished collecting one week of #metoo tweets, coincidentally starting just before the explosion of the Kavanaugh SCOTUS controversy. Believe it or not, I thought I had picked a relatively quiet time when nothing in particular was rocking the #metoo conversation. Wrong! A few days and half a million tweets later, here is my preliminary result: a network map showing how people are connecting with each other over #metoo.

The striking feature of the network is, of course, how dramatically it is split into two sides. I analyzed the retweeting separately for each side of the network, and displayed the most-retweeted message from each side in the picture. Behold, the distressingly familiar red-blue divide.

I have just started analyzing and making meaning of the data. I am curious what questions you think are worth asking. Some questions on my mind are:

  • Is the “baseline” #metoo conversation that I was looking for in here somewhere?
  • Who are the bridgers that connect across the big divide? What are they saying?
  • What else is going on here besides red vs blue? In particular the bottom/blue portion appears to have multiple factions.

Also, what does this mean for #metoo? How will the good non-partisan intentions of the movement survive the ripping forces of polarized American politics?

Let me know what you think and I look forward to doing more with this data.

Ill-Defined is Good for You (aka Innovation, Adaptation, and Recovery: Part 2)

Even if you’re not mathematically inclined, I hope you’ll appreciate these reflections on the benefits of ambiguity in a creative organization…

I think one of the reasons I gravitated towards mathematics and computer science years ago is because I love definitions and defining things. A good definition not only creates a clear boundary of what something is and is not, but it also offers insight into what that something is good for. Defining is definitely an art.

Most of the academic papers I have read feature an entire section devoted to definitions. The “Definitions” section is always the most tedious part of the paper to read. But without it, the whole paper would stand on shaky ground. By carefully crafting his definitions, the author makes his paper consistently meaningful to a variety of readers with different experiences and expectations. Provided those readers bother to read the definitions, of course.

One odd lesson I learned writing my first technical paper was that even though the definitions section is traditionally placed near the beginning of a paper, the actual writing of the definitions is best postponed until every other section is complete.

Writing the definitions last might seem like laying the foundation only after topping off the roof of a new house. But there is a big difference between writing a paper and building a house. Whereas the architect specifies ahead of time just how the construction of a house will end, the author of a paper doesn’t quite know where it’s going until he’s done. And only after the last insight has been revealed does the author know enough to define exactly what his paper is about.

Any creative project will evolve through a similar period of mystery. So it’s no surprise that when the creative process happens individually, the creator may for a time be unable to explain what is happening, even if he wanted to.

But what happens when the creative process happens in a team? How can the team members collaborate clearly when the very nature of creativity blurs the process?

In his book “Six Degrees,” Duncan Watts turns this question on its head. He explains clearly and convincingly how chronic low-grade ambiguity in a creative organization can offer powerful benefits. As co-workers collaborate toward a shared but vague goal, they renegotiate roles and responsibilities at every turn. This seeming burden actually instills the organization with the ability to adapt and recover even in the face of extraordinary challenges.

As I reviewed in a previous post, the Toyota-Aisin crisis is a remarkable case study of how a creative and relatively ill-defined organization spontaneously recovered from a catastrophe that easily could have ruined it.

New York City after 9/11 provides a similar example, also described by Watts. From the chaotic everyday process of running New York, there emerged a spontaneous collaborative recovery that returned most of Manhattan to business as usual within days. No one person knew what to do, but the experienced professionals of New York instinctively knew who to talk to, and the recovery emerged from their collective response.

Ironically, New York City did have an official recovery plan, but it failed miserably. For example, the official emergency command control bunker was completely buried by the collapse of the World Trade Center. But no matter. The everyday business of running New York had always been collaborative improvisation, with or without a central plan.

So those of us with an urge to end ambiguity by precise definitions should be careful. In the ongoing life of a creative organization, mystery is not only the indescribable essence of creating new products and services, it is also the critical ingredient that enables collective adaptation.

If you’ve made it this far and are curious to read (a lot!) more about the fascinating relationship of definitions and creativity, I highly recommend “Le Ton Beau de Marot” by Douglas Hofstadter.

In the meantime, I’m going to take my next post or two away from this philosophical tack and steer more pragmatically into the nature of creative organizations.

Innovation, Adaptation, and Recovery: Part 1

Duncan Watts covers a lot of ground in his book Six Degrees, including bits of graph theory, computer science, physics, sociology, and epidemiology just for starters. Today I want to mention one section of his book that is particularly relevant to business and organizational effectiveness.

The Toyota-Aisin Crisis

Although it appears to be one large company, Toyota actually consists of many small companies knit together in a tight but informal collective. These separate companies rely on each other to bring together all that is required to build the cars and trucks of Toyota, and they do so without a central authority telling each one of them what to do.

For those of us raised to think of big companies as monolithic hierarchies, the mere existence of Toyota is an eye-opener. Toyota demonstrates (very successfully!) that there are more ways to organize a business than the command and control pyramid of the traditional org chart.

For others more attuned to business trends, challenging the reign of the corporate hierarchy is old news by now. Pyramids have been flattened and command chains diversified far and wide, as corporate standards of responsiveness rise ever higher.

Toyota is a particularly valuable case study not only because of its non-hierarchical structure, but also because it faced a much more severe challenge to its responsiveness than most companies have ever experienced. One of the companies in the Toyota group lost its manufacturing plant to a fire. The company, Aisin, was responsible for making a particular piece required in all Toyota vehicles. For Aisin to rebuild its manufacturing capacity would take months, but for Toyota to wait that long was completely out of the question. The entire Toyota production would shut down in just a few days without an alternative to Aisin.

Remarkably, Toyota recovered. Within three days, the collective companies of Toyota had arranged alternative means to produce the required part. Even more remarkable was how the recovery happened. With no one in charge and no contingency plan, Toyota employees relied on what they did have — longstanding relationships across a complex network of different companies. As soon as news of the catastrophe hit, each employee took action as best he or she knew, and from those efforts across that network emerged a plan to keep Toyota in business.

How did these people know what to do? Simply put, they just did the same thing they’d always been doing. These employees regularly faced the challenges of negotiating roles and responsibilities while simultaneously solving complex problems. The Aisin fire was a more urgent problem than normal, but one that called on exactly the same skills and resources as developing a new car. This catastrophe was almost business as usual.

Next time I’ll continue this thread and discuss Watts’ analysis of the relationship of innovation, adaptation, and recovery.

Six Degrees: The Science of a Connected Age

Six Degrees by Duncan Watts provides a lively overview of the emerging science of networks within the context of the lives of the scholars who are making it happen.

I was immensely predisposed to like this book, and it did not disappoint me. Like the author, I studied networks and received a PhD at Cornell University in the mid-90s. Six Degrees provided me a perfect introduction to how graph theory and network algorithms (which I have studied in depth) relate to sociology, psychology, biology, and physics. For those without formal mathematical training, Six Degrees will probably be tough going at times, but Watts consistently brings the story back to real people to keep things interesting.

For those interested in pursuing networks and their applications, Watts provides a fantastic bibliography. Book titles are organized by subject and each title receives a “degree of difficulty” rating to help guide the search.

As I mentioned in my last post, Watts takes on some similar issues addressed by Gladwell in The Tipping Point and arrives at slightly different conclusions.

Both authors take great interest in how social phenomena spread from individual behavior to epidemic proportions. By describing the importance of connectors, mavens, and salespeople, Gladwell paints a picture where careful consideration of the social landscape makes all the difference between sales flop and hot fad. Watts incorporates Gladwell’s thinking into a more scientific point of view, and then shows that even if we have a great product and know who the connectors, mavens, and salespeople are, the boundary between flop and fad is invisible and often completely outside our control.

The crux of Watts’ argument is that everyone has a threshold for adopting new behavior. Some people are avid trendsetters and jump at new styles and products impulsively. Others refuse to jump on new trends no matter what. The rest of us fall somewhere in between, each person somewhere on the spectrum from impulsive trendsetter to die-hard traditionalist. We don’t jump first, but we do jump eventually, or at least some of the time. We instinctively wait until a certain percentage of our associates jump, and then we become open to the idea of following. Each of us has a different threshold. (Watts is not making this up. He cites a great deal of sociological and psychological research.)

Given this psychological model (which is disturbing but hard to refute) what happens in a social network filled with many interconnecting relationships? The answer is that if there are too many interconnections, then new trends cannot jump to become fads. Except for the impulsive trendsetters, everyone else knows too many people holding back, and so everyone holds back.

Gladwell’s model of connectors, mavens, and salespeople still applies in a social network with relatively sparse interconnected relationships. But as the level of interconnectedness increases, the boundary between flop and fad becomes increasingly invisible and impossible to predict, until at some extreme level of clique-i-ness, group-think becomes entrenched and change is all but impossible.
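
The threshold idea is easy to try out in a few lines of code. Here is a toy sketch, not Watts’ actual model: it assumes everyone sits on a ring and knows only their nearest neighbors, that everyone except one trendsetter shares the same threshold of 30%, and that a person adopts once at least that share of their neighbors has adopted. The function names and numbers are made up for illustration.

```python
def ring_lattice(n, k):
    """Build a ring of n people, each linked to the k nearest neighbors on each side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}

def run_cascade(neighbors, threshold, seeds):
    """Spread a trend until nobody else is willing to adopt.

    A person adopts once the share of their neighbors who have already
    adopted reaches the threshold. Returns the set of adopters.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for person, friends in neighbors.items():
            if person in adopted:
                continue
            share = len(friends & adopted) / len(friends)
            if share >= threshold:
                adopted.add(person)
                changed = True
    return adopted

# Same 20 people, same 30% threshold, same single trendsetter (person 0).
sparse = run_cascade(ring_lattice(20, 1), threshold=0.3, seeds={0})
dense = run_cascade(ring_lattice(20, 3), threshold=0.3, seeds={0})
print(len(sparse), "adopters when everyone has 2 neighbors")
print(len(dense), "adopters when everyone has 6 neighbors")
```

With two neighbors each, one trendsetter is half of a neighbor's world, so the trend sweeps around the whole ring. With six neighbors each, the same trendsetter is only one sixth of anyone's world, which is below the threshold, so the trend never leaves the first person.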

Watts also gives a fascinating description of how the chaotic unpredictability of networks can save us in a time of crisis. I’ll say more about that next time.

The Tipping Point

The Tipping Point is one of the more entertaining reads I’ve found lately. Author Malcolm Gladwell explains how little things can make a big difference when it comes to fashion trends, disease epidemics, crime waves, and other aspects of societal behavior.

Gladwell points to three kinds of people who are especially influential in the dynamics of a connected world. Connectors keep in touch with many, many more acquaintances than the average person. Mavens maintain an encyclopedic awareness of the ins and outs of products and services in the marketplace. Salespeople have preternatural powers of persuasion. The difference between an isolated baseball cap and an unstoppable fashion trend (for example) hinges critically on the effects of these three types of people, even more so than it depends on the intrinsic merits of one hat vs another.

Two points really stuck with me after breezing through The Tipping Point.

First, Gladwell discusses the 80s crime wave in NYC, which was turned around in the 90s thanks to some seemingly minor housecleaning. By cracking down relentlessly on subway graffiti and fare-cheaters, NYC communicated a subtle but important change in its law enforcement attitude. The clean cars and vigilant token-collecting continually reinforced the attitude that NYC takes care of its responsible citizens and does not tolerate crime. The result was a chain reaction of respectable citizens taking back the subways and neighborhoods, and it was these people who played the biggest role in ending the crime wave.

This story of NYC law enforcement convinced me to enact my own crack-down on office clutter. A clean in-box and polished desktop may not directly bring my job-hunting and research interests into clearer focus, but they do start a powerful chain reaction.

Second, Gladwell addresses some of the questions raised by Leonard (see my previous post). Does e-mail threaten to end the influential reign of connectors, mavens, and salespeople? Gladwell answers a resounding no. As we become increasingly bombarded by e-mail solicitations from friends, friends of friends, and friends of friends of friends (etc), we quickly learn to tune out all but the most important information. And how do we ultimately decide what is important to us? The same way we always have, based on our values and relationships, which we develop in concert with our friends and colleagues — especially the connectors, mavens, and salespeople.

In his excellent book Six Degrees, Duncan Watts points out some limitations to Gladwell’s thinking. I’ll talk about that next time.

The Sharp Edges of Networking

Today I read a great two-part article in Salon.com about social networking software, entitled “You Are Who You Know” by Andrew Leonard.

Leonard raises challenging questions about the world of Orkut, LinkedIn, and other social networking tools. Are people flocking to these tools in the vain hope that computers can somehow overcome ordinary human obstacles to building relationships and communities? And what about privacy? The information we willingly submit to these digital communities is enough to make sociologists and market strategists salivate, and we all stand to benefit, but are we giving up too much in the deal?

Leonard’s pointed questions bounce in my mind off recent memories of three excellent books on various aspects of networks: Bowling Alone by Robert Putnam, The Tipping Point by Malcolm Gladwell, and Six Degrees by Duncan Watts. I’ll reflect more on the connections between these authors in future posts.