Public Discourse in the U.S. 2020 Election: Resources & Data
The definitions and data below inform the reports from the Public Discourse in the U.S. 2020 Election team. Story data was collected using Media Cloud, CrowdTangle, and Brandwatch.
Download the Data
January and February Story URLs (.csv)
Download .csv files to explore the January and February data. This .zip file contains 16 .csv files covering January and February, including: top inlinks across all media, left-leaning outlets, and right-leaning outlets; stories shared by Biden, Sanders, and Trump supporters on Twitter; and the top Facebook stories by unique links and by engagement.
March, April, and May Story URLs (.csv)
Download .csv files to explore the March, April, and May data. This .zip file contains 24 .csv files covering March, April, and May, including: top inlinks across all media, left-leaning outlets, and right-leaning outlets; stories shared by Biden, Sanders, and Trump supporters on Twitter; and the top Facebook stories by unique links and by engagement.
Media Source Ideology Scores (.csv)
Download a .csv file of over 13,000 online media sources and their ideology scores. See the Media Outlet Ideology Scores & Methodology section below for more information.
Key Definitions
Inlinks: “Inlinks” refers to the incoming cross-media hyperlinks to stories and media sources.
Cohorts: “Cohorts” are sets of Twitter users that align with specific candidates. To delineate these user cohorts, the team relies on users’ retweeting of candidates, which is a strong signal of political affinity. Drawing on data collected during October 2019, the team generated four cohorts of Twitter users, one each for Trump, Warren, Biden, and Sanders, by randomly selecting 1,000 users from among those who had retweeted that candidate at least twice during the month. By monitoring the media sources these cohorts share, the team sees how attention to media sources differs between Trump supporters on Twitter and supporters of the major Democratic candidates.
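The cohort selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the team's code: the function name `build_cohort`, the `(user_id, candidate)` input format, and the fixed random seed are all assumptions made for the example. The source also does not say how users who retweeted more than one candidate were handled, so this sketch simply treats each candidate independently.

```python
import random
from collections import Counter


def build_cohort(retweets, candidate, min_retweets=2, cohort_size=1000, seed=0):
    """Randomly select up to `cohort_size` users who retweeted `candidate`
    at least `min_retweets` times.

    `retweets` is an iterable of (user_id, candidate) pairs, one pair per
    retweet observed during the collection month (hypothetical format).
    """
    # Count how many times each user retweeted this candidate.
    counts = Counter(user for user, cand in retweets if cand == candidate)
    # Keep only users at or above the retweet threshold; sort so that the
    # sample is reproducible for a given seed.
    eligible = sorted(user for user, n in counts.items() if n >= min_retweets)
    rng = random.Random(seed)
    # If fewer users qualify than requested, return all of them.
    k = min(cohort_size, len(eligible))
    return rng.sample(eligible, k)


# Toy data: u1 and u4 retweeted Trump twice, u2 only once.
retweets = [
    ("u1", "Trump"), ("u1", "Trump"),
    ("u2", "Trump"),
    ("u3", "Biden"), ("u3", "Biden"), ("u3", "Biden"),
    ("u4", "Trump"), ("u4", "Trump"),
]
trump_cohort = build_cohort(retweets, "Trump")
print(sorted(trump_cohort))  # ['u1', 'u4']
```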
Media Outlet Ideology Scores & Methodology: Each media source is assigned to one of five quintiles according to its estimated ideology score: left, center-left, center, center-right, and right. In describing the distribution of attention to media across the political spectrum, media sources are divided into 20 bins. To estimate these media ideology scores, the team starts by calculating the relative position on the political spectrum of approximately 15,000 Twitter users who were active between January 2019 and June 2019. This process is based on emIRT. This estimation technique produces a continuum of ideology scores but no internal point of reference to help ascertain how these scores align with common understandings of political valence, e.g. where the center point is or how one might distinguish users on the right and left from those in the center.
To establish that reference, the team divides users into two groups, right of center and left of center. The center point is derived from the intersection of users who self-identify in their profile as being either liberal or Democrat on one side and conservative or Republican on the other. The final step is to tabulate the sharing of media sources by users on the right and compare that to users on the left to generate a continuous metric on a -1.0 to 1.0 scale. The center point, 0.0, denotes an equal share of users on both sides sharing a media source in a month. A score on the far left, -1.0, means that a media source is shared only by users on the left, and a score of 1.0 denotes shares only by users on the right. These proportions are used to create quintiles, which we use to describe the ideological position of media sources and to color media sources on maps.
For media sources in the center, stories are shared at similar rates by users on the left and right. Center-right and center-left sources are shared at right-to-left ratios of 2:1 and 1:2, respectively, and right and left sources are shared at ratios of at least 4:1 and 1:4.
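The text gives the endpoints and center of the score but not the exact formula. One simple formula consistent with that description is the normalized difference (right - left) / (right + left), where each input is the share of users on that side who shared the source. The sketch below uses that formula as an assumption. The equal-width quintile cut points at ±0.2 and ±0.6 are likewise assumed, although they fit the stated ratios: under this formula a 4:1 ratio maps to 0.6 and a 2:1 ratio to about 0.33. The function names are hypothetical.

```python
def ideology_score(right_share, left_share):
    """Assumed score: normalized difference of sharing rates, in [-1.0, 1.0].

    right_share / left_share: fraction of right-of-center / left-of-center
    users who shared the media source in a given month.
    """
    total = right_share + left_share
    if total == 0:
        raise ValueError("source was not shared by users on either side")
    return (right_share - left_share) / total


def quintile(score):
    """Map a score to one of five labels using assumed equal-width cut points."""
    if score <= -0.6:
        return "left"
    if score < -0.2:
        return "center-left"
    if score <= 0.2:
        return "center"
    if score < 0.6:
        return "center-right"
    return "right"


# Right-to-left sharing ratios; only the ratio matters, not the scale.
for right, left in [(1, 1), (2, 1), (1, 2), (5, 1), (1, 0)]:
    score = ideology_score(right, left)
    print(f"{right}:{left} -> {score:+.2f} ({quintile(score)})")
```

Because only the ratio of the two shares enters the formula, a source shared by 2% of right users and 1% of left users scores the same as one shared by 20% and 10%.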
Facebook unique links and total interactions: The most common way in which researchers and digital media watchers report on Facebook activity is total interactions. This sums up all the engagements by users associated with the posts across different accounts that include a given resource, which might be the URL to a particular story or text that matches a search query.
Another way to assess the popularity of a story, video, or image on Facebook is to monitor how many different Facebook pages and groups post the content, which we refer to as unique links. This metric better reflects the extent to which a story spreads across the Facebook platform outside of the places that core supporters frequent. The metric is based on unique postings of story links, such that each page or group counts only once even if it posts a story many times, and it deliberately discounts the intensity of activity on the pages of media outlets.
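The difference between the two Facebook metrics can be illustrated with a short Python sketch. The record fields used here (`url`, `account`, `interactions`) and the function name `facebook_metrics` are hypothetical and chosen for the example; they are not the schema of any export used by the team.

```python
from collections import defaultdict


def facebook_metrics(posts):
    """Compute total interactions and unique links for each story URL.

    `posts` is an iterable of dicts with hypothetical keys:
      'url'          - the story link contained in the post
      'account'      - the page or group that made the post
      'interactions' - engagements on that post
    """
    totals = defaultdict(int)
    accounts = defaultdict(set)
    for post in posts:
        # Total interactions: sum engagements over every post of the link.
        totals[post["url"]] += post["interactions"]
        # Unique links: each page or group counts once, however often it posts.
        accounts[post["url"]].add(post["account"])
    return {
        url: {
            "total_interactions": totals[url],
            "unique_links": len(accounts[url]),
        }
        for url in totals
    }


# Toy data: story "a" is posted twice by one outlet page with heavy
# engagement; story "b" is posted once each by three different accounts.
posts = [
    {"url": "a", "account": "OutletPage", "interactions": 5000},
    {"url": "a", "account": "OutletPage", "interactions": 3000},
    {"url": "b", "account": "GroupA", "interactions": 100},
    {"url": "b", "account": "GroupB", "interactions": 150},
    {"url": "b", "account": "PageC", "interactions": 50},
]
metrics = facebook_metrics(posts)
print(metrics["a"])  # {'total_interactions': 8000, 'unique_links': 1}
print(metrics["b"])  # {'total_interactions': 300, 'unique_links': 3}
```

In this toy data, story "a" leads on total interactions while story "b" leads on unique links, which is the kind of divergence the two metrics are meant to expose.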