What is CANDIDATA24?

CandiData 2024 is a database containing social media handles and content belonging to politicians and their campaigns. In 2024, CandiData is focused on:

  • U.S. federal incumbents

  • U.S. federal candidates

  • A selection of state candidates

CandiData includes social media handles from mainstream and alternative platforms: 

  • Facebook

  • Instagram

  • Threads

  • TikTok

  • X

  • YouTube

  • Gettr

  • Rumble 

  • Telegram 

  • Truth Social

Our data also includes FEC candidate IDs, party identification, incumbency status, district competitiveness (using 2024 Cook Report scores for House candidates and using 2020 presidential vote share by state for Senate candidates), race, and gender for all candidates where available. It also includes birth year, Bioguide IDs, ICPSR IDs, and DW-NOMINATE scores for incumbents running for re-election in 2024. 
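As a rough illustration of how these fields fit together, the sketch below models a single candidate entry in Python. The class name, field names, and helper function are hypothetical and are not the dataset's actual column names or schema. The example only mirrors the description above: shared fields for all candidates, additional fields for incumbents, and a competitiveness measure that depends on the chamber.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CandidateRecord:
    """Hypothetical shape of one candidate entry; names are illustrative only."""
    name: str
    chamber: str                           # assumed values: "House" or "Senate"
    party: Optional[str] = None
    fec_candidate_id: Optional[str] = None
    incumbent: bool = False
    race: Optional[str] = None             # recorded where available
    gender: Optional[str] = None           # recorded where available
    # Maps a platform name (e.g. "X", "Rumble") to the account handle.
    handles: Dict[str, str] = field(default_factory=dict)
    # Fields described as available for incumbents only.
    birth_year: Optional[int] = None
    bioguide_id: Optional[str] = None
    icpsr_id: Optional[str] = None
    dw_nominate: Optional[float] = None


def competitiveness_source(record: CandidateRecord) -> Optional[str]:
    """Return which competitiveness measure applies to a candidate's race."""
    if record.chamber == "House":
        return "2024 Cook Report score"
    if record.chamber == "Senate":
        return "2020 presidential vote share by state"
    return None  # no measure is described for other offices


# Placeholder values, not real data.
example = CandidateRecord(
    name="Placeholder Candidate",
    chamber="Senate",
    handles={"X": "@placeholder_handle"},
)
print(competitiveness_source(example))
```

Keeping the incumbent-only fields optional reflects that they are empty for challengers, and storing handles as a platform-to-handle mapping allows a candidate to appear on any subset of the platforms listed above.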

The data is available in two formats:

Public dashboard: Aggregated content from our dataset, allowing users to query for specific terms across platforms 

Dataset archive: Data underlying the public dashboard, accessible using SOMAR’s data enclave structure

Why is the CandiData database important?

Over the past two decades, social media has become an integral part of the U.S. information ecosystem and, by extension, a pivotal space for discussion during elections.

For politicians and political campaigns, platforms are essential for engaging with different voting blocs, promoting events, and raising funds. To do so, politicians take a multi-faceted approach by producing advertisements and organic content across social media platforms. 

Owing to the complexity of the information ecosystem (including, but not limited to, social media), journalists have a challenging task ahead of them in the 2024 U.S. elections: how can reporters track and report on politicians when their campaigns are posting and communicating across dozens of platforms?

Scholars of American elections, campaigns, and political communication have systematically studied social media posts by politicians since politicians started using the platforms for their lawmaking and campaigning efforts. However, the necessary first steps of this work (curating lists of politicians and social media handles) have been undertaken separately by each group of researchers and for each social media platform, which is an inefficient and time-consuming use of resources. How can researchers spend their time and resources analyzing politician behavior online, rather than curating lists of candidates, handles, and posts themselves?

Despite the importance of communications happening on social media, platforms have made it more challenging for journalists and civil society to access such data. Platforms are constantly changing their data access policies: raising the cost of API access (e.g., X/Twitter), adopting policies that shut out journalists (e.g., Meta, TikTok), or fully closing APIs and archives that once made content easier to track (e.g., Pushshift’s Reddit archive).

The cumulative consequences of these restrictions cannot be overstated: during an election year, with democratic institutions at stake, it is imperative that journalists, civil society, and researchers have the resources and infrastructure necessary to report on politicians’ discourse.

How did we collect this data?

The Center for Tech and Civic Life, as a CandiData partner, collected candidate and handle data across Facebook, Instagram, Twitter/X, and YouTube. A team of CandiData research assistants collected additional handle data for TikTok, Gettr, Rumble, Telegram, and Truth Social, pulling links from politicians’ office and campaign websites and/or finding them through search.

Our data contains handles for office accounts (for incumbents) and campaign accounts (for all candidates). 

A team of staff members, including partners at The Carter Center, validated each handle surfaced by a research assistant. 

Our data also includes FEC candidate IDs, party identification, incumbency status, district competitiveness (2024 Cook Report scores for House candidates and 2020 presidential vote share by state for Senate candidates), race, and gender for all candidates where available. It also includes birth year, Bioguide IDs, ICPSR IDs, and DW-NOMINATE scores for incumbents elected in 2022 or in a special election since.