The union project uses the Twitter API v2 research endpoint, the COVID projects use the COVID-19 Twitter dataset, and the elections and Ferguson projects use Archive.org's Twitter datasets for the relevant months.
Pick a project from the top navigation to see its pre-generated network visualizations.
To create the networks, we search the 110 million tweets contained in the COVID-19 Twitter dataset for the specified search terms. For every search term found in a tweet, we record the other words that co-occur in that tweet (after removing relatively uninformative stopwords such as 'and', 'the', and 'or'), creating a link between the search term and each of those words. If the link already exists, we increment its count. We also keep track of the top tweets for every link and every word in the network, ranked by popularity (number of retweets plus number of likes).
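The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the tweet fields (`text`, `retweets`, `likes`), the tiny stopword list, the tokenizer, and the `build_links` function are all hypothetical, search terms are assumed to be single words, and a word is counted at most once per tweet. Tracking top tweets per individual word is omitted for brevity.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, abbreviated stopword list; the project's real list is not given.
STOPWORDS = {"and", "the", "or", "a", "of", "to", "in", "is"}


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def build_links(tweets, search_terms, top_n=3):
    """Count term-word co-occurrences and keep the most popular tweets per link.

    `tweets` is an iterable of dicts with the assumed keys
    'text', 'retweets', and 'likes'.
    """
    link_counts = Counter()         # (term, word) -> number of tweets containing both
    top_tweets = defaultdict(list)  # (term, word) -> [(popularity, text), ...]
    terms = set(search_terms)

    for tweet in tweets:
        words = set(tokenize(tweet["text"])) - STOPWORDS
        popularity = tweet["retweets"] + tweet["likes"]
        for term in terms & words:          # search terms present in this tweet
            for word in words - {term}:     # every other remaining word
                link = (term, word)
                link_counts[link] += 1      # create the link or increment its count
                ranked = sorted(top_tweets[link] + [(popularity, tweet["text"])],
                                reverse=True)
                top_tweets[link] = ranked[:top_n]
    return link_counts, top_tweets


sample = [
    {"text": "The vaccine and the mask", "retweets": 5, "likes": 2},
    {"text": "Vaccine mask mandate", "retweets": 1, "likes": 0},
]
counts, tops = build_links(sample, ["vaccine"])
```

Here `counts` maps each (search term, word) pair to its link weight, and `tops` holds the highest-popularity tweets that would be shown when a link is clicked.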
Once every tweet has been searched, we take the top co-occurring words for each search term, chosen so that the network contains roughly 1,000 words in total, and visualize them in a node-link structure. To keep the graph readable, we also remove some of the less frequent links.
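One way the selection and pruning could work is sketched below. The text does not say how the roughly 1,000-word budget is divided or where the cutoff for infrequent links lies, so the equal per-term share, the `min_count` threshold, and the `select_network` function itself are assumptions made for illustration.

```python
from collections import Counter


def select_network(link_counts, search_terms, max_words=1000, min_count=2):
    """Keep the top co-occurring words per search term, then drop rare links.

    `link_counts` maps (term, word) pairs to co-occurrence counts.
    """
    # Assumption: the word budget is split evenly across search terms.
    per_term = max(1, max_words // max(1, len(search_terms)))

    kept = {}
    for term in search_terms:
        term_links = Counter(
            {link: n for link, n in link_counts.items() if link[0] == term}
        )
        for link, n in term_links.most_common(per_term):
            if n >= min_count:  # assumed readability cutoff for infrequent links
                kept[link] = n

    # Nodes are the search terms plus every word that still has a link.
    nodes = set(search_terms) | {word for _, word in kept}
    return nodes, kept


example_counts = {
    ("vaccine", "mask"): 5,
    ("vaccine", "mandate"): 3,
    ("vaccine", "rare"): 1,
    ("mask", "n95"): 4,
}
nodes, links = select_network(example_counts, ["vaccine", "mask"],
                              max_words=4, min_count=2)
```

The returned `nodes` and `links` are the kind of input a node-link renderer would need, with each link's count available as its weight.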
Click on a node or a link to see the top tweets that use those words or that pair of words.