Twitter

An exploratory look at 257,093 JeSuisAhmed tweets

#JeSuisAhmed
I had some time last night to do some exploratory analysis on part of the #JeSuisAhmed collection. The analysis runs from the first #JeSuisAhmed tweet I was able to harvest to some time on January 14, 2015, when I copied over the JSON to experiment with a few of the twarc utilities. First tweet in data set: #JeSuisAhmed Reveals the Hero of the Paris Shooting Everyone Needs to Know by @sophie_kleeman http://t.

JeSuisCharlie images

Using the #JeSuisCharlie data set from January 11, 2015 (Warning! Will turn your browser into a potato for a few seconds), these are the image urls that have more than 1000 occurrences in the data set.

How to create (requires unshrtn):

% twarc.py --query "#JeSuisCharlie"
% ~/git/twarc/utils/deduplicate.py JeSuisCharlie-tweets.json > JeSuisCharlie-tweets-deduped.json
% cat JeSuisCharlie-tweets-deduped.json | utils/unshorten.py > JeSuisCharlie-tweets-deduped-ushortened.json
% ~/git/twarc/utils/image_urls.py JeSuisCharlie-tweets-deduped-ushortened.json >| JeSuisCharlie-20150115-image-urls.txt
% cat JeSuisCharlie-20150115-image-urls.txt | sort | uniq -c | sort -rn > JeSuisCharlie-20150115-image-urls-ranked.
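The last step of that pipeline (`sort | uniq -c | sort -rn`) plus the "more than 1000 occurrences" cut-off can be sketched in Python. This is only an illustration, not part of twarc: the function name `rank_urls` and its `more_than` parameter are made up here, and the input is assumed to be one URL per line, as in the image-urls text file.

```python
from collections import Counter


def rank_urls(lines, more_than=0):
    """Count each URL and return (count, url) pairs, most frequent first.

    `lines` is any iterable of strings with one URL per line (assumed layout).
    Only URLs seen strictly more than `more_than` times are kept.
    """
    # Strip trailing newlines and skip blank lines before counting.
    counts = Counter(line.strip() for line in lines if line.strip())
    # most_common() orders by descending count, like `sort -rn` on `uniq -c` output.
    return [(n, url) for url, n in counts.most_common() if n > more_than]
```

Called on an open file of image urls with `more_than=1000`, this should give roughly the list described above, though ties may come out in a different order than `sort -rn` produces.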

Preliminary stats of JeSuisCharlie, JeSuisAhmed, JeSuisJuif, CharlieHebdo

#JeSuisAhmed

$ wc -l *json
148479 %23JeSuisAhmed-20150109103430.json
94874 %23JeSuisAhmed-20150109141746.json
5885 %23JeSuisAhmed-20150112092647.json
249238 total
$ du -h
2.7G .

#JeSuisCharlie

$ wc -l *json
3894191 %23JeSuisCharlie-20150109094220.json
1758849 %23JeSuisCharlie-20150109141730.json
226784 %23JeSuisCharlie-20150112092710.json
15 %23JeSuisCharlie-20150112092734.json
5879839 total
$ du -h
32G .

#JeSuisJuif

$ wc -l *json
23694 %23JeSuisJuif-20150109172957.json
50603 %23JeSuisJuif-20150109173104.json
5941 %23JeSuisJuif-20150110003450.json
42237 %23JeSuisJuif-20150112094500.json
5064 %23JeSuisJuif-20150112094648.json
127539 total
$ du -h
671M .

#CharlieHebdo

$ wc -l *json
4444585 %23CharlieHebdo-20150109172713.
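For anyone without `wc` to hand, the same per-file tally can be sketched in Python. This assumes each file holds one tweet per line (which is what counting lines as tweets implies); the names `count_records` and `tally` are hypothetical. Unlike `wc -l`, the sketch skips blank lines, so the numbers can differ slightly if a file contains any.

```python
def count_records(lines):
    """Count non-blank lines in an iterable of strings (one tweet per line assumed)."""
    return sum(1 for line in lines if line.strip())


def tally(paths):
    """Return ({path: count}, grand_total) for the given files."""
    counts = {}
    for path in paths:
        # Stream each file line by line so multi-gigabyte captures are not loaded whole.
        with open(path, encoding="utf-8") as f:
            counts[path] = count_records(f)
    return counts, sum(counts.values())
```

Passing it the sorted result of `glob.glob("*json")` mirrors the `wc -l *json` calls above.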

Preliminary look at 3,893,553 JeSuisCharlie tweets

Background
Last Friday (January 9, 2015) I started capturing #JeSuisAhmed, #JeSuisCharlie, #JeSuisJuif, and #CharlieHebdo with Ed Summers’ twarc. I have about 12 million tweets at the time of writing this, and plan on writing up something a little bit more in-depth in the coming weeks. But for now, here is some preliminary analysis of #JeSuisCharlie. If you haven’t seen these two posts (“A Ferguson Twitter Archive”, “On Forgetting and hydration”) by Ed Summers, please do check them out.

anon development visualization

panamapapers images April 4-29, 2016

Dataset is available here. Looking at the #panamapapers capture I’ve been doing, we have 1,424,682 embedded image urls from 3,569,960 tweets. I’m downloading the 1,424,682 images now, and hope to do something similar to what I did with the #elxn42 images. While we’re waiting for the images to download, here are the 10 most tweeted embedded image urls: Tweets Image 1. 10243 2.
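A rough sketch of how a "10 most tweeted embedded image urls" list could be pulled straight from the tweet JSON. It assumes line-delimited JSON in the Twitter v1.1 layout, where embedded images sit under `entities.media[].media_url`. The function names are invented for illustration, and this is not necessarily how the numbers in the post were produced.

```python
import json
from collections import Counter


def image_urls(tweet):
    """Return the embedded media URLs of one tweet dict (v1.1-style JSON assumed)."""
    media = tweet.get("entities", {}).get("media", [])
    return [m["media_url"] for m in media if "media_url" in m]


def top_image_urls(lines, n=10):
    """Return the n most frequent image URLs as (url, count) pairs.

    `lines` is an iterable of JSON strings, one tweet per line (assumed layout).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the capture file
        counts.update(image_urls(json.loads(line)))
    return counts.most_common(n)
```

Tweets with no `media` entry contribute nothing to the counts, which is why the number of image urls can be well below the number of tweets.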

York University Libraries Open Access Week 2012 - blogvsbook

Yesterday, York University Libraries held a debate in the Scott Library entitled "Be it resolved the blog replace the book?" The debate turned out pretty awesome, and somehow the team arguing for the book won!? (Some might say it was because of @adr’s compelling closing statements.) Along with livestreaming the debate on ustream, I pulled together a little node.js application to display a "twitterfall" of the hashtag for the event (a special thanks to Ed Summers, and his very permissive licensing).