Analysis Category
How To Mine Your GMail with Google Takeout and MongoDB
Posted on February 14, 2014

Google has really been on the up-and-up lately with a service called Google Takeout that allows you to export your data from its cloud. For the thoughtful cloud user who is becoming increasingly concerned about privacy, accidental data loss, or data ownership, this is a product that’s sure to please. Likewise, for the data mining enthusiast, quantified-self […]
Understanding the Reaction to Amazon Prime Air (Or: Tapping Twitter’s Firehose for Fun and Profit with pandas)
Posted on December 19, 2013 2 Comments

On Cyber Monday eve, Jeff Bezos appeared in a 60 Minutes segment and revealed to the world that he’s been working on an experimental effort called Amazon Prime Air. The general idea behind Amazon Prime Air is that Amazon may one day deliver relatively lightweight items directly to your doorstep in less than 30 minutes […]
What Do Tim O’Reilly, Lady Gaga, and Marissa Mayer All Have In Common?
Posted on November 22, 2013 4 Comments

This post examines the followers of some popular Twitter users as the final installment of a multi-part series about exploring Twitter influence by asking the (Freakonomics-inspired) question, What do Tim O’Reilly, Lady Gaga, and Marissa Mayer all have in common? Although it may initially seem like an obnoxious question to ask, some of the answers may intrigue you […]
Mining the Social Web Like a Pro: Four Steps to Success [Slides]
Posted on September 20, 2013 2 Comments

[Update – 8 October 2013: The data journalism team at La Nación expanded upon the analysis presented in the slides and put together a really nice article that tells a story about the data. Definitely check it out, and if you don’t read Spanish, try translating with Chrome or paste the URL into Google Translate.] I […]
Arriving at a Base Influence Metric (Computing Twitter Influence, Part 1)
Posted on September 19, 2013 2 Comments

This post introduces a series that explores the problem of approximating a Twitter account’s influence. With the ubiquity of social media and its effects on everything from how we shop to how we vote at the polls, it’s critical that we be able to employ reasonably accurate and well-understood measurements for approximating influence from social media […]
Surprising Stats From Mining One Million Tweets About #Syria
Posted on September 9, 2013 12 Comments

I’ve been filtering Twitter’s firehose for tweets about “#Syria” for about the past week in order to accumulate a sizable volume of data about an important current event. As of Friday, I noticed that the tally had surpassed one million tweets, so it seemed to be a good time to apply some techniques from Mining […]