Archives
How To Mine Your GMail with Google Takeout and MongoDB
Posted on February 14, 2014

Google has really been on the up-and-up lately with a service called Google Takeout that allows you to export your data from its cloud. For the thoughtful cloud user who is becoming increasingly concerned about privacy, accidental data loss, or data ownership, this is a product that’s sure to please. Likewise, for the data mining enthusiast, quantified-self […]
Mining Social Web APIs with IPython Notebook [Data Day Texas Workshop Slides]
Posted on January 12, 2014

Thanks to everyone who attended the Mining Social Web APIs with IPython Notebook workshop at Data Day Texas. I’m really glad that I made the trip down to Austin and could share some of my work with you. The data truly is bigger in Texas, Austin was a fantastic city to visit, and everyone I had […]
How To Harvest Millions of Twitter Profiles Without Violating the ToS (Computing Twitter Influence, Part 3)
Posted on October 22, 2013 1 Comment

In the last post in this continuing series on computing Twitter influence, we developed a wrapper function called make_twitter_request that handles the various sorts of HTTP error codes and network failures that you are likely to experience as you aspire to acquire non-trivial amounts of data from Twitter’s API. Although you are somewhat unlikely to […]
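
To give a rough feel for what a wrapper of this kind does, here is a minimal sketch. It is not the make_twitter_request function from the series: the name call_with_retries, its parameters, and the particular status codes it retries on are all assumptions chosen for illustration, and it uses only the Python standard library rather than any Twitter client.

```python
import time
from urllib.error import HTTPError, URLError


def call_with_retries(api_call, *args, max_retries=5, base_wait=1.0,
                      sleep=time.sleep, **kwargs):
    """Call api_call, retrying on rate limits, server errors, and network failures.

    Hypothetical helper for illustration only; the status codes handled
    below are assumptions, not a statement of how Twitter's API behaves.
    """
    wait = base_wait
    for attempt in range(max_retries + 1):
        try:
            return api_call(*args, **kwargs)
        except HTTPError as e:
            # Assumed "give up quietly" cases: not authorized / not found.
            if e.code in (401, 404):
                return None
            # Only rate limiting and server-side errors are treated as retryable.
            if e.code not in (429, 500, 502, 503, 504):
                raise
        except URLError:
            # Network-level failure (DNS, refused connection, ...): retry.
            pass
        if attempt == max_retries:
            break
        # Exponential backoff: wait 1s, 2s, 4s, ... between attempts.
        sleep(wait)
        wait *= 2
    raise RuntimeError("too many consecutive failures")
```

Any function that performs the actual request can be passed in as api_call. The sleep parameter exists only so that the waiting can be swapped out when experimenting, which keeps the sketch quick to try without real delays.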
Why Is Twitter All the Rage?
Posted on October 9, 2013 4 Comments

Next week, I’ll be presenting a short webcast entitled Why Twitter Is All the Rage: A Data Miner’s Perspective that is loosely adapted from material that appears early in Mining the Social Web (2nd Ed). Given that the webcast is now less than a week away, I wanted to share the content that inspired the topic. This […]