Recommended Reading

A list of recommended papers and books on a variety of academic topics.

Data preprocessing tutorial

Given at an internal group meeting; contains many links. Focuses on topic modeling, with some coverage of recommendation and social networks.

Common First Names

Generated from Social Security data. Useful for curating a topic model vocabulary.
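
As a minimal sketch of how such a list might be used for vocabulary curation, the snippet below drops first names from a candidate vocabulary. The function name and the sample inputs are hypothetical, and it assumes the names have already been loaded into a Python list, since the file format of the list is not described here.

```python
def remove_first_names(vocabulary, first_names):
    """Return the vocabulary terms that are not first names (case-insensitive)."""
    # Normalize the names once so each membership check is a set lookup.
    blocked = {name.strip().lower() for name in first_names if name.strip()}
    return [term for term in vocabulary if term.lower() not in blocked]


# Hypothetical inputs, for illustration only.
first_names = ["Mary", "James", "Linda"]
vocabulary = ["inference", "mary", "gamma", "James", "topic"]

print(remove_first_names(vocabulary, first_names))
# prints ['inference', 'gamma', 'topic']
```

Some first names are also ordinary words (for example "grace" or "mark"), so it may be worth reviewing the removed terms before discarding them.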


Java library for recommendation systems. Has links to relevant datasets.

Research That Matters

A single page to inspire significant research in applied machine learning.

Omit Needless Words

A single page to encourage brevity in writing.

Quant Marketing Call for Papers

A list of calls for papers (CFPs) maintained by Ron Berman.

Quant Women in Marketing & Women in Machine Learning

Directories to help you find invited speakers, etc.

BBVI for gammas

This document walks through black box variational inference on a very simple model that includes latent gamma-distributed variables. It includes tricks shared by Rajesh Ranganath and code so that readers can replicate the results.

Public Datasets

A wide range of topics, including biology, finance, NLP, images, sports, and more.

Amazon Product Data

Product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.

Log Your Shell History

If you work on the command line, do yourself a favor and log your shell history.

The Pomodoro Technique

This time management method helps me maintain a consistent level of productivity.

Paul Tol's colorblind-friendly color schemes

Also available as an R package for use with ggplot2.