Research

My primary research interests are to:

  • develop and use statistical machine learning methods to identify influences on consumer behavior;
  • understand the impact of deployed machine learning methods in real-world markets; and
  • leverage these and other insights to improve the machine learning methods used in industry, in order to advance companies' objectives, individual well-being, and societal welfare.

My main application area of interest is personalized recommendation systems for online platforms. More broadly, I specialize in discrete data; I have worked extensively with unstructured text and logged user actions.

In these applications, it's especially important for models to be interpretable and for inference algorithms to be scalable. To these ends, I focus on Bayesian latent variable models, which are well-suited to exploratory data analysis because their variables can map to intuitive concepts such as the "topic" of a document or the "influence" of one person on another. These variables can be learned with variational inference techniques, which scale well to large numbers of observations.

If you are interested in learning more about any of these subjects, please see my recommended readings for a list of resources.

Projects

Algorithmic Confounding in Recommendation Systems

Recommendation systems occupy an expanding role in everyday decision making, from choice of movies and household goods to consequential medical and legal decisions. The data used to train and test these systems is algorithmically confounded in that it is the result of a feedback loop between human choices and an existing algorithmic recommendation system. This active project involves exploring the impact of algorithmic confounding in this context.
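As a rough illustration of this feedback loop, here is a toy simulation. It is a minimal sketch under invented assumptions, not the model studied in this project: every item is equally appealing, a hypothetical recommender shows the items with the most past clicks, and the function name `simulate` and all parameter values are made up for the example. The logged clicks under the feedback loop are compared with clicks logged under uniformly random exposure.

```python
import numpy as np

def simulate(n_rounds, n_items=50, n_shown=5, click_prob=0.5, feedback=True, seed=0):
    """Return per-item click counts after n_rounds of exposure.

    Hypothetical setup: all items share the same click probability, so any
    concentration in the logged clicks comes from what was shown, not from
    differences in user preference.
    """
    rng = np.random.default_rng(seed)
    clicks = np.zeros(n_items, dtype=int)
    for _ in range(n_rounds):
        if feedback:
            # Rank items by past clicks; noise in [0, 1) only breaks ties.
            scores = clicks + rng.random(n_items)
            shown = np.argsort(-scores)[:n_shown]
        else:
            # Baseline: expose a uniformly random set of items.
            shown = rng.choice(n_items, size=n_shown, replace=False)
        # Each shown item is clicked independently with the same probability.
        clicked = shown[rng.random(n_shown) < click_prob]
        clicks[clicked] += 1
    return clicks

looped = simulate(500, feedback=True)
uniform = simulate(500, feedback=False)
print("items with any clicks, feedback loop:  ", np.count_nonzero(looped))
print("items with any clicks, random exposure:", np.count_nonzero(uniform))
```

In this sketch the feedback loop locks in whichever few items happened to be clicked first, so the resulting log says more about the recommender's past choices than about what users prefer.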

Nonparametric Deconvolution Models

Decomposition models break observations into constituent parts by representing the observations as a product of group representations and factor features. With collaborators, I am working on deconvolution models, which similarly decompose, or deconvolve, observations into constituent parts, but also capture group-specific (or local) fluctuations in factor features.
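To make the "product of group representations and factor features" idea concrete, here is a minimal sketch of a plain decomposition model. It assumes nonnegative data and uses standard multiplicative updates for nonnegative matrix factorization, which stand in for the general idea only; this is not the nonparametric deconvolution model, and it does not capture group-specific fluctuations. The names `factorize`, `theta`, and `beta` are chosen for the example.

```python
import numpy as np

def factorize(X, n_factors, n_iters=200, seed=0):
    """Approximate X (groups x features) as theta @ beta, with both nonnegative.

    theta holds one representation per group (groups x factors);
    beta holds the global factor features (factors x features).
    Uses multiplicative updates that minimize squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n_groups, n_features = X.shape
    theta = rng.random((n_groups, n_factors)) + 0.1
    beta = rng.random((n_factors, n_features)) + 0.1
    eps = 1e-10  # guards against division by zero
    for _ in range(n_iters):
        theta *= (X @ beta.T) / (theta @ beta @ beta.T + eps)
        beta *= (theta.T @ X) / (theta.T @ theta @ beta + eps)
    return theta, beta

# Synthetic data: each group is a mixture of three shared factors.
rng = np.random.default_rng(1)
true_theta = rng.dirichlet(np.ones(3), size=20)   # group representations
true_beta = rng.gamma(2.0, 1.0, size=(3, 8))      # factor features
X = true_theta @ true_beta

theta, beta = factorize(X, n_factors=3)
rel_error = np.linalg.norm(X - theta @ beta) / np.linalg.norm(X)
print("relative reconstruction error:", rel_error)
```

Here every group shares the same `beta`. A deconvolution model, as described above, would additionally let the factor features vary from group to group.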

Social Poisson Factorization

A downside of most algorithmic recommendations is that they ignore social context: for some people, part of the appeal of reading, watching, or otherwise consuming media lies in creating shared experiences with friends. This work incorporates the ratings of friends (and not just friends' general preferences) when providing personalized recommendations.

Recommending Television for Groups

With collaborators, I performed a large-scale study of television viewing habits, focusing on how individuals adapt their preferences when consuming content with others. We constructed a simple model for estimating how individual preferences are combined in group settings.

Exploring Text Data

Topic modeling is a machine learning method that learns underlying themes in a collection of documents, which can be used to summarize and organize the documents. I have worked on several projects to allow domain experts to explore their corpora through the lens of topic modeling.