My PhD focuses on developing and deploying the next wave of statistical methods to exploit the increasing availability of fine-grained data. I have done so by working on applied problems across a wide range of contexts. These projects have led to academic publications as well as to societal and business impact.

Here are some things I’ve worked on:

Leading data science projects

I led multiple data analysis projects from inception to successful completion in collaboration with a major English police force. Working closely with a team of five, I was responsible every step of the way: I developed data quality checks to ensure the reliable integration of high-dimensional datasets from partially outdated databases. I then developed parametric and machine learning models to produce granular insights and summary outputs for various audiences. The work entailed negotiating and navigating conflicting objectives with different groups within the force as well as with external stakeholders.
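As a flavour of what such checks involve, here is a minimal sketch in plain Python (the `quality_report` helper, the field names, and the data are hypothetical illustrations, not the actual pipeline):

```python
from collections import Counter

def quality_report(rows, key):
    """Flag duplicate keys and missing values before merging datasets."""
    key_counts = Counter(row[key] for row in rows)
    duplicates = {k: c for k, c in key_counts.items() if c > 1}
    missing = Counter(
        field for row in rows for field, value in row.items() if value is None
    )
    return {
        "n_rows": len(rows),
        "duplicate_keys": duplicates,
        "missing_by_field": dict(missing),
    }

records = [
    {"case_id": 1, "date": "2020-01-01"},
    {"case_id": 2, "date": "2020-01-05"},
    {"case_id": 2, "date": "2020-01-05"},  # duplicate key
    {"case_id": 3, "date": None},          # missing value
]
report = quality_report(records, key="case_id")
```

Reports like this make integration problems visible before they silently corrupt a merge.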

Applying insights from decision theory

During my PhD I worked on problems from a range of applications and fields, from public policy to consumer transaction data to biotechnological security issues. In all of these projects, it was crucial to bring robust theory from behavioural science to the initial project design stage. For example, if we expect heterogeneity in behaviour, we must identify the right granularity of analysis and build models that can accommodate this key feature. My dual training uniquely qualifies me to apply emerging theory to practical problems.
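One standard way to accommodate heterogeneity at the right granularity is partial pooling: per-group estimates are shrunk toward the overall mean, more strongly for groups with little data. A minimal sketch (the `partial_pooling` helper, the shrinkage value, and the data are illustrative, not from any specific project):

```python
def partial_pooling(groups, shrinkage=5.0):
    """Shrink per-group means toward the grand mean; small groups
    are pulled in more strongly, reflecting our uncertainty about them."""
    all_values = [v for vs in groups.values() for v in vs]
    grand_mean = sum(all_values) / len(all_values)
    estimates = {}
    for g, vs in groups.items():
        n = len(vs)
        group_mean = sum(vs) / n
        weight = n / (n + shrinkage)  # more data -> trust the group mean more
        estimates[g] = weight * group_mean + (1 - weight) * grand_mean
    return estimates

data = {"area_a": [2.0, 2.5, 3.0, 2.5], "area_b": [10.0]}
est = partial_pooling(data)
```

The single observation in `area_b` is heavily discounted, while the better-populated `area_a` keeps an estimate close to its own mean.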

Robust Bayesian inference for machine learning

Statistical models are almost always wrong, especially as the problems we are modeling become more complex. But we want the conclusions we draw from our models to be robust to not knowing everything. In this project, I have built on research on robust divergences under which parameter estimation is less sensitive to model misspecification. My collaborator Jeremias Knoblauch and I demonstrate the usefulness of robustifying standard regression models and neural networks on a wide range of classification and count data problems.
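To illustrate the idea, here is a toy sketch using the density power divergence, one such robust divergence, to estimate the location of a Gaussian in the presence of an outlier (parameter choices, the grid minimizer, and the data are illustrative; this is not code from the paper):

```python
import math

def gaussian_pdf(x, mu, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def dpd_loss(mu, data, beta=0.5):
    """Density power divergence loss for a Gaussian location model.
    The f^(1+beta) integral term is constant in mu for fixed sigma, so dropped."""
    return -sum(gaussian_pdf(x, mu) ** beta for x in data) / beta

def minimize(loss, data, lo=-20.0, hi=20.0, steps=4000):
    """Crude grid search, good enough for a one-dimensional toy example."""
    grid = [lo + i * (hi - lo) / steps for i in range(steps + 1)]
    return min(grid, key=lambda mu: loss(mu, data))

data = [0.1, -0.2, 0.0, 0.3, -0.1, 15.0]  # one gross outlier
mle = sum(data) / len(data)               # maximum likelihood = sample mean
robust = minimize(dpd_loss, data)
```

The maximum-likelihood estimate is dragged toward the outlier, while the divergence-based estimate stays close to the bulk of the data; raising the density to the power `beta` downweights observations the model finds implausible.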

Statistical modeling of complex spatiotemporal data

Domestic abuse is a complex phenomenon which clusters in specific locations, for example in deprived areas, and at specific times, for example around the holidays. Modeling this kind of data is tricky in itself, but a further issue here is that the reporting of domestic abuse to the police fluctuates heavily and is poorly understood. Together with my collaborator Jan Povala, I work on using self-exciting point processes to examine whether and how the reporting of domestic abuse spreads through neighborhoods.
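A self-exciting point process can be summarized by its conditional intensity: each past event temporarily raises the rate of future events. A minimal one-dimensional Hawkes-process sketch with an exponential kernel (the parameter values and event times are illustrative, not fitted to any data):

```python
import math

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a Hawkes process with exponential kernel:
    lambda(t) = mu + alpha * sum over t_i < t of exp(-beta * (t - t_i))."""
    return mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in events if ti < t)

events = [1.0, 1.5, 1.7]                   # a tight cluster of reports
baseline = hawkes_intensity(0.5, events)   # before any events: just mu
excited = hawkes_intensity(1.8, events)    # shortly after the cluster
```

Before any events occur, the intensity sits at the baseline `mu`; just after a cluster, it spikes and then decays at rate `beta`, which is what lets the model distinguish contagion-like spread from a merely elevated background rate.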

Communicating statistical issues to lay audiences

Working in applied contexts, I’m often asked to explain specific statistical issues to practitioners. Sometimes that involves summarizing research findings and translating them into actionable insights. Sometimes I also write briefs on how to think about commonly used summary statistics.
