Every time I hang out with these folks I’m reminded how wonderful it is to find a true fit within a community. As things do, the discussion drifted towards nomenclature: the Department of Statistics has been renamed to the Department of Statistics & Data Science. Should the PhD program change its name as well?

The timing was apt: yesterday the Unofficial Google Data Science blog published a new post that tries to clearly define the Google Data Scientist - Research¹ role. In particular, they singled out statistical knowledge as a key discriminating feature.

“For statistics, however, the problem is arguably more severe — and increasingly so — with the expansion of the field of data science in recent years. Due to its rising popularity, more and more professionals self-identify as data scientists. As a result, the range of statistical skills that a data scientist may possess has become quite wide. Someone who has earned an advanced degree in statistics or has acquired specialized statistical skills by other means will likely describe themselves as a data scientist today; so too will someone who enjoys working with data but having an extremely limited statistics skill set. Given that the application of data science is itself evolving along with technology, we would not expect the skills required to be a successful data scientist to stay constant over time. But even so, we want to be thoughtful and precise about those requirements, not leave them subject to drift in meaning.”

So they provide a quiz: if you score highly, they suggest that you’re the type of person they’re looking for. In particular they write “Individuals who scored highest on the questions are, in our judgment, seen to be very strong performers in the DS-R role.” And I’d agree² but would go further.

Inspired by my friendly discussion and the blog post, I’ll make a bold claim: “statistician” needs to become a job title again.

There’s obvious inertia in trying to revive the statistics brand: the public eye is hopelessly anchored to visions of census forms and t-tests. And our public outreach isn’t helping: AP Statistics is a boring waste of time and even sports statistics becomes numbingly focused on ever more arcane numerology. Truly “data scientist” was the sexiest profession of the 2010s and statistics flirted hard but it doesn’t seem like a stable marriage.

Statistics needs a rebrand. As a statistician there are problems where I have a competitive advantage: statistical modelling, reasoning about data generating processes, decision making under uncertainty. And there are problems that I can do but that would be better handled by others: machine learning engineers for model training or “actual” data scientists for data mining and standard analysis. Somewhat blasphemously, I would argue that a lot of useful data science work can occur without care towards uncertainty quantification. This only makes me queasy because as a statistician I naturally think in terms of variability.

On the other side, data scientists as data scientists don’t need to invent methodology. Their time would be better spent becoming subject area experts. They need to reason and communicate effectively. One of my old bosses held the view that the primary role of a data scientist is social and rarely needs anything more sophisticated than a sample mean. Sometimes you do need to do methodological work, but it doesn’t make sense to have that be the same person who writes SQL for your dashboards.


  1. The fact that we need a hyphen is a bad start for a clearly defined role.

  2. Not just because I aced the test, though that doesn’t hurt.