The Role of Data Scientists & Data Strategists
Across Europe, companies are in the process of building up their data science teams. Establishing such teams is not straightforward, chiefly for two reasons:
(1) finding people with the right skill set and (2) creating effective teams that work together in a highly collaborative fashion. In-house data science units require various distinct roles, and these need to be clearly defined. To make data science projects succeed, a wide array of skills and work styles is required, going beyond even the most well-rounded of data science profiles.
As a data science consultancy, we also lived through the struggle of defining internal roles. After two years of experiments, we have come up with a two-discipline approach: (1) data scientists and (2) data strategists. Why this distinction, what is a data strategist, and how do both disciplines work together?
Data Scientists are all-rounders, but have individual strengths
Drew Conway’s famous Venn diagram describes the skill set of a data scientist as a mixture of hacking skills, math & statistics knowledge and substantive expertise. Conway’s diagram is certainly helpful when thinking about data scientists, but it does not cover all the skills required to make data science projects succeed. A productive data science team must cover more ground than Conway’s diagram.
What is needed? Three things, in essence: (a) a constant and rigorous view of the big picture and the underlying concepts, (b) excellent client-facing communication skills and (c) a passion for painstakingly untangling the idiosyncrasies of the application domain. We have found that these talents are rare even among excellent data scientists. What is more, taking care of the big picture and being the gateway to the client and her domain requires an entirely different way of working. Constant interruptions are the norm, which is incompatible with data science work, as it often requires methodological deep dives and focused work. Hence the role of the data strategist, who focuses on dealing with client issues and allows the data scientists to work productively.
In any case, to understand the role of comparative advantage and of communication skills, it helps to look at the typical setup of data science projects.
Understanding data science projects helps to clarify roles
The need for data strategists would not be as severe if projects arrived in an orderly, fully specified manner. The reality is far from this.
Very rarely do companies approach us (or the respective internal data science team) with a clearly scoped project that just needs to be executed. On the contrary, there is usually only a vague idea about data potential, use cases and business relevance. Consequently, the initial stages of any project require a lot of communication, brainstorming and scoping workshops to arrive at a project description that is suitable for a rapid algorithmic proof of concept. Goals and motivation often shift throughout the project, frequently steered by company-internal politics. Navigating these interests and keeping the project on track is the essential role of the data strategist.
In rapid proof-of-concept projects, we split the tasks into five distinct areas: (1) business understanding, (2) data understanding, (3) data preparation, (4) modelling and (5) evaluation.
Overview: Rapid proof-of-concept project
As the graph illustrates, the first few weeks are dedicated solely to improving the understanding of the business and the problem, as well as of the data sources that are available to tackle the issue. Once cleaned and integrated data is available, the work shifts to core data science work, such as modelling and algorithmic work. While communication plays a large role at the start, it is gradually “phased out”, only to return in the later stages of the project, when it comes to engaging with the client regarding results, recommendations and a potential roadmap.
Data Scientist vs Data Strategist
Keeping the timeline and activities of a typical data science project in mind helped us to develop a framework for the different roles within our team. We stick to the regular job description of the data scientist, but also hire for the position of “data strategist”. What’s the difference?
The data science skill set is well described by the magic intersection in Conway’s Venn diagram: math/statistics, hacking/programming, substantive expertise. In contrast, a data strategist has basic programming skills and a thorough conceptual understanding of statistics and math, but his or her advantage lies in conceptual thinking, communication and project strategy. In the initial phase of the project, a data strategist helps to scope ideas and clarify use cases. Thanks to this background in statistics and programming, the strategist can apply a quantitative lens and assess project ideas on the spot. Similarly, as large projects often require data sources to be acquired from various company departments, data strategists help to facilitate the process and are the permanent interface for the client.
Any additional communication layer, in this case between data strategists and data scientists, can in theory introduce friction. Here it is worthwhile, because the split of tasks lets both sides make the most of their respective strengths. Data scientists are able to focus on their deep analytical work, while data strategists are the interface towards the client (internal or external) and “translate” project requirements into manageable data science project components. Throughout the course of any project, data scientists and data strategists collaborate on a continuous basis, but the intensity of work for each profession varies depending on the stage of the project. In the beginning, data strategists do a significant share of the work, then hand over to data scientists at later stages.
This division of labor has not only contributed to employee happiness and satisfaction (as everyone focuses on what they are best at and feel excited about doing), but has also ensured client satisfaction, as both communication and technical execution exceed expectations.