What are the core skills of data scientists today?

With the role seeing increasing demand, what are the traits held by the majority?
24 January 2020

What traits are behind the in-demand skillset? Source: Shutterstock

As data becomes the fuel of intelligent business decisions, cutting-edge products and services, and machine learning applications, there is a growing demand for specialists who know how to handle it.

A report by recruitment site Indeed last year revealed a boom in demand for data scientists. 

The role has “only grown sexier”, the company said, as more employers than ever before are competing to acquire number-crunching talent.

The problem is that, owing to the diverse range of interpretations of the role across businesses and sectors, it’s proving a challenge to find the right people for the job. Ambiguities surrounding just what skills are required may be hampering organizations’ ability to source talent.

Data scientist staples?

A Europe-based survey by Big Cloud, drawing on more than 1,300 responses from across 10 cities, sought to uncover the skills that data science professionals put to work day to day.

The survey comprised responses from data scientists, as well as related roles such as data engineers and certain C-level positions, and the makeup of the respondents itself sheds some light on common background traits.

While more than half of those in the survey held a Master’s degree, for example, there is also a continued decline in those who opt to study. Roughly a quarter had more than 10 years’ experience, but nearly half had been in their current position for one year or less.

Meanwhile, the respondents came from industries as diverse as Technology/IT, Consulting, E-Commerce, FinTech, Academia/Education, Software, Healthcare, Insurance and Automotive.

So, what are the most common data science methods and tools in use? 

The most common data science methods were Logistic Regression, Neural Networks and Random Forests, used by 56 percent of respondents, while Gradient Boosted Machines and CNNs also featured in the top seven. Other options included Ensemble methods, Bayesian methods and SVMs.

Data science tools used most regularly & data science methods used most regularly. Source: Big Cloud

When it came to tools, though, the vast majority said Python was their primary coding language for modelling, a 10 percent increase year-on-year. Around 90 percent of data scientists said this was their most regularly used tool, more so than other commonly used tools such as Jupyter Notebook and SQL.

Data scientists typically have coding expertise at data modelling level, while those in more senior roles, such as Senior or Chief data scientists, tend to be stronger in coding at production level. Data engineers and machine learning specialists were much more likely to be experts at the latter.

Just shy of a third of respondents (29 percent) said coding consumed around 11-20 hours a week, while just 14 percent said they spent 31 or more hours coding per week, although this depended on seniority.

Data scientists are using these combinations of statistical modelling methods alongside tools like Python to analyze and identify predictive insights, which can ultimately help businesses make informed decisions from their data. 
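
To make that pairing of method and tool concrete, here is a minimal sketch of logistic regression, one of the methods the survey names, written in plain Python. Everything specific in it is an assumption made for illustration rather than anything drawn from the Big Cloud survey: the churn scenario, the two features (`tenure_years`, `support_tickets`), the synthetic data and the training settings are all hypothetical, and in practice a data scientist would more likely reach for an established library than hand-roll the training loop.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability, avoiding overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Estimated probability that the customer described by x churns."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias with batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the average gradient over the whole data set.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def make_synthetic_churn_data(n=200, seed=0):
    """Hypothetical data: short tenure and many support tickets push towards churn."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        tenure_years = rng.uniform(0, 5)
        support_tickets = rng.uniform(0, 5)
        score = 1.5 * support_tickets - 1.5 * tenure_years + rng.gauss(0, 0.5)
        rows.append([tenure_years, support_tickets])
        labels.append(1 if score > 0 else 0)
    return rows, labels


rows, labels = make_synthetic_churn_data()
weights, bias = fit_logistic_regression(rows, labels)

# Share of training examples classified correctly at a 0.5 threshold.
accuracy = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(y)
    for x, y in zip(rows, labels)
) / len(rows)
```

The fitted weights are where the "predictive insight" sits: a negative weight on tenure and a positive one on support tickets would suggest which customers a business might prioritise for retention efforts.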

Coding ability by job type. Source: Big Cloud

The wealth of experience and the required skills in coding and statistical modelling mean data scientists, and various related ranks, can command pretty substantial salaries of roughly US$65K or more.

But it’s worth remembering data scientists aren’t necessarily impactful as standalone hires, and while these skills are core to their work, additional talents are required in data translation, data storytelling, and even behavioral psychology to achieve wider goals. 

Organizations advancing along their AI journey will also identify their need for a visualization designer, who should have information design and UX skills to bring to life the visual intelligence layer of the data insights. 

A machine learning (ML) engineer also plays a key role in data science teams as they package the ML models into an end-to-end application. They use their deep programming skills with a mastery in handling data to automate the entire workflow.

In addition to these well-known roles within data science teams, there are some that fly under the radar, but can arguably be just as important. 

Companies should recognize the importance of data science translators, who act as the bridge between business users and data science engineers by identifying the most impactful projects and business challenges that can be solved by data. 

In fact, McKinsey estimates that demand for translators in the United States alone may reach two to four million by 2026.

Bringing in a behavioral psychologist can help data science teams translate patterns into actionable insights that power decision-making and deliver business value.

Behavioral psychologists understand why people behave the way they do and can help data scientists by giving insights into purchase decisions or customer churn.