The swelling demand for data scientists coupled with the evident skills gap has implications for the global economy as well as the tech industry. What’s causing it, and what can be done to address it?
Most people working in STEM don’t need to be told that data science is a fast-growing and hugely lucrative enterprise, largely due to the estimated 20,000-fold leap in data volumes between 2000 and 2020.
If data is ‘the new oil’, then the data scientist functions much like an oil refinery, converting data into insights that can both save money and generate capital.
The International Data Corporation (IDC) predicts that worldwide revenues for big data and business analytics will reach more than $210bn in 2020. Having finally penetrated the mainstream, data analytics is now a massive priority for executives in top companies.
Data scientist, in turn, is now being touted as the hottest career to get into, even being dubbed the “sexiest job of the 21st century” by Harvard Business Review. Career-wise, life is pretty good for the data scientist, with opportunities aplenty and the constant promise of massive compensation.
There’s just one problem: these conditions are created by the basic economic principle of supply and demand.
Demand is high but, crucially, supply is low. While this works out well for data science professionals, it could be ruinous for the economy if not addressed.
The data science skills gap
In 2017, Burning Glass Technologies, the Business-Higher Education Forum and IBM came together to produce a report on the demand for data science skills. It forecast that the number of openings for all data jobs will increase by 364,000 by 2020, bringing the total to 2,727,000.
Industries such as finance, insurance, professional services and IT are the ones most desperately seeking these skills, accounting for 59pc of the total job demand.
It is the issues associated with recruiting for data science positions that raise the most disquieting concerns for the economy at large. Data science positions, according to this report, take 45 days to fill – five days longer than the US average of 40 days.
While this may appear to be a relatively insignificant difference, the longer wait can translate into delays on large projects, which normally result in companies haemorrhaging funds as they are left sitting on their hands, unable to progress.
Even when the positions in data science do get filled, the costs don’t cease there. For various reasons, including the aforementioned hefty salaries, there is a high cost associated with hiring these professionals. Indeed, even the most qualified candidates may still require additional training to get to the point where they can fulfil a particular role for a company.
Is data science scarcity artificially engineered?
Another interesting insight from the IBM report is that 39pc of job listings for data science positions require a master’s or PhD.
It is estimated that as few as one in four data scientists have a PhD qualification.
This obviously contributes to the high price tag associated with these kinds of professionals, and tightens supply by narrowing the already small talent pool, thus causing problems for employers.
Vin Vashishta, founder and chief data scientist at V-Squared Data Strategy Consulting, argues that a large part of the reason data science roles go unfilled is that executives write job ads that are too qualification-intensive.
Writing for Fast Company, Vashishta argued: “Google doesn’t require a PhD to be a machine-learning engineer … yet I still see advanced degree requirements on the vast majority of data science and machine-learning job descriptions. Most companies just throw it in unthinkingly but, unless they’re investing heavily in advanced research, it’s pointless.”
So, it’s possible that part of the reason there is a perceived dearth of qualified data scientists is a widespread misunderstanding of the kinds of skills firms require to achieve their goals.
What can be done?
Vashishta further argued in a subsequent LinkedIn post that many of the educational stopgaps deployed to address the data science deficit, such as boot camps and education accelerators, produce “inconsistent” results.
Some may allay their fears by reminding themselves that the demand for data scientists will incentivise entry into the field, which will help to control the situation. The problem, Vashishta explains, is the complex process of training a data scientist.
“There are a lot of recent high-school graduates who have the talent and inclination to enter the field. It’ll be about six to eight years before they’re ready to apply for their first data science job.”
The metaphorical finish line for data science training is also constantly being moved further away from the would-be data scientist because of how quickly the field is evolving.
The tech industry has come to an impasse with regard to data science, and it’s difficult to see how it will pan out. In all likelihood, the gap will be filled as tech talent gaps have been filled before but, in an increasingly data-oriented world, the interim period before this gap is closed could wreak havoc.