Around the world, business leaders are turning towards data-driven decision-making in their organizations, and readily accept that the ability to manage data well is vital to help them pull ahead of competitors. This development was years in the making, however.
According to Juan M. Lavista Ferres, the chief data scientist and a vice president at technology giant Microsoft, the methods currently in use are hardly new, and our fascination with data really started decades ago.
Having earned a master’s degree in machine learning and data mining from Johns Hopkins University in 2005, long before data science was coined as a term, Lavista Ferres should know a thing or two about it.
The tipping point for data science
In an interview with Fortune Education, Lavista Ferres notes that data science is the culmination of efforts across multiple fields to work with data. Over the years, experts in various disciplines, from economists and statisticians to computer scientists, have each created their own data-centric methods.
Data science is essentially the amalgamation of these techniques and strategies.
And the methods of manipulating data today are essentially unchanged over the last 100 years, says Lavista Ferres. Even “newer” machine learning algorithms still date back two to three decades.
The simple reason many of these algorithms were not used in the past is a lack of data or inadequate processing power, constraints that pervasive digitalization and cloud computing have reduced, if not outright removed.
“[A] lot of these algorithms were not used because there was not enough data. Today, we have significantly more processing power and significantly more data that we didn’t have before,” Lavista Ferres said.
“[We are seeing an] exponential growth in data. There’s been a huge reduction in the cost of storing data and a huge amount of increase in processing power. All of these things are expected to continue to grow and provide amazing opportunities that didn’t exist before.”
Building an exceptional data science team
This data bonanza can help solve a huge number of problems, says Lavista Ferres, which explains the shortage of data scientists as companies scramble to hire them.
The barrier to being a data scientist isn’t as high as it is made out to be. He explains: “[In general], you don’t need to have studied data science to be a data scientist… the majority of the data scientists working at Microsoft and in the tech industry today did not particularly study data science, but they did study a discipline that had a lot [of] data science elements.”
While Lavista Ferres says the bulk of a data scientist’s skills are learned on the job, a strong foundation in key elements such as coding or data is vital. In addition, a very strong background in statistics will be extremely helpful for understanding what one can (or cannot) do with data.
But how can businesses build an exceptional data science team? It turns out that the best teams are those whose members come from diverse backgrounds, says Lavista Ferres: “Currently, on my team, I have economists, statisticians, computer scientists, physicists, people with electrical engineering backgrounds, and there are those coming from new disciplines that are mainly focused on data science and we work with them.”
Unsurprisingly, the ability of a data scientist to communicate well is deemed an essential skill by Lavista Ferres. This entails explaining what the data is saying and putting it in a way that other people can understand.
Data science is for all
If there is one takeaway, it would surely be Lavista Ferres’s belief that data science is for everyone. According to him, data scientists are currently filling in the gaps for a lot of professions – from medical doctors to physicists – due to a dearth of data-centric skills.
But as data science capabilities eventually proliferate across organizations and industries, expect data science skill sets to become commonplace.
And much like how coding is now taught in non-IT courses or even as enrichment classes for children, it is not too much of a stretch to imagine that data science would eventually be found everywhere.
Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose. You can reach him at [email protected].