Data scientists are in high demand, according to the 2020 Emerging Jobs Reports published by LinkedIn. And as organizations get overwhelmed with data, they are looking for data scientists to correlate data points and generate meaningful, actionable insights in today’s complex and highly fluid business environments.
But what are some characteristics of a good data scientist? This question is an evolving one that can be frustratingly difficult to answer given the fluid and large spectrum of roles that a data scientist is expected to perform.
Aside from prerequisites such as knowledge of statistics and applied math, we look at a handful of skills that a good data scientist should have.
Coding skills
The core tenet of data science revolves around building a data model through software scripts. This makes programming a fundamental skill needed to get started with many of the tasks required of the data scientist, which range from collating data and performing statistical analysis to visual analytics and machine learning.
While existing scripts can be easily reused or tweaked, a poor grasp of programming can result in badly modified code that can fail to run or introduce erroneous results – as any experienced programmer will know.
Indeed, even when a piece of code produces the right result, poorly written code can scale badly or consume far more resources, and take much longer to complete, than is necessary. On this front, R and Python are two of the most widely used languages in data science for various data processing tasks.
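To make the point about correct but wasteful code concrete, here is a minimal Python sketch. The function names and sample data are hypothetical and chosen only for illustration. Both functions return the same result, but the first scans a list for every lookup while the second builds a set once, so it is likely to scale far better as the inputs grow.

```python
def shared_ids_slow(a, b):
    # Each "x in b" scans the whole list b, so the total work
    # grows with len(a) * len(b).
    return [x for x in a if x in b]


def shared_ids_fast(a, b):
    # Build a set once; each membership check is then roughly
    # constant time, so the total work grows with len(a) + len(b).
    b_set = set(b)
    return [x for x in a if x in b_set]


# Hypothetical sample data: IDs that appear in two sources.
customer_ids = list(range(0, 2000, 2))
order_ids = list(range(0, 2000, 3))

# Same answer either way; only the cost differs.
assert shared_ids_slow(customer_ids, order_ids) == shared_ids_fast(customer_ids, order_ids)
```

On small inputs the difference is negligible, which is exactly why this kind of inefficiency tends to go unnoticed until the data grows.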
Familiarity with SQL
Love it or hate it, Structured Query Language (SQL) will likely remain an important part of the skills repertoire for data scientists for the foreseeable future. Data scientists need to retrieve data for analysis, and much of the data within the enterprise remains in databases.
SQL is the language used to retrieve data from them. In fact, some basic data processing can be performed efficiently within the database itself using SQL, for example through table joins or embedded scripts.
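As a rough illustration of processing data inside the database with a table join, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, columns, and values are made up for the example and do not come from any real system. The join and the aggregation both run in SQL, so only the summarized rows are returned to Python.

```python
import sqlite3


def revenue_by_region(conn):
    # Join orders to customers and total the amounts per region,
    # all inside the database engine.
    query = """
        SELECT c.region, SUM(o.amount) AS total
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'APAC'), (2, 'EMEA'), (3, 'APAC');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 2, 50.0), (3, 3, 25.0), (4, 2, 10.0);
""")

rows = revenue_by_region(conn)
for region, total in rows:
    print(region, total)
```

The same query pattern should carry over to other relational databases, though connection details and some SQL syntax differ between systems.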
An understanding of the basics of Database Management Systems (DBMS) such as MySQL, Oracle, DB2, Microsoft SQL Server and PostgreSQL will probably offer an added advantage for a job in data science.
Effective business communication
For all the brilliance of a data scientist in algorithms and modeling, all of it would be for naught if the data scientist isn’t able to communicate the results. Aside from data visualization tools, this rests upon the data scientist’s ability to convey the significance of their findings.
Indeed, effective business communication is one of the most important skills for data scientists, and one that is often overlooked. Specifically, a data scientist needs to be persuasive, whether probing stakeholders about their challenges or communicating analytical insights and solutions in a clear and concise manner.
Ability to solve business problems
While programming expertise and business communication skills are vital, they don’t mean much without an awareness of the business problems faced by the organization and how to solve them. This starts with the ability to translate business requirements into a data-driven problem that can be solved as part of a data science project.
It is worth noting that while some business challenges will be raised by company executives and departmental heads, the onus is on data scientists to identify real and present issues that may have gone unidentified or been overlooked even by stakeholders.
Finally, when it comes to dealing with people, data scientists will do well to remain cognizant of interpersonal sensitivities and inter-departmental rivalries. Mastering these aspects of the corporate environment can go a long way towards the successful completion of a data science initiative, instead of it being mired in endless bureaucratic delays.
Photo credit: iStockphoto/PhotoGraphyKM