Is your organization still testing new data analyst or data engineer hires with tests about their SQL competency? Perhaps organizations should consider replacing them with alternative ways of testing their competency with relational data.
The suggestion was made by Randy Au, a UX researcher and former data analyst, who shared his rationale in a blog post on Substack, drawing from his years of experience giving and taking a large number of SQL-related interviews.
According to Au, the most used part of SQL is quickly picked up, while its various dialects and nuances are so different that testing can’t go too far. Moreover, the pervasiveness of SQL has spawned a plethora of “SQL-like” tools that require an understanding of relational concepts – but which might not necessarily entail managing a relational database.
Picking up basic SQL is easy
“It seems that so long as people have some experience working with tables of data, whether in Excel, Python, or R, AND they have someone looking over their shoulder and giving advice for a couple of hours, they can learn SQL on the job,” Au explained.
“A lot of the difficulty of learning SQL on your own has more to do with needing to have access to a functional database with real data inside — take away that hurdle and most people can do it.”
To illustrate how specific SQL implementations can trip even experienced analysts up, Au shared an anecdote of a job hunt where he was given a problem to solve that entailed using window functions found in Postgres.
Unfortunately, Au was not familiar with window functions – he was experienced in MySQL which did not offer window functions at that point. As a result, he crafted a “massive query” which horrified the interviewee and presumably cost him a job offer.
Test for data manipulation skills instead
Where database knowledge is sought, organizations should instead test for the ability to work with relational data with understanding about designing tables and normalizing data, he suggested.
To test for data skills, organizations can consider having candidates manipulate and design data sets in whatever way is most comfortable, including via a spreadsheet. Data can be easily loaded via CSV files if necessary, and demonstrate competence by their ability to join, filter, and aggregate data to deliver the desired outcomes.
“[Unless] you absolutely require experience with a particular dialect of SQL, you inevitably need to constrain the sorts of questions you can ask or risk rejecting people who just happen to be masters of a completely different dialect,” Au summed up.
“It’s very difficult to ask a query optimization question that isn’t completely trivial, while also avoiding deep implementation details of particular databases… Trying too hard to equate ‘knows the SQL dialect we use’ with ‘can do data analysis that touches SQL’ is going to wind up trapping you with a smaller field of candidates.”
Image credit: iStockphoto/fizkes