Organizations are increasing their footprint in the cloud as they modernize their IT infrastructure. Some are going further, and are migrating their on-premises databases to cloud-native services such as Snowflake and Databricks.
But is the cloud a good place for data and data analyses?
The case for the cloud
Despite initial concerns about putting vital IT assets and data in the cloud, the cloud today is generally regarded as being as secure as, or even more secure than, an on-premises system. The cloud also offers various advantages for managing data, particularly in areas such as scalability, ease of use, and simplified data processing.
In addition, migrating data to the cloud can give organizations a fresh start by breaking down data silos to establish a single source of truth, allowing them to respond quickly to opportunities and market changes. It should be no surprise that demand is growing, and the industry has evolved with new capabilities and cloud-native offerings.
For instance, Teradata just two weeks ago launched its VantageCloud Lake cloud solution for driving analytics at scale. Teradata says its solution is highly flexible, allowing companies to run enterprise-level analytics that scale to more than a thousand nodes, as well as small exploratory or departmental workloads.
The cloud’s advantages notwithstanding, there are two potential barriers to bear in mind.
Harder than it sounds
While one might imagine a migration to the cloud to be a matter of configuring cloud-based systems and setting up a few automated tools to run, this is an unlikely scenario for most organizations. For mature enterprises with a substantial amount of data, a migration to the cloud will likely be a complex, involved affair.
Indeed, a recent article on InformationWeek compared a cloud database migration to a heart transplant. The reason? SQL implementations differ, so features found in traditional databases, such as triggers and stored procedures, will probably have to be re-implemented in a different language.
“When you look at the similar migration in the cloud, many of the databases don’t support the robust SQL packages you may have… it’s really like a heart transplant because you have to look into each piece,” said Ramke Ramakrishnan, a senior director analyst at Gartner.
And even if you could do a lift and shift, it might not make sense to do so. As Ramakrishnan explains, the traditional approach of putting everything into a single database doesn’t make sense in the cloud. Instead, an enterprise might benefit more from using multiple data stores for better performance or scalability in the cloud.
Moreover, enterprises can leverage various tiers of object storage to reduce data storage costs, or implement another form of database such as a NoSQL or graph store, with a negligible increase in cost thanks to the cloud’s pay-as-you-use model.
In a nutshell, it makes sense to view a data migration to the cloud more as a transformation project than a routine feature update.
Mind your data quality
The second barrier is data quality. Simply migrating data to the cloud does not automatically improve its quality; legacy dirty data moved to the cloud will still be dirty and is just as liable to cause business problems.
And though data quality is a challenge for everyone, the problem increases when dealing with data in the cloud, says Angsuman Dutta, the CTO of analytics firm FirstEigen, in a contributed opinion piece on Entrepreneur.
For one, data may become corrupted during the migration process or through an incorrect configuration. Dutta pointed to how a Fortune 500 company unwittingly restricted its cloud data warehouse to a maximum of eight decimal places. Should be more than enough, right? It turns out that this caused truncation errors during migration and culminated in a USD50 million reporting issue.
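To see how a seemingly generous eight-decimal-place limit can still distort totals, consider the rough Python sketch below. It is not drawn from Dutta's account: the rates, the multiplier, and the function names are all hypothetical, and it simply assumes the target column drops any digits beyond its scale.

```python
from decimal import Decimal, ROUND_DOWN

# Assumed target scale: the warehouse column keeps 8 decimal places.
EIGHT_PLACES = Decimal("0.00000001")


def truncate_to_scale(value, quantum=EIGHT_PLACES):
    """Mimic a column that silently drops digits beyond its scale."""
    return value.quantize(quantum, rounding=ROUND_DOWN)


def loses_precision(value, quantum=EIGHT_PLACES):
    """Return True if storing the value at the target scale changes it."""
    return truncate_to_scale(value, quantum) != value


def migration_drift(source_values, multiplier):
    """Difference between totals computed before and after truncation."""
    exact = sum(v * multiplier for v in source_values)
    truncated = sum(truncate_to_scale(v) * multiplier for v in source_values)
    return exact - truncated


# Hypothetical per-unit rates held with 12 decimal places in the source system.
rates = [
    Decimal("0.123456789012"),
    Decimal("0.000000009999"),
    Decimal("1.999999999999"),
]

# Hypothetical volume that each rate is multiplied by in a report.
volume = Decimal("1000000000")

at_risk = [r for r in rates if loses_precision(r)]
drift = migration_drift(rates, volume)

print(f"{len(at_risk)} of {len(rates)} values would be truncated")
print(f"Report total shifts by {drift}")
```

Running a check like `loses_precision` against source data before the move is one way to find out whether a scale limit matters; the tiny per-row differences only become visible once they are multiplied by large volumes.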
Finally, data quality can be an issue in situations where data from different sources must be merged. According to Dutta, two departments of a pharmaceutical company relied on different units to store inventory information (think ‘number’ versus ‘packs’). When incorporated into a cloud data warehouse, the data became a nightmare to report and analyze due to this inconsistency.
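The unit mismatch can be illustrated with a short Python sketch that converts every record to a common unit before merging. Everything here is assumed for illustration: the record layout, the SKU, the unit labels, and the pack size are hypothetical and do not describe the company in Dutta's example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryRecord:
    sku: str
    quantity: int
    unit: str  # hypothetical labels: "each" (individual items) or "pack"


# Hypothetical reference data: how many individual items one pack holds.
UNITS_PER_PACK = {"ASPIRIN-100": 12}


def to_each(record, units_per_pack):
    """Convert a record's quantity to individual items."""
    if record.unit == "each":
        return record.quantity
    if record.unit == "pack":
        if record.sku not in units_per_pack:
            raise KeyError(f"no pack size known for {record.sku}")
        return record.quantity * units_per_pack[record.sku]
    raise ValueError(f"unknown unit {record.unit!r} for {record.sku}")


def merge_inventory(records, units_per_pack):
    """Sum inventory per SKU after normalizing every record to 'each'."""
    totals = {}
    for record in records:
        totals[record.sku] = totals.get(record.sku, 0) + to_each(record, units_per_pack)
    return totals


# One department counts individual items, the other counts packs.
records = [
    InventoryRecord("ASPIRIN-100", 240, "each"),
    InventoryRecord("ASPIRIN-100", 20, "pack"),
]

naive_total = sum(r.quantity for r in records)      # mixes units
merged = merge_inventory(records, UNITS_PER_PACK)   # normalized first

print("naive total:", naive_total)
print("normalized total:", merged["ASPIRIN-100"])
```

Raising an error on an unknown unit or a missing pack size is a deliberate choice in this sketch: it surfaces the inconsistency at load time instead of letting mixed units reach a report.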
When all is said and done, migrating your data to the cloud is no walk in the park. To be clear, there are many good reasons to move your data warehouse or data analyses to the cloud. Just don’t expect it to be simple, or for the cloud to magically resolve data quality issues.
Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.
Image credit: iStockphoto/Panuwat Sikham