Data Scientists Can Stop Juggling Data Stores Soon

There is a huge blind spot in data science: shared data asset management.

In a single, homogeneous, connected and localized environment, the problem does not exist. But no company, startups included, is an island these days; most work with data spread across multiple stores.

Data is also becoming a valuable business asset that is increasingly difficult to move and govern. With data sitting across an organization’s various stores, shared data asset management (also loosely referred to as metadata management, master data management or reference data management) becomes important.

A lack of integrated shared data asset management capabilities means data scientists need to spend a lot of time on data governance.

New regulations and laws are also making it harder for them to move data for analysis, while tool vendors are offering some features under difficult licensing models.

“Customer data is a good example. Customers have customer data all over the place in multiple systems which prevents them from having one vision of the customer. We had some capabilities in data management, but we really wanted to strengthen this aspect of data management by either developing something or making an acquisition with the leader,” said Thomas Been, chief marketing officer, TIBCO Software Inc.

Buying Capabilities

Been’s company chose the latter, recently acquiring Orchestra Networks and continuing the acquisition route it has taken since buying Alpine Data and Statistica in 2017.

“The vision was really to onboard the capabilities to manage data as an asset as part of the platform. So, businesses can really understand who their customers are and understand where are their assets,” said Been.

An IDC report (paywall) was bullish about the Orchestra Networks buy, noting that the acquisition will speed up TIBCO’s abilities in “metadata management and multi-domain data mastering.”

The acquisition also provides an important “bridge” between TIBCO’s two portfolio groups covering integration and analytics, while strengthening TIBCO MDM’s features with Orchestra Networks’ EBX.

Sharing for Success

Been noted that the need for shared data asset management will only grow, which is why TIBCO is betting big on addressing it.

“The need for shared data asset management is massive. There is nothing more infuriating than having the digital experience [for data scientists] broken down because a name was misspelled. This is one example. Another is compliance [like GDPR] where managing shared data assets is a business requirement,” said Been.
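To make the misspelled-name example concrete, here is a minimal sketch of how a record from one system might be matched against a master list of customers. It is only an illustration and not a description of any TIBCO or Orchestra Networks product: the customer names, the `resolve_customer` helper and the 0.8 similarity cutoff are all assumptions, and the matching relies on Python's standard `difflib` module.

```python
import difflib

# Hypothetical master list of customer names (illustrative data only).
MASTER_CUSTOMERS = ["Jonathan Smith", "Maria Garcia", "Wei Chen"]


def resolve_customer(raw_name, cutoff=0.8):
    """Return the master record closest to raw_name, or None if nothing is close enough."""
    # Compare case-insensitively, but return the canonical spelling.
    normalized = {name.lower(): name for name in MASTER_CUSTOMERS}
    matches = difflib.get_close_matches(
        raw_name.strip().lower(), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[matches[0]] if matches else None


if __name__ == "__main__":
    # A name misspelled in one system still resolves to the master record.
    print(resolve_customer("Jonathon Smith"))
    # A name with no close match is flagged rather than guessed.
    print(resolve_customer("Zed Unknown"))
```

In practice the cutoff would need tuning against real data, since a value that is too low merges different customers and one that is too high leaves duplicates unresolved.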

He also noted that insurance firms are now asking for better data management and governance as part of their underwriting requirements.

More importantly, shared data asset management will help data scientists to prepare for AI and IoT. For example, Been noted that it can help to decide which data to use to train machine learning algorithms.

“If you give [the algorithm] the wrong data, you are going to have the wrong consequences. Shared data asset management becomes critical for applications like AI and IoT, which needs a good awareness of not only contextual data but also the master data,” he said.

However, TIBCO has its work cut out for it. It now needs to rationalize its various product acquisitions into a single data science platform that makes business sense.

As IDC noted in its report, “TIBCO needs to focus on integration towards a unified platform and provide an exponential model of value for customers.”

DV: The Bigger Play

Been noted that the larger focus will be on data virtualization (DV), which uses APIs to connect data in different sources rather than relying on older, expensive ETL (extract, transform and load) processes.
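The difference from ETL can be sketched in a few lines. In the hypothetical example below, two in-memory SQLite databases stand in for separate source systems (a CRM and a billing system), and a `customer_view` function queries both at request time instead of copying their data into a central store first. The table names, columns and sample rows are assumptions made for illustration; a real data virtualization layer would sit behind APIs and handle far more, such as security, caching and query optimization.

```python
import sqlite3


def make_source(table, schema, rows):
    """Create an in-memory database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({schema})")
    placeholders = ",".join("?" * len(rows[0]))
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return conn


# Two independent "systems", each holding part of the customer picture.
crm = make_source(
    "customers",
    "customer_id INTEGER, name TEXT",
    [(1, "Ada Lovelace"), (2, "Alan Turing")],
)
billing = make_source(
    "invoices",
    "customer_id INTEGER, amount REAL",
    [(1, 120.0), (1, 30.0), (2, 75.5)],
)


def customer_view(customer_id):
    """Virtual view: read from each source on demand and combine the results."""
    name_row = crm.execute(
        "SELECT name FROM customers WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if name_row is None:
        return None
    total = billing.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM invoices WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()[0]
    return {"customer_id": customer_id, "name": name_row[0], "total_billed": total}


if __name__ == "__main__":
    print(customer_view(1))
    # A new invoice in the billing system is visible immediately,
    # with no batch load step in between.
    billing.execute("INSERT INTO invoices VALUES (2, 24.5)")
    print(customer_view(2))
```

Because nothing is copied, the combined view always reflects the current state of each source; the trade-off is that every request depends on the sources being available and responsive.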

“We are extremely excited by the synergies of shared data management with governance because it adds to the notion of data virtualization. We are also excited about the synergies of shared data asset management with API management,” he said.  

He sees DV becoming a platform for future businesses while helping to shape the outcomes of many digital transformation initiatives.

“If you have the right product for shared data asset management and the right channel through API management, you have the foundations for a pretty good business model. There is also the scenario of real-time integration that goes with shared data asset management. These synergies will, of course, change the landscape in terms of the data management capabilities for digital transformation,” he said.  

Thomas Been, chief marketing officer, TIBCO Software Inc., sees shared data asset management as a huge challenge for data scientists, an area in which his company is investing heavily.