By Bill Dague, Head of Alternative Data at Nasdaq
Across industries, organizations are sharing data at an ever-increasing rate to collaborate and innovate with their customers, partners, suppliers, and internal teams. The financial industry is no different in its embrace of data as a key part of its future—in many ways, finance is leading the way.
While the industry has bought into the importance of data, the logistics of data sharing and proper data management present significant challenges that are unique to finance.
First, financial (and alternative) data consumers need to establish reliable and scalable ingestion pipelines. Here, they depend largely on vendors to provide them with timely, accurate data in a format that’s easy to use (a proposition which is often fraught with disappointment). Once they have the data, there remains a significant technical burden in processing and running analysis on massive datasets (e.g., tick-level datasets). Then, they are left managing and maintaining the data: keeping it up to date and consistently applying updates to preserve multi-temporality. Finally, there’s the ever-present responsibility of ensuring compliance with complex usage rules and vendor policies governing how data is accessed and distributed.
Our clients have been vocal about the impact of these challenges. In many cases, data is getting lost in complex, siloed infrastructure. Without central sharing standards, data discovery, access, and governance become impossible. Expensive data gets locked up, under-utilized, duplicated, and sometimes purchased multiple times. Unless data is properly managed and permissioned, it’s difficult for teams to collaborate and it’s impossible to audit and report on access. Overall, you just get less return on your data investment.
And the industry clearly knows this. A BBH survey of 50 senior executives in global asset management (overseeing more than $18 trillion AUM in aggregate) found that more than half (57%) of the respondents are seeking to improve their data and technology abilities. Respondents cited manual processes, data optimization, and data in general as challenges that they’re looking to overcome as they move forward.
Current solutions aimed at improving data sharing are not open-source or interoperable. Sharing data, especially big data, is difficult and high-friction, even within a single organization. And solution providers in the data management space aren’t necessarily incentivized to be change-makers in this regard, either. Keeping users in “walled gardens” is better for business.
But, as history suggests, data tooling in the financial industry trends toward open protocols and standards (à la Spark, pandas, etc.). Data governance, sharing, and management are no exception. As Nasdaq continually seeks out ways to better serve our clients, we’re delighted to announce our participation and support, together with the Delta Lake open-source community, in launching the new, open-source Delta Sharing protocol, the industry’s first open protocol for secure data sharing.
In particular, I see three main benefits to an open approach to data sharing:
Share data securely across platforms
Regardless of the computing platform, Delta Sharing allows for secure data sharing between parties. The protocol employs a vendor-neutral governance model. It is an open standard usable by any platform or data vendor, it works across clouds, and it integrates with virtually any modern data processing stack (i.e., anything that can read Parquet files).
In addition, there’s no slow or expensive data conversion needed with direct access to cloud-stored Parquet files.
Better manage entitlements and maintain compliance standards
Delta Sharing is an open, efficient, and scalable protocol that allows users to easily share and manage entitlements for external and internal data sharing. It can help make data governance easier—you can manage entitlements, security, masking, and privacy on shared datasets irrespective of the computing platform used to access them.
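To make the entitlement model concrete, a Delta Sharing recipient is typically issued a small profile file containing the sharing server’s endpoint and a bearer token scoped to the shares that recipient is entitled to; revoking the token revokes access without touching the underlying data. The endpoint and token below are placeholders, not real credentials:

```python
import json

# Hypothetical recipient profile -- endpoint and token are placeholders.
# A data provider issues one of these per recipient, so entitlements can
# be managed and audited centrally, regardless of the recipient's stack.
profile_text = """
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<recipient-scoped-token>"
}
"""

profile = json.loads(profile_text)
print(sorted(profile.keys()))
```

A client library pointed at this profile can then list and query only the tables the token permits, which is what makes cross-organization access both granular and auditable.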
Unlocking more value from datasets
With easier and more secure sharing thanks to interoperability, built-in authentication, and granular entitlement management, users can share data and compute on it seamlessly. Organizations can reduce the duplicative, tedious work of moving and entitling data, reducing their time to value and allowing them to focus more on their core business.
While data on its own is valuable to organizations, too much value is being left on the table by the industry’s reliance on restrictive tools and legacy sharing paradigms. We are looking forward to working with Databricks and the open-source community on this initiative. With an open protocol, we can give the industry what it needs—and deserves—in order to move forward: an open approach to data sharing.