Building a Marketplace for Data Mesh: Facilitating Data Product Consumption Through Metadata – Part 1

Actian Corporation

May 28, 2024

Over the past decade, data catalogs have emerged as important pillars of data-driven initiatives. However, many vendors on the market fall short of expectations, delivering lengthy timelines, complex and costly projects, bureaucratic data governance models, poor user adoption, and low value creation. This gap is not limited to metadata management projects; it reflects a broader failure at the data management level.

Given these shortcomings, a new concept is gaining popularity: the internal marketplace, or what we call the Enterprise Data Marketplace (EDM) at Zeenea.

In this series of articles, we share excerpts from our Practical Guide to Data Mesh, explaining the value of internal data marketplaces for producing and consuming data products, how an EDM supports data mesh exploitation on a larger scale, and how it goes hand in hand with a data catalog solution:

  1. Facilitating data product consumption through metadata
  2. Setting up an enterprise-level marketplace
  3. Feeding the marketplace via domain-specific data catalogs

Before diving into the internal marketplace, let’s quickly go back to the notion of a data product, which we believe is the cornerstone of the data mesh and the first step in transforming data management.

Sharing and Exploiting Data Products Through Metadata

As mentioned in our previous series on data mesh, a data product is a governed, reusable, scalable dataset that comes with guarantees of data quality and of compliance with applicable regulations and internal rules. Note that this definition is quite restrictive: it excludes other types of products such as machine learning algorithms, models, or dashboards.

While these artifacts should be managed as products, they are not data products. They belong to a broader family that could loosely be termed “Analytics Products”, of which data products are one subset.

In practice, an operational data product consists of two things:

  • Data – Materialized on a centralized or decentralized data platform, guaranteeing data addressing, interoperability, and access security.
  • Metadata – Providing all the necessary information for sharing and using the data.

Metadata ensures consumers have all the information they need to use the product.

It typically covers the following aspects:

  • Schema – Providing the technical structure of the data product, data classification, samples, and their origin (lineage).
  • Governance – Identifying the product owner(s), its successive versions, its possible deprecation, etc.
  • Semantics – Providing a clear definition of the exposed information, ideally linked to the organization’s business glossary and comprehensive documentation of the data product.
  • Contract – Defining quality guarantees, consumption modalities (protocols and security), potential usage restrictions, redistribution rules, etc.
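To make the four aspects above more concrete, here is a minimal sketch of how a product team might describe a data product's metadata in code and serialize it for publication. Every class name, field name, and sample value below is a hypothetical illustration, not a Zeenea or Actian format; a real descriptor would follow whatever model the chosen platform defines.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Schema:
    columns: dict[str, str]                  # column name -> technical type
    classification: str                      # e.g. "internal", "confidential"
    lineage: list[str] = field(default_factory=list)  # upstream sources


@dataclass
class Governance:
    owners: list[str]                        # product owner(s)
    version: str
    deprecated: bool = False


@dataclass
class Semantics:
    definition: str                          # plain-language description
    glossary_terms: list[str] = field(default_factory=list)


@dataclass
class Contract:
    quality_guarantees: dict[str, str]       # guarantee name -> commitment
    protocols: list[str]                     # consumption modalities
    usage_restrictions: list[str] = field(default_factory=list)


@dataclass
class DataProductMetadata:
    name: str
    domain: str
    schema: Schema
    governance: Governance
    semantics: Semantics
    contract: Contract

    def to_json(self) -> str:
        # asdict() converts nested dataclasses recursively
        return json.dumps(asdict(self), indent=2)


# Hypothetical example values, for illustration only
orders = DataProductMetadata(
    name="customer_orders",
    domain="sales",
    schema=Schema(
        columns={"order_id": "string", "amount": "decimal"},
        classification="internal",
        lineage=["erp.orders"],
    ),
    governance=Governance(owners=["sales-data-team"], version="1.2.0"),
    semantics=Semantics(
        definition="Confirmed customer orders, one row per order.",
        glossary_terms=["Order", "Customer"],
    ),
    contract=Contract(
        quality_guarantees={"freshness": "refreshed daily"},
        protocols=["sql"],
        usage_restrictions=["no redistribution outside the company"],
    ),
)

print(orders.to_json())
```

Keeping the descriptor as a plain, serializable structure means it can be versioned and deployed alongside the data and pipelines it describes.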

In the data mesh logic, these metadata are managed by the product team and are deployed according to the same lifecycle as data and pipelines. There remains a fundamental question: where can metadata be deployed?

Using a Data Marketplace to Deploy Metadata

Most organizations already have a metadata management system, usually in the form of a Data Catalog.

But data catalogs, in their current form, have major drawbacks:

  • They don’t always support the notion of a data product – it must be more or less emulated with other concepts.
  • They are complex to use – designed to catalog a large number of assets with sometimes very fine granularity, they often suffer from a lack of adoption beyond centralized data management teams.
  • They mostly impose a single, rigid organization of data, decided and designed centrally – which fails to reflect the variety of the different domains or the organization’s evolution as the data mesh expands.
  • Their search capabilities are often limited, particularly for exploratory aspects – it’s often necessary to know what you’re looking for to be able to find it.
  • The experience they offer sometimes lacks the simplicity users aspire to – search with a few keywords, identify the appropriate data product, and then trigger the operational process of an access request or data delivery.

The internal marketplace, or Enterprise Data Marketplace (EDM), is therefore a new concept gaining popularity in data mesh circles. Like a general-purpose marketplace, the EDM aims to provide a shopping experience for data consumers. It is thus an essential component for exploiting the data mesh on a larger scale: it gives data consumers a simple and effective system to search for and access data products from various domains.
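The consumer journey described here (search with a few keywords, pick a data product, trigger an access request) can be sketched in a few lines. This is only an in-memory illustration under simple assumptions: the `Listing` and `AccessRequest` types, the `search` and `request_access` functions, and the sample catalog are all hypothetical, and a real marketplace would rely on a proper search index and an approval workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    name: str
    domain: str
    description: str


@dataclass(frozen=True)
class AccessRequest:
    product: str
    requester: str
    status: str = "pending"


def search(listings: list[Listing], query: str) -> list[Listing]:
    """Return listings whose name or description contains every keyword."""
    keywords = query.lower().split()
    if not keywords:
        return []
    results = []
    for listing in listings:
        text = f"{listing.name} {listing.description}".lower()
        # Naive substring match; stands in for a real search engine
        if all(keyword in text for keyword in keywords):
            results.append(listing)
    return results


def request_access(listing: Listing, requester: str) -> AccessRequest:
    """Open an access request; approval is out of scope for this sketch."""
    return AccessRequest(product=listing.name, requester=requester)


# Hypothetical listings published by three different domains
CATALOG = [
    Listing("customer_orders", "sales", "Confirmed customer orders, refreshed daily"),
    Listing("invoice_lines", "finance", "Invoice lines reconciled with the general ledger"),
    Listing("shipment_events", "logistics", "Tracking events for customer shipments"),
]

matches = search(CATALOG, "customer orders")
ticket = request_access(matches[0], requester="analyst@example.com")
print(ticket)
```

The point of the sketch is the shape of the experience: one search entry point spanning all domains, followed by a single action that hands over to the operational access process.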

In our next article, learn the different ways to set up an internal data marketplace, and how it is essential for data mesh exploitation.

About Actian Corporation

Actian makes data easy. Our data platform simplifies how people connect, manage, and analyze data across cloud, hybrid, and on-premises environments. With decades of experience in data management and analytics, Actian delivers high-performance solutions that empower businesses to make data-driven decisions. Actian is recognized by leading analysts and has received industry awards for performance and innovation. Our teams share proven use cases at conferences (e.g., Strata Data) and contribute to open-source projects. On the Actian blog, we cover topics ranging from real-time data ingestion, data analytics, data governance, data management, data quality, data intelligence to AI-driven analytics.