What is a Data Mesh?


A data mesh is a decentralized IT architecture that delegates ownership of data assets in a business to the departments and teams that are the domain experts for their data. The technology gives domain experts the tools to publish their own data and the connectivity to access data products that others publish. A data mesh uses a federated data model in which specialist business domains act as data publishers for the rest of the business.
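To make the idea of data as a product concrete, here is a minimal sketch in Python of the kind of descriptor a domain team might publish for one of its datasets. The class and field names (owner_domain, endpoint, tags) are illustrative assumptions, not part of any formal data mesh standard.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A minimal, illustrative descriptor for a domain-owned data product."""
    name: str          # e.g. "sales.orders"
    owner_domain: str  # the team accountable for the data
    description: str   # what the data covers and how it is refreshed
    endpoint: str      # where consumers read it (API, table, or topic)
    schema: dict = field(default_factory=dict)  # column name -> type
    tags: list = field(default_factory=list)    # metadata used for discovery

# A domain team (here, sales) would publish something like this:
orders = DataProduct(
    name="sales.orders",
    owner_domain="sales",
    description="Confirmed customer orders, refreshed hourly.",
    endpoint="warehouse://sales/orders_v1",
    schema={"order_id": "string", "amount": "decimal", "placed_at": "timestamp"},
    tags=["sales", "orders", "hourly"],
)
```

The point is that the domain team, not central IT, owns the description, the schema, and the refresh promise attached to the product.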

Why Use This Architecture?

The problem with traditional centralized IT-managed data warehouses or data lakes is that they rely on a central team who are not experts in all domains. The benefit of a data mesh is that it delegates the responsibility for publishing data to domain experts. Sales and finance functions understand their respective datasets best. They need the tools from IT to enable them to curate and publish their data as a service so the whole organization can benefit from high-quality, accurate data from an authoritative source.

Traditional data warehouses and data marts can create siloes of data that are used in isolation by the business department or line of business they serve. The problem with this approach is that it encourages the proliferation of unconnected data pools that the rest of the business cannot leverage. A data mesh discourages duplication of data, focusing resources on fewer, higher-quality data sources because the experts in that data own its maintenance.

A data mesh operates a universal interoperability bus into which the various business domains plug. Each departmental data warehouse publishes its data as a product over this common bus.
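As a hedged illustration of that publishing step, the sketch below uses a hypothetical MeshRegistry object to stand in for the shared interoperability layer; the publish method and descriptor fields are assumptions, not a specific product's API.

```python
# A toy stand-in for the shared interoperability layer every domain plugs into.
class MeshRegistry:
    def __init__(self):
        self._products = {}

    def publish(self, descriptor: dict) -> None:
        # Each domain registers (or updates) its own data product descriptor.
        self._products[descriptor["name"]] = descriptor

    def list_products(self) -> list:
        return list(self._products.values())


registry = MeshRegistry()

# The finance domain publishes its warehouse table as a data product.
registry.publish({
    "name": "finance.invoices",
    "owner_domain": "finance",
    "endpoint": "warehouse://finance/invoices_v2",
    "refresh": "daily",
})
```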

The main difference between a data fabric and a data mesh is that a data fabric does not distribute data ownership, so it still relies on a central team that can become backlogged.

Discoverability is an essential benefit of a data mesh: because data products are richly described with metadata, data consumers can quickly locate the data they need.
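As an illustration of metadata-driven discovery (again a sketch, not a specific catalog's API), finding relevant data products can be as simple as filtering registered descriptors by tag:

```python
# Hypothetical catalog entries; in practice these would come from the mesh's metadata store.
catalog = [
    {"name": "sales.orders", "owner_domain": "sales", "tags": ["orders", "revenue", "hourly"]},
    {"name": "finance.invoices", "owner_domain": "finance", "tags": ["billing", "revenue", "daily"]},
    {"name": "hr.headcount", "owner_domain": "hr", "tags": ["people", "monthly"]},
]

def find_products(catalog, tag):
    """Return every data product whose metadata carries the requested tag."""
    return [p for p in catalog if tag in p["tags"]]

# A data consumer looking for revenue-related products finds both sales and finance data.
for product in find_products(catalog, "revenue"):
    print(product["name"], "owned by", product["owner_domain"])
```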

The Building Blocks of a Data Mesh

The critical components include:

  • Data sources that could be traditional data warehouses.
  • Domain-specific data-as-a-service data products.
  • Data infrastructure, such as data stores and scripts, to build and instantiate a data product service.
  • Data governance standards and rules.
  • Security controls and policies.
  • Event streaming platforms, such as Kafka or Confluent Cloud, to capture and distribute real-time changes to data (see the sketch after this list).
  • Data quality and metadata conventions.
  • Code – including data pipelines, governance controls, policies and application interfaces.
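For the event streaming building block listed above, the following sketch shows a domain publishing a change event to a Kafka topic with the confluent-kafka Python client. The broker address, topic name, and event fields are illustrative assumptions; a real deployment would use the mesh's own brokers and naming conventions.

```python
import json
from confluent_kafka import Producer  # assumes the confluent-kafka package is installed

# Connection details are placeholders; real deployments point at the mesh's brokers.
producer = Producer({"bootstrap.servers": "localhost:9092"})

# Illustrative change event: the sales domain announces an updated order record.
event = {
    "product": "sales.orders",
    "op": "update",
    "order_id": "A-1042",
    "amount": "219.90",
}

# The topic name is an assumption; each domain would own its own topics.
producer.produce("sales.orders.changes", key=event["order_id"], value=json.dumps(event))
producer.flush()  # block until the event is delivered
```

Consumers in other domains can subscribe to such topics to receive changes in near real time rather than waiting for batch loads.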

Benefits of Data Mesh Architecture

The benefits of a data mesh include the following:

  • Domain experts share more meaningful data as a data product service.
  • The business gets more value from existing data sources by sharing them.
  • Decentralization of data management efforts cuts centralized labor costs.
  • Security can enforce policies such as data encryption at rest and in motion.
  • Data is easier to find thanks to metadata.
  • Better self-service-oriented data products.
  • Less data duplication.
  • Fewer data siloes.
  • Data projects can be set up faster as there is less data to move and transform.
  • Shared tools, standards, and processes increase data literacy across the organization.
  • Fewer central IT backlogs for data warehouse projects thanks to the democratization of data.
  • Modular data product services are easier to consume by applications.
  • Improved standardization of data quality and data governance practices.
  • Businesses get more value from their data assets, which improves data-driven decision-making.

Actian Supports Data Mesh Deployments

The Actian Data Platform can support multiple data stores that a data mesh can share. Platform instances can be hosted on-premises or on multiple cloud platforms. The Actian Data Platform has hundreds of prebuilt connectors to data sources, including NetSuite, Salesforce, and ServiceNow. It is optimized for high-speed query responses thanks to its vectorized columnar database that outperforms alternatives. The Actian Data Platform is ideal for staging data before it is published as a data product within a domain.