Overview

Open Data Mesh (ODM) is a platform that manages the entire lifecycle of data products, from development through deployment to decommissioning. It was designed and implemented by Quantyca to support clients in adopting a paradigm based on managing data as a product. The platform was released as open source in early 2023.

ODM relies on the Data Product Descriptor Specification (DPDS), which works together with the services of the Open Data Mesh Platform to create, deploy, and manage instances of data products in a modular and composable architecture.

Data Product Descriptor

The Data Product Descriptor is a document that collects all the information describing a data product, including its full name, owner, domain, version number, interface components, and internal components. It is used to share the metadata of a data product with all consumers within the platform.

Info Object

The Info Object contains general information about the data product. These details can be utilized by consumers within the platform when needed. Among them are the product ID, full name, version, description, domain it belongs to, owner, and contact information.
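
As an illustration, the Info Object can be pictured as a small immutable record. The sketch below is only a model of the fields listed above: the class names, attribute names, and sample values are assumptions made for this example and are not taken from the DPDS itself.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Owner:
    # Hypothetical shape for the owner; the DPDS may structure it differently.
    name: str
    contact_email: str


@dataclass(frozen=True)
class Info:
    # One attribute per detail mentioned in the text (names are illustrative).
    product_id: str
    full_name: str
    version: str
    description: str
    domain: str
    owner: Owner


# Sample values invented for the example.
info = Info(
    product_id="dp-001",
    full_name="sales.orders",
    version="1.0.0",
    description="Cleaned order events for the sales domain",
    domain="sales",
    owner=Owner(name="Sales Team", contact_email="sales-data@example.com"),
)

# asdict() converts nested dataclasses into plain dictionaries,
# which is convenient for serializing the record as JSON or YAML.
print(asdict(info))
```

Keeping the record frozen reflects the idea that a given version of a data product has fixed general information; changing it would mean publishing a new version.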

Interface Components

Interfaces are components exposed to consumers and are grouped by functional role into entities called ports. Each port exposes a service or a set of related services. These are the five types of ports supported by the DPDS:

Input ports
A set of services of a data product that enable the collection of data from the source and make it available for further internal transformations.
Output ports
A set of services exposed by a data product to securely and reliably share generated data.
Discoverability ports
A set of services providing information about the static role of a data product within the architecture, such as purpose, structure, position, etc.
Observability ports
A set of services that provide information about the dynamic behavior of a data product within the architecture, such as logs, audit logs, metrics, etc.
Control ports
A set of services that allow configuring local computational policies and performing privileged governance operations.
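
The grouping of interfaces into the five port types can be sketched as follows. This is a minimal model for illustration only: `PortType`, `Port`, `group_by_type`, and the sample port names are hypothetical and do not reproduce the DPDS schema.

```python
from dataclasses import dataclass
from enum import Enum


class PortType(Enum):
    # The five functional roles named in the text.
    INPUT = "input"
    OUTPUT = "output"
    DISCOVERABILITY = "discoverability"
    OBSERVABILITY = "observability"
    CONTROL = "control"


@dataclass(frozen=True)
class Port:
    name: str
    type: PortType


def group_by_type(ports):
    """Return a dict mapping every port type to the names of its ports."""
    grouped = {port_type: [] for port_type in PortType}
    for port in ports:
        grouped[port.type].append(port.name)
    return grouped


# Sample ports invented for the example.
ports = [
    Port("orders-cdc", PortType.INPUT),
    Port("orders-sql-view", PortType.OUTPUT),
    Port("orders-events", PortType.OUTPUT),
    Port("metrics", PortType.OBSERVABILITY),
]

grouped = group_by_type(ports)
for port_type, names in grouped.items():
    print(f"{port_type.value}: {names}")
```

Every port type appears in the result even when a product exposes no port of that kind, so a consumer can always tell that a role is unused rather than missing.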

Internal Components

Application components
These are all the elements of a data product that implement the services exposed by the ports, such as pipelines, microservices, etc. In the DPDS, an application component is primarily described through parameterized models that formally define how to build and deploy the component itself.
Infrastructural components
These are all the elements of a data product related to infrastructural resources, such as computing and storage resources, that are needed to execute the application components. In the DPDS, an infrastructural component is primarily described through a parameterized model that formally defines how to provision the specific infrastructural component.
Lifecycle information
This part of the descriptor contains the information regarding the lifecycle of the data product, focusing exclusively on CI/CD activities, such as deployment in the development environment, deployment in production, and decommissioning. All operations necessary to complete these activities are delegated to external DevOps tools.

The Platform

The Open Data Mesh Platform (ODMP) is the open-source implementation of a Data Developer Platform (DDP) that facilitates end-to-end management of the data product lifecycle. The platform’s high versatility allows users to use it as-is or customize it to meet specific requirements. Through a modular architecture, ODMP leverages open specifications and protocols to enhance interoperability with different systems: adhering to established standards fosters an ecosystem of tools that integrate and adapt to continuously evolving needs.

ODMP simplifies DataOps activities, enabling teams to create, validate, deploy, and evolve their products in a self-service manner. The platform is technology-agnostic and lets users integrate their preferred tools through plug-and-play adapters.

The architecture consists of two planes that reflect those proposed by the Data Mesh theory:

Product Plane: This is the ODM implementation of the Data Product Experience Plane, which aids in the creation and consumption of data products as well as in managing their lifecycle.

Utility Plane: This is the ODM implementation of the Data Infrastructure (Utility) Plane, aiming to separate the management of data products from the underlying physical infrastructure.

As a result, the primary modules of the ODM platform are technology-agnostic: data products are created and managed through the Product Plane independently of the underlying physical infrastructure, which is kept separate behind the Utility Plane and its adapters.

Product Plane

The Product Plane exposes the core microservices of the ODM Platform. Each microservice provides a set of APIs to handle the stages of the data product lifecycle:

Blueprint
Manages the initialization of a data product through deep integration with Git services.
Registry
Registers a new data product with a unique identifier and version within the mesh, making it visible to governance processes.
Policy
Manages services to apply and enforce computational policies on each data product.
DevOps
Manages the entire data product lifecycle (e.g., development, testing, deployment, decommissioning).
Notification
Manages listeners and sends them notifications when specific events occur.
Params
A custom microservice that helps manage parameters and variables common across the entire platform.
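
To make the roles of the Registry and Notification services more concrete, the sketch below models them in memory: a product is registered under a unique identifier and version, and subscribed listeners are notified when that happens. It is not the ODM Platform API; `InMemoryRegistry`, `DuplicateVersionError`, and the event name `DATA_PRODUCT_REGISTERED` are all assumptions made for this example.

```python
from typing import Callable


class DuplicateVersionError(Exception):
    """Raised when the same (id, version) pair is registered twice."""


class InMemoryRegistry:
    def __init__(self) -> None:
        self._products = {}   # (product_id, version) -> descriptor
        self._listeners = []  # callables notified on registration

    def subscribe(self, listener: Callable[[str, dict], None]) -> None:
        self._listeners.append(listener)

    def register(self, product_id: str, version: str, descriptor: dict) -> None:
        key = (product_id, version)
        if key in self._products:
            raise DuplicateVersionError(f"{product_id} {version} is already registered")
        self._products[key] = descriptor
        # Hypothetical event name; the real platform defines its own events.
        for listener in self._listeners:
            listener("DATA_PRODUCT_REGISTERED", {"id": product_id, "version": version})

    def versions(self, product_id: str) -> list:
        # Versions are returned in registration order.
        return [v for (pid, v) in self._products if pid == product_id]


received = []
registry = InMemoryRegistry()
registry.subscribe(lambda event, payload: received.append((event, payload)))

registry.register("dp-001", "1.0.0", {"name": "orders"})
registry.register("dp-001", "1.1.0", {"name": "orders"})

try:
    registry.register("dp-001", "1.0.0", {"name": "orders"})
except DuplicateVersionError as exc:
    print(f"rejected: {exc}")

print(registry.versions("dp-001"))
print(received)
```

In the actual platform these responsibilities live in separate microservices exposed through APIs; collapsing them into one class here only keeps the example short.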

Utility Plane

The Utility Plane exposes a set of services that decouple the mesh from the underlying infrastructure, such as:

Executors
Serve as intermediaries between the mesh platform and specific DevOps tools.
Observers
Gather and react to events occurring within the platform.
Validators
Evaluate and execute computational policies.

Each of these services is defined as an interface that requires a concrete implementation. The interface provides a starting point and a basic structure for implementing specific services. Any implementation of a Utility Plane service is called an Adapter.

Adapter

An Adapter is an implementation of a Utility Plane service for a specific technology: it is, in all respects, the component that decouples the mesh platform from the real infrastructure. ODM users can plug in the adapters they need to work with their underlying infrastructure.
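
The relationship between a Utility Plane interface and its adapters can be sketched with abstract base classes. The code below is a minimal illustration of the decoupling idea, not the platform's real interfaces: `Validator`, `Executor`, their method signatures, the two toy adapters, and the `deploy` function are all hypothetical.

```python
from abc import ABC, abstractmethod


class Validator(ABC):
    """Interface for a service that evaluates a policy against a descriptor."""

    @abstractmethod
    def validate(self, descriptor: dict) -> bool:
        ...


class Executor(ABC):
    """Interface for a service that hands an activity to a DevOps tool."""

    @abstractmethod
    def run(self, activity: str, descriptor: dict) -> str:
        ...


class OwnerRequiredValidator(Validator):
    # Toy adapter: the only policy is that a product must declare an owner.
    def validate(self, descriptor: dict) -> bool:
        return bool(descriptor.get("owner"))


class DryRunExecutor(Executor):
    # Toy adapter: reports what would be done instead of calling a real tool.
    def run(self, activity: str, descriptor: dict) -> str:
        return f"dry-run: {activity} {descriptor['name']}"


def deploy(descriptor: dict, validator: Validator, executor: Executor) -> str:
    """Platform-side logic that depends only on the interfaces."""
    if not validator.validate(descriptor):
        return "rejected by policy"
    return executor.run("deploy", descriptor)


result_ok = deploy(
    {"name": "orders", "owner": "sales-team"},
    OwnerRequiredValidator(),
    DryRunExecutor(),
)
result_rejected = deploy({"name": "orders"}, OwnerRequiredValidator(), DryRunExecutor())

print(result_ok)
print(result_rejected)
```

Because `deploy` only knows the abstract interfaces, swapping the dry-run executor for one backed by a real DevOps tool would not require changing the platform-side logic, which is the point of the adapter approach described above.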

The ODM Platform provides implementations of the following Adapters:

Azure DevOps Executor
An executor capable of integrating with Azure DevOps APIs to create, test, and deploy data products.
Blindata Observer
An observer that forwards notifications to Blindata in response to events.
OPA Validator
A specific implementation of a validator that uses Open Policy Agent as the engine to validate computational policies.
