Data Mesh 101: Why Federated Data Governance Is the Secret Sauce of Data Innovation

11 Oct

Tareq Abedrabbo

What makes data mesh such a powerful concept is the principle of federated data governance.

The big shift that data mesh enables is decentralisation: data is organised along domain-driven lines, with each domain owning its data and treating it as a product that is consumed by the rest of the organisation.

The process of decentralising, democratising and productising data is a quantum leap in enterprise data architecture that opens the door to massive experimentation and innovation.

But you can’t just decentralise everything and wait for innovation to occur; the result would be chaos.

The secret sauce is a federated approach that strikes a balance between decentralised data sources (which enable innovation at scale) and centralised data governance (which provides the basis for consistency and collaboration across the organisation).

What Is Data Federation?

Federated data governance in a data mesh describes a situation in which data governance standards are defined centrally, but local domain teams have the autonomy and resources to execute these standards however is most appropriate for their particular environment.

In this federated governance model, autonomous data domain teams and centralised data governance functions collaborate in order to best meet the data needs of the whole organisation. This collaboration is arranged by the centralised data governance leader, ensuring a cohesive approach across the board.

In this way, teams can “shift left” the implementation of data governance policies and requirements, embedding them into their data products early in the development lifecycle and effectively starting their data governance programme at that point. This early application of policy is an essential aspect of federated computational governance, which seeks to improve efficiency and compliance throughout the data lifecycle.
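
As a rough illustration of what “shifting left” can mean in practice, the sketch below expresses two governance rules as code that a domain team could run in its own build pipeline before publishing a data product. The `DataProduct` descriptor, its fields and both rules are assumptions made for this example; they are not taken from any particular data mesh platform.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes with its data product."""
    name: str
    owner: str                                   # owning domain team
    columns: dict[str, str]                      # column name -> declared type
    pii_columns: set[str] = field(default_factory=set)
    masked_columns: set[str] = field(default_factory=set)


def check_policies(product: DataProduct) -> list[str]:
    """Return a message for each violated rule; an empty list means compliant."""
    violations = []
    # Assumed central rule 1: every product has a named owner.
    if not product.owner.strip():
        violations.append("data product must name an owning team")
    # Assumed central rule 2: columns flagged as PII must be masked.
    unmasked = product.pii_columns - product.masked_columns
    if unmasked:
        violations.append(f"PII columns must be masked: {sorted(unmasked)}")
    return violations


# Example: the domain runs the check before release.
orders = DataProduct(
    name="orders",
    owner="sales",
    columns={"order_id": "string", "email": "string"},
    pii_columns={"email"},
)
for problem in check_policies(orders):
    print(problem)
```

The rules are defined once, centrally, while each domain decides where in its own workflow to run the check.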

What might this look like in a data mesh?

The data is decentralised, with each domain taking end-to-end ownership of its own data. This means that each team can scale its own processes without impacting other teams and domains.

Consumers, however, are likely to require data from multiple domains, so data from different domains needs to be highly interoperable: consumers should be able to combine datasets from across the business with ease.

To be part of the mesh, each domain must therefore follow a set of centrally managed guidelines and standards that determine how its data will be categorised, managed, discovered and accessed. This covers things like data contracts and schemas.
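
To make the idea of a data contract more concrete, here is a minimal sketch of a contract expressed as data, plus a function that checks records against it. The contract format, the `ORDER_CONTRACT` fields and the `validate_record` helper are all hypothetical; real contracts often also cover things like ownership, semantics and service levels.

```python
from datetime import date

# Hypothetical contract for an "orders" data product: field -> rule.
ORDER_CONTRACT = {
    "order_id": {"type": str, "required": True},
    "customer_id": {"type": str, "required": True},
    "order_date": {"type": date, "required": True},
    "discount_code": {"type": str, "required": False},
}


def validate_record(record: dict, contract: dict) -> list[str]:
    """Return a list of contract violations for one record (empty if valid)."""
    errors = []
    for field_name, rule in contract.items():
        value = record.get(field_name)
        if value is None:
            # Absent (or null) values are only a problem for required fields.
            if rule["required"]:
                errors.append(f"missing required field '{field_name}'")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(
                f"field '{field_name}' should be {rule['type'].__name__}"
            )
    # Fields the contract does not know about are flagged as well.
    for field_name in sorted(record.keys() - contract.keys()):
        errors.append(f"unexpected field '{field_name}' not in contract")
    return errors


sample = {"order_id": "o-1", "customer_id": "c-9", "order_date": date(2024, 1, 5)}
print(validate_record(sample, ORDER_CONTRACT))
```

Because the contract is plain data, a central team could publish it while each domain validates against it in whatever tooling it prefers.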

This also includes a shared data infrastructure layer that domains can draw on to build their own pipelines from pre-approved templates. These templates ensure security and compliance, and they save each domain from duplicating effort by building its own infrastructure from scratch.
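
One way to picture a pre-approved template is as a configuration builder in which the security and compliance settings are fixed centrally and domains only fill in the parts that are genuinely theirs. The setting names below (`encryption_at_rest`, `audit_logging`, `schedule` and so on) are invented for illustration.

```python
# Hypothetical settings fixed by the central platform team for every pipeline.
LOCKED_SETTINGS = {
    "encryption_at_rest": True,
    "audit_logging": True,
}

# Hypothetical settings each domain is free to choose for itself.
DOMAIN_SETTINGS = {"source", "destination", "schedule"}


def build_pipeline_config(domain: str, **choices: str) -> dict:
    """Combine a domain's choices with the centrally locked settings."""
    not_allowed = set(choices) - DOMAIN_SETTINGS
    if not_allowed:
        raise ValueError(f"{domain} cannot set: {sorted(not_allowed)}")
    missing = DOMAIN_SETTINGS - set(choices)
    if missing:
        raise ValueError(f"{domain} must set: {sorted(missing)}")
    # Locked settings are applied last so they can never be overridden.
    return {"domain": domain, **choices, **LOCKED_SETTINGS}


config = build_pipeline_config(
    "sales",
    source="crm.orders",
    destination="mesh.sales.orders",
    schedule="hourly",
)
print(config)
```

A domain that tries to pass `encryption_at_rest=False` gets a `ValueError` instead of a non-compliant pipeline.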

This is where the centralised governance comes in, establishing data management practices and processes that ensure that the data provided by each domain is of the highest quality, from a consumer perspective.

Why Data Federation Is a Superpower

Federated data governance enables autonomous databases to function within a unified framework while maintaining their independence. This ensures coordinated data management, enhancing data consistency and accessibility across diverse systems. Unlike fully decentralised models, which distribute control without any central authority, federated governance keeps a shared set of standards in place, and that is what supports seamless collaboration and efficient operations.

There are a few key reasons why data federation is so impactful.

Maintain independence, autonomy and accountability

The main benefit is that domains can operate with a high degree of autonomy.

They know their own domain far better than anyone else and are best placed to decide exactly how they should manage their data and how they can best scale.

This level of independence also ensures a high degree of accountability because a single team follows a given data product from production to consumption.

The result is high-quality data products that can be produced in a scalable and resilient fashion by teams that know their own domain intimately and are responsible for end-to-end delivery.

Enable interdependence and collaboration across domains

However, the data products that domains produce still need to be usable by the consumer.

There must be a minimum degree of interdependence between domains, which is why having centrally-governed standards is so critical.

Issues that affect all domains need to be subject to a wider authority—perhaps even a team of domain product owners—to ensure that domains are consistent in how they handle and process data.

In a data mesh, data is viewed as a product, so we can draw inspiration from how product development is done in large organisations: ideally, there are certain centrally-governed development guardrails that are baked into architecture and how people work, within which developers are free to innovate as they wish.

Data mesh can be set up similarly, with a team of experts responsible for curating and providing the interoperability ‘guardrails’ within which domains can operate however they see fit.

Govern and consume data wherever it is

When domains function in ways that are both independent and interoperable, it is possible to govern data effectively wherever it sits in the organisation.

Domains take care of the local processes and concerns, with a central team ensuring minimum standards for consistency and accessibility.

Data that is effectively governed in this way is a delight for consumers. They can get on with their work knowing that high-quality, highly-discoverable data is on tap and can be plugged into their projects when needed.

Running around different teams to find out whether a particular dataset exists, or whether it can be transformed to meet your needs, becomes a thing of the past.
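
Discoverability of this kind usually rests on some form of shared catalogue. The toy in-memory catalogue below is only a sketch of the idea: `CatalogEntry`, `DataCatalog` and the example entries are hypothetical, and a real catalogue would be a shared service rather than a Python object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical metadata a domain registers for one data product."""
    name: str
    domain: str
    description: str
    tags: frozenset[str] = field(default_factory=frozenset)


class DataCatalog:
    """Minimal shared register that consumers can search."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        key = f"{entry.domain}.{entry.name}"
        if key in self._entries:
            raise ValueError(f"{key} is already registered")
        self._entries[key] = entry

    def search(self, term: str) -> list[str]:
        """Return keys whose name, description or tags match the term."""
        term = term.lower()
        return sorted(
            key
            for key, entry in self._entries.items()
            if term in entry.name.lower()
            or term in entry.description.lower()
            or term in {tag.lower() for tag in entry.tags}
        )


catalog = DataCatalog()
catalog.register(CatalogEntry(
    "orders", "sales", "Confirmed customer orders",
    frozenset({"customer", "revenue"}),
))
catalog.register(CatalogEntry(
    "invoices", "finance", "Invoices issued per order",
    frozenset({"revenue"}),
))
print(catalog.search("revenue"))
```

Each domain registers its own products, and a consumer answers the "does this dataset exist?" question with a single search.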

Enable massive scalability

When you have a mesh of independent but interoperable nodes that can be effectively governed and are easy to consume, you have a foundational pattern that can then be scaled massively across the organisation. Not only this, but each node can scale at its own pace, depending on its level of maturity.

The federated data mesh, once set up properly, is highly scalable, which is a massive advantage of this approach.

Data Governance Federation Challenges

A federated data mesh model requires a high degree of data maturity in an organisation, because it represents a very different, more free-flowing way for domains to interact with each other and with the data itself than more top-down, centralised approaches allow.

But the main challenges around federation of data are not technical. The real challenge lies in federating a data mesh culture and mindset: the ways of working and thinking that must underpin this shift in how we handle data.

Federating trust

Your organisation will have to be comfortable with federating not only its technology but also its trust.

A mindset shift is required to ensure that each domain has the skills, infrastructure and controls in place to allow it to act autonomously, within the guardrails of inter-domain interoperability.

There are too many domains, however, to manage them all individually (and this would also defeat the purpose of decentralisation!). These domains, then, need to be trusted to get on with the job however they see fit, which some organisations that are used to more centralised control may find unsettling initially.

Encouraging good data citizenship

When each domain is trusted with its particular piece of the data puzzle, that domain also takes on a huge amount of responsibility.

Organisations must make clear that the new ways of working are in place to make life easier for everybody and for the common good of the organisation.

For data mesh to succeed, people, whether they are data producers or consumers, need to actively contribute to their corner of the data mesh.

Striking the right balance

Imagine that every domain had complete autonomy to manage their own data as they wish, with absolutely no consideration for cross-domain consistency or co-ordination. There would be carnage.

Similarly, if domains were completely reliant on a centralised data function to manage data and make it available, that function would become a major bottleneck and innovation would grind to a halt.

The challenge is to find the right balance for your particular organisation between allowing domains to evolve and scale their own data at their own pace while ensuring the data products that result are consistent with other domains.

Critically, this balance will change over time as the organisation matures and so must be constantly adjusted.

Final Thoughts

Data governance federation is the secret sauce that makes data mesh possible: it allows highly autonomous local work, but within interoperability guardrails that allow for a high degree of collaboration between all the local teams.

This combination of local excellence and inter-domain collaboration creates a massive web of high-quality data products that all corners of the business can draw on to enhance existing services or foster innovation.

Federated Data Governance FAQs

What is meant by a federated database?

A federated database is a type of database management system that allows multiple autonomous databases to operate as a single, unified system. Unlike centralised databases, federated databases maintain the autonomy of each individual database while enabling data integration and sharing across the federation.
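
As a toy illustration of the idea, SQLite's `ATTACH DATABASE` statement lets one connection query several separate databases together. This is not a full federated database system, and the `sales` and `finance` databases, tables and rows below are made up for the example.

```python
import sqlite3

# One connection plays the role of the unified access point.
conn = sqlite3.connect(":memory:")

# Each attached database stands in for an autonomous, separately owned store.
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS finance")

conn.execute("CREATE TABLE sales.orders (order_id TEXT, customer TEXT)")
conn.execute("CREATE TABLE finance.invoices (order_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)",
    [("o1", "acme"), ("o2", "globex")],
)
conn.executemany(
    "INSERT INTO finance.invoices VALUES (?, ?)",
    [("o1", 120.0), ("o2", 80.5)],
)

# A single query joins data that lives in two different databases.
rows = conn.execute(
    """
    SELECT o.customer, i.amount
    FROM sales.orders AS o
    JOIN finance.invoices AS i ON i.order_id = o.order_id
    ORDER BY o.customer
    """
).fetchall()
print(rows)
```

Each database keeps its own tables and could be managed independently, while the consumer sees a single place to run queries.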

What is the difference between a centralised and federated data model?

A centralised data model consolidates all data into a single, central repository, where it is managed and accessed from one location. This model ensures uniformity but can create a single point of failure and potential bottlenecks.

A federated data model, on the other hand, allows multiple autonomous databases to operate together as a unified system. Each database maintains its independence while enabling data sharing and integration across the federation.

What is the difference between data virtualisation and data federation?

Data virtualisation creates a real-time, unified view of data from multiple sources without moving the data. It abstracts the technical details of data storage, allowing users to query and retrieve data seamlessly. In contrast, data federation combines data from different sources into a single system while maintaining their autonomy.
