"You need to be able to attach the code to the data"...I interpret that as the code being metadata attached to the data. Would this include governance type rules and the contract between producer and consumer?
Yes. The idea is to ensure that your data product can stand alone. To stand alone, it needs to be self-documenting. It should provide metadata, governance/usage rules, and details of the data contract via the control port. How you implement that is up to you, but it's probably easiest to imagine these things being served by a RESTful service. The trick is to ensure there's a common standard for how that description service behaves and what it provides, so that other data consumers and data products can interact with these data products across the organization.
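To make the description service a bit more concrete, here is a minimal sketch in Python using only the standard library. Everything specific in it is an assumption for illustration: the `/describe` paths, the `DESCRIPTOR` field names, and the example "orders" product are hypothetical and not part of any data mesh standard. The point is only that one small, uniform endpoint can serve metadata, governance rules, and contract details.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical self-description of one data product. All field names
# and values are illustrative assumptions, not an established standard.
DESCRIPTOR = {
    "name": "orders",
    "metadata": {
        "owner": "sales-domain",
        "schema": {"order_id": "string", "amount": "decimal"},
    },
    "governance": {
        "classification": "internal",
        "allowed_uses": ["analytics"],
    },
    "contract": {"version": "1.0.0", "freshness_minutes": 60},
}

# Sections a consumer may request individually.
SECTIONS = ("metadata", "governance", "contract")


def control_port_response(path):
    """Map a control-port path to a (status_code, json_body) pair."""
    if path == "/describe":
        return 200, json.dumps(DESCRIPTOR)
    prefix = "/describe/"
    if path.startswith(prefix) and path[len(prefix):] in SECTIONS:
        return 200, json.dumps(DESCRIPTOR[path[len(prefix):]])
    return 404, json.dumps({"error": "unknown control-port path"})


class ControlPortHandler(BaseHTTPRequestHandler):
    """Serves the descriptor over HTTP GET."""

    def do_GET(self):
        status, body = control_port_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the control port locally (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), ControlPortHandler).serve_forever()
```

Calling `serve()` and then requesting `/describe/contract` would return just the contract section as JSON. If every data product in the organization answered the same paths with the same shape, consumers could discover and check products without product-specific code, which is the "common standard" part of the idea.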
"Data Performance" is I think the elephant in the room of data mesh. Putting DWH on the failure-pattern side is confusing implementation/development with technology. I think this exposes soon issues when physics constraints become stronger than software needs. For example: how to deal with strong consistency or massive cross domain/product aggregation on frequently changing data ? Dealing with that with an application-centric approach will not work. The solution is to co-locate loosely coupled data products on RDBMS/MPP systems. No one prevent those data product to evolve autonomously still avoiding the communication overhead of physically isolated deployments. Am I missing something?
By far one of the best analyses on the topic.
Excellent, honest, practical and no BS ♥️
Bingo: "Data as a code" is going to come sooner rather than later. Great insight by Joe! Thanks, Joe, for sharing this wonderful session on Data Mesh.
Can I get the slides, please?
When you discuss virtualization technologies, what do you mean by "data prod need lower level control"? Thanks a lot.
"You need to be able to attach the code to the data"...I interpret that as the code being metadata attached to the data. Would this include governance type rules and the contract between producer and consumer?
Yes. The idea is to ensure that your data product can stand alone. To stand alone, it needs to be self documenting. It should provide metadata, governance/usage rules, and details of the data contract via the control port. How you implement that is up to you, but it's probably easiest to imagine these things being delivered up by a RESTful service. The trick is to ensure there's a common standard for how that description service behaves and what it provides so that other data consumers and data products can interact with these data products across the organization.
What software are you referencing as #1 and #2 on the opportunities for innovation slide…?
"Data Performance" is I think the elephant in the room of data mesh.
Putting DWH on the failure-pattern side is confusing implementation/development with technology.
I think this exposes soon issues when physics constraints become stronger than software needs.
For example: how to deal with strong consistency or massive cross domain/product aggregation on frequently changing data ?
Dealing with that with an application-centric approach will not work.
The solution is to co-locate loosely coupled data products on RDBMS/MPP systems. No one prevent those data product to evolve autonomously still avoiding the communication overhead of physically isolated deployments.
Am I missing something?