Introduction to Data Mesh with Zhamak Dehghani

  • Published 6 Feb 2025
  • Zhamak Dehghani works with ThoughtWorks as the director of emerging technologies in North America, with a focus on distributed systems and data architecture, and a deep passion for decentralized technology solutions.
    Zhamak serves on multiple tech advisory boards including ThoughtWorks. She has worked as a technologist for over 20 years and has contributed to multiple patents in distributed computing communications, as well as embedded device technologies.
    She founded the concept of Data Mesh in 2018 and has since been implementing it and evangelizing it to the wider industry.
    She is a co-author of the O'Reilly book Software Architecture: The Hard Parts and the author of the O'Reilly book Data Mesh.
    About Data Mesh:
    For over half a century, organizations have assumed that data is an asset to collect more of, and that data must be centralized to be useful. These assumptions have led to centralized and monolithic architectures, such as data warehouses and data lakes, that limit organizations' ability to innovate with data at scale.
    Data Mesh is an alternative architecture and organizational structure for managing analytical data.
    Its objective is to enable access to high-quality data for analytical and machine learning use cases, at scale.
    It's an approach that shifts the data culture, technology, and architecture:
      • from centralized collection and ownership of data to domain-oriented connection and ownership of data
      • from data as an asset to data as a product
      • from proprietary big platforms to an ecosystem of self-serve data infrastructure with open protocols
      • from top-down manual data governance to a federated computational one.
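The "data as a product" shift above can be made concrete with a minimal sketch of a data-product contract. This is an illustrative sketch only; the names (DataProduct, OutputPort, the SLO keys) are hypothetical and not from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a domain-owned "data as a product" contract:
# each domain team publishes its analytical data behind a discoverable
# output port with an explicit schema and quality guarantees, instead
# of handing raw tables to a central data team.

@dataclass
class OutputPort:
    name: str       # e.g. "orders_daily"
    schema: dict    # column name -> type; the published contract
    protocol: str   # open access protocol, e.g. "sql" or "parquet+s3"

@dataclass
class DataProduct:
    domain: str     # owning domain, e.g. "checkout"
    owner: str      # accountable domain team
    output_ports: list = field(default_factory=list)
    slo: dict = field(default_factory=dict)  # e.g. freshness guarantees

    def publish(self, port: OutputPort) -> None:
        """Register a port so consumers can discover and query it."""
        self.output_ports.append(port)

# A domain team (not a central data team) owns and serves its data:
orders = DataProduct(domain="checkout", owner="checkout-team",
                     slo={"freshness_hours": 24})
orders.publish(OutputPort("orders_daily",
                          {"order_id": "string", "total": "decimal"},
                          protocol="sql"))
```

The point of the sketch is that ownership, schema, and service levels travel with the domain's data, which is what distinguishes a data product from a table dumped into a warehouse.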
    About Deep Data Research Computing Center:
    We bring in computer scientists' expertise to help biologists and biometricians scale their research. Over the years, the falling cost of sequencing has led to an overwhelming amount of biomedical data being generated. While this helps biologists and biometricians unlock deeper insights, they often find themselves lacking the time or knowledge to acquire, store, and analyze data in a scalable way.
    Our multidisciplinary team builds cost-efficient systems for research studies with large numbers of participants and different types of data, with the aim of collecting, storing, and analyzing data across multiple platforms while preserving fast turnaround times. By using robust de-identification and end-to-end encryption methods, we reduce the risk of violating privacy and security regulations.
    Our vision is that the ability to collect and utilize more data with ease will accelerate the transformation from conventional medicine to deep medicine. In conventional medicine, we are considered sick when our health status deviates from the population's average. Deep medicine reveals wide individual differences by investigating a variety of domains, and unlocks personalized insights to prevent disease and manage daily health.
    Connect with us:
    Website: deepdata.stanf...
    Twitter: @DeepDataSU
    LinkedIn: / deep-data-stanford

COMMENTS • 18

  • @ranwar123
    @ranwar123 1 year ago +12

    To some, Zhamak Dehghani is the Godfather of Data Mesh (or Godmother, if you want to be grammatically correct); the least Stanford could do was produce a better-quality video.

  • @quinterohenry
    @quinterohenry 1 year ago

    Great approach

  • @sunilpipara
    @sunilpipara 2 years ago +3

    Good one

  • @theBATfamiliar
    @theBATfamiliar 1 year ago +4

    What a superb and talented person Zhamak Dehghani is, and she is very beautiful as well.

  • @this-is-bioman
    @this-is-bioman 9 months ago +2

    It's 2024. What kind of sound quality is this?

  • @quinterohenry
    @quinterohenry 1 year ago

    Is it possible to build this kind of infrastructure starting from a silo architecture?

    • @ericgeorge7874
      @ericgeorge7874 1 year ago +1

      Yes, after a lot of data (re)modeling and deciding/agreeing which domains are the owners. Then implement it technically (which was not covered in this theoretical video): where each domain's data will reside and how each will be accessed relationally.

    • @riwajbhattarai3977
      @riwajbhattarai3977 1 year ago +1

      Data mesh restructuring requires multi-million-dollar investments, depending on the size of the enterprise, its previous architecture, and its tech-adoption readiness. A siloed architecture means gatekeepers: either one central IT team handles data access and distribution, or each team has its own independent, decentralized system and you must ask each system's gatekeeper for the data you need. Data mesh instead promotes the idea that, rather than having independent data systems or one central IT team for access and distribution, data should move to decentralized ownership and management, with shared governance and interoperability.
      In simple terms, data mesh says: let's bring all the data from these multiple siloed or centrally IT-owned systems into something like a data lakehouse, and then let the teams who need that data take whatever they need themselves. Now come the domain product owners. From an enterprise standpoint, a data product owner who owns the Customer domain may have 200 attributes in their customer master data. For a company like Meta, there can be millions of advertiser identifiers, because so many people advertise; ad impressions can be a table with hundreds of millions of rows. The advertising-analytics team says that while a lot of data may be useful, there are 20 attributes it absolutely needs to function. These may consist of certain attributes of the Customer domain, certain attributes of Product, certain attributes of the Marketing and advertising tables, and so on.
      The traditional silo system makes this a very difficult and slow process: the advertising-analytics team has to go through multiple silos to get all the data it needs. With data mesh, this problem is eliminated, because the data is egressed into a lakehouse and exposed via federated queries (SQL/SparkSQL), allowing the analytics team to take the data it needs.
      The data product owner's job here is to bring all the data the team needs from the source systems into the local system and maintain it so that it is ready to be consumed by the advertising-analytics team.
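The workflow this reply describes, where domains keep ownership but a consuming team selects only the handful of attributes it needs via a federated query, can be sketched with sqlite3 standing in for a lakehouse query engine. All table and column names here are invented for illustration.

```python
import sqlite3

# Stand-in for a lakehouse with federated SQL: two domain-owned tables
# (Customer and Advertising) live side by side; the analytics team
# selects only the attributes it actually needs, without asking a
# gatekeeper team to hand-extract them.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (customer_id TEXT, segment TEXT, region TEXT)")
con.execute("CREATE TABLE ad_impressions (customer_id TEXT, campaign TEXT, clicks INTEGER)")
con.execute("INSERT INTO customer VALUES ('c1', 'smb', 'emea'), ('c2', 'enterprise', 'amer')")
con.execute("INSERT INTO ad_impressions VALUES ('c1', 'spring', 3), ('c2', 'spring', 7)")

# The advertising-analytics "consumer" joins across domains and keeps
# only the columns it needs (segment and clicks), ignoring the rest.
rows = con.execute("""
    SELECT c.segment, SUM(a.clicks) AS clicks
    FROM customer c JOIN ad_impressions a USING (customer_id)
    GROUP BY c.segment ORDER BY c.segment
""").fetchall()
print(rows)  # [('enterprise', 7), ('smb', 3)]
```

The design point is that the consumer writes the query itself; no central team sits between the domain data and the analytics use case.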

  • @iblaine
    @iblaine 1 year ago +4

    A Data Mesh is a collection of subject-specific data warehouses; there's no technology behind the idea. Unpopular opinion perhaps, but I think the Data Mesh exists to bill consulting hours.

    • @alw015
      @alw015 1 year ago +1

      Spot on!

    • @centerfield6339
      @centerfield6339 9 months ago

      Agreed.

    • @morespinach9832
      @morespinach9832 9 months ago

      So what, in your mind, is the solution to a very real enterprise problem?

    • @iblaine
      @iblaine 9 months ago +2

      @@morespinach9832 Advancing the data-management industry ought to be done with ideas closely coupled with technology. Good examples are star schemas and event-driven architecture: both are ideas backed by intuitive, strict technical requirements.
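The "ideas coupled with technology" point can be made concrete: a star schema is a specific, checkable structure, one fact table keyed to dimension tables. A minimal sqlite3 sketch (all table and column names invented for illustration):

```python
import sqlite3

# Minimal star schema: a central fact table (fact_sales) referencing
# two dimension tables (dim_product, dim_date) by surrogate keys.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT)")
db.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER)")
db.execute("CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, amount REAL)")
db.execute("INSERT INTO dim_product VALUES (1, 'books'), (2, 'games')")
db.execute("INSERT INTO dim_date VALUES (10, 2023), (11, 2024)")
db.execute("INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 11, 7.5), (2, 11, 2.5)")

# The canonical star-schema query: slice the fact by dimension attributes.
result = db.execute("""
    SELECT d.year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON f.product_id = p.product_id
    JOIN dim_date d ON f.date_id = d.date_id
    GROUP BY d.year, p.category ORDER BY d.year, p.category
""").fetchall()
print(result)  # [(2023, 'books', 5.0), (2024, 'books', 7.5), (2024, 'games', 2.5)]
```

The strict requirement the commenter alludes to is structural: facts carry measures and foreign keys, dimensions carry descriptive attributes, and every analytical query follows the same join pattern.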

  • @napalm5
    @napalm5 1 year ago

    No connection to data mesh group?

  • @centerfield6339
    @centerfield6339 9 months ago

    This doesn't seem novel. Only the author (and her rather kind colleagues at ThoughtWorks) seem to be promoting it as a concept, for self-advancement reasons.

  • @James-l5s7k
    @James-l5s7k 1 year ago +1

    LOL STANFORD! Wallow in your disgrace, you were caught cheating too much!

  • @alw015
    @alw015 1 year ago +3

    I really envy people who can create a presentation about nothing, write a book, and talk without conveying anything new besides a few slogans.