The channel is so underrated. Definitely deserve more subscriptions!
EXACTLY what I had been looking for! Just the perfect combination of technical and beginner terminology.
Thank you! Looking forward to watching the rest.
A simple yet enhanced overview of distributed databases. It was hard to find a really good tutorial on distributed DBs. This video was perfect.
Thank you so much!!! You're the best! I'm an auto mechanic (with only client/user knowledge in computers) with ADHD going to school for IT. Your explanation makes it easy for me to picture how it works since my neurodivergent brain understands video and analogies much more than text-dominant courses.
Clear, understandable explanation for non-IT people as well!
thanks for explaining all of this. All your lessons have been very insightful for newbies like me.
Glad to hear that!
This was an absolutely stellar explanation. So clear and concise. During the video I was coming up with questions in my head that were immediately answered in the next clip 👍
Very well processed!
Clear and easy to follow!
Thanks again
Just what I was looking for! I really like your content and I hope you'll make more of this 😊
if there was such a thing as too useful, this would be it. thanks a lot brother
I liked it. Good work.
Thank you so much for such a clear and nice explanation.
Thanks for the content...it really helped me to understand what distributed DBs are
Hello Michael -
Thanks for taking time and explaining the content in a concise manner.
I have two quick questions:
1. In big compute databases, why can't each individual node have some redundancy to provide fault tolerance?
2. In highly available databases, we are basically doing vertical scaling on each individual node, but then we run into the first problem again: there is a limit to how much hardware can be added. How is this scenario handled in this context?
1. In theory, it definitely can! And I _believe_ that BigQuery actually does have redundancy built in under the hood. However, in practice, since "high availability" isn't always a priority, they often don't.
2. Great question! This is where things get tricky -- in general, they'll keep partial copies of the data on each node instead of a full replica. However, this adds a layer of complexity, since the nodes then need to manage and "know about" which slices of the data are stored on which nodes in the case of a network partition.
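To make the "know about which slices live where" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular database does it: the round-robin placement rule and the names `build_placement`, `route_read`, and the node labels are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def build_placement(num_slices: int, nodes: List[str], copies: int) -> Dict[int, List[str]]:
    """Assign each slice to `copies` consecutive nodes (hypothetical round-robin rule)."""
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and the number of nodes")
    return {
        s: [nodes[(s + k) % len(nodes)] for k in range(copies)]
        for s in range(num_slices)
    }


def route_read(placement: Dict[int, List[str]], slice_id: int, down: Set[str]) -> Optional[str]:
    """Return the first reachable node holding the slice, or None if every copy is unreachable."""
    for node in placement[slice_id]:
        if node not in down:
            return node
    return None


# Three nodes, six slices, two copies of each slice (a partial replica per node).
placement = build_placement(num_slices=6, nodes=["a", "b", "c"], copies=2)

# Slice 0 lives on nodes a and b, so losing a still leaves b to serve it.
survivor = route_read(placement, 0, down={"a"})

# Slice 1 lives on nodes b and c, so losing both makes it unavailable.
lost = route_read(placement, 1, down={"b", "c"})
```

The `placement` map is the extra bookkeeping: with a full replica on every node there is nothing to look up, while with partial copies every read first has to find out who holds the slice.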
@databasesdemystified7747 A doubt about the second answer: how is the "manage and know about" layer different from the big compute layer? Isn't it the same solution?
Very informative and succinct
cool video, was really helpful!
thank you for sharing. It helped me a lot to understand distributed databases
Great content!!!
last question... is shared disk the same as the big-compute paradigm, and shared nothing the same as the high-availability paradigm???
The converse, actually -- shared nothing is more amenable to big compute as each node is just responsible for processing its own data. Shared everything is for high availability -- it means that each node has a full copy of the data, so if another node goes down the service keeps functioning.
Also it's important to think of these concepts along a spectrum with the amount of shared data on the nodes ranging from all to none but also with lots of possibilities in between.
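A toy Python sketch of the two ends of that spectrum, as described in the reply above. Everything here is an assumption made for illustration (the ten-row table, the three nodes, and the helper names); it is not taken from any real engine.

```python
from typing import Dict, FrozenSet, List

rows = list(range(1, 11))  # toy table holding the values 1..10


def partition(rows: List[int], num_nodes: int) -> Dict[int, List[int]]:
    """Shared nothing: each node owns a disjoint slice of the rows."""
    return {n: rows[n::num_nodes] for n in range(num_nodes)}


def replicate(rows: List[int], num_nodes: int) -> Dict[int, List[int]]:
    """Other extreme: every node holds a full copy of the rows."""
    return {n: list(rows) for n in range(num_nodes)}


def scatter_gather_sum(nodes: Dict[int, List[int]], down: FrozenSet[int] = frozenset()) -> int:
    """Each reachable node sums only its own slice; a coordinator adds the partial sums."""
    return sum(sum(data) for n, data in nodes.items() if n not in down)


def any_node_sum(nodes: Dict[int, List[int]], down: FrozenSet[int] = frozenset()) -> int:
    """With full copies, any single reachable node can answer alone."""
    for n, data in nodes.items():
        if n not in down:
            return sum(data)
    raise RuntimeError("no reachable nodes")


shared_nothing = partition(rows, 3)
full_copies = replicate(rows, 3)

complete = scatter_gather_sum(shared_nothing)                       # all nodes up
incomplete = scatter_gather_sum(shared_nothing, frozenset({1}))     # node 1's slice is missing
still_served = any_node_sum(full_copies, frozenset({0, 1}))         # one survivor is enough
```

In the partitioned layout the work is split three ways, but losing node 1 silently drops its rows from the total. In the replicated layout one surviving node still returns the full answer, at the cost of storing the table three times.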
Perfect explanation :)
Insightful 👍🏻
Thank you so much! really excellent video 😍😍👍👍
Great explanation! Thank you! :)
also, are distributed databases the same as MPP?
Not exactly. MPP just means Massively Parallel Processing. It could be on a distributed database or a non-distributed database. Many MPP systems also store data in columns rather than rows, though that is a separate design choice.
amazing thanks
Thanks
THANK YOU
YOU ARE WELCOME