Merkle Tree with real world examples

Поділитися
Вставка
  • Опубліковано 27 лип 2019
  • Merkle Trees are used in popular software applications like Git, Amazon Dynamo DB and BlockChain. A merkle tree is a metadata-structure, which stores the hash of the combined children in each parent. Any change in a child requires a change in the parent node.
    This property of identifying state changes is useful when validating the correctness of data, as calculating the inverse of a hash is computationally infeasible. The merkle tree's properties are suitable for applications where a malicious user must be stopped from changing data.
    Git file creation details:
    Git registers any change in a file as an entirely new file. It stores the content of the file with a filename of SHA1(fileContents). Let's say this SHA1(fileContents) = x. If x already exists in the git file system, it doesn't go ahead and create it. This avoids unnecessary recreations when metadata about a file is changed.
    When a file is modified in Git, git creates a new (changed)file, and all it's changed parents. The root subsequently changes. It's like a branch of a Persistent Data Structure. If you want to go back a commit, you use the root of the previous commit, which still points to the old file.
    In this way, Git takes snapshots of every point of change in the project. Perfect for a version control system.
    It is also useful to find the points at which the data has changed, using a Merkle Tree. Amazon Dynamo DB uses merkle trees to reduce entropy (Anti-entropy technique) when a new node is added to the Dynamo DB cluster.
    Amazon Dynamo DB data migration details:
    When copying a range assigned to a node, we have to allow concurrent updates to keep the service available. But when the migration is in progress, concurrent updates lead to stale data in the destination node.
    So we copy the data range multiple times. When there is no change in the two nodes, we declare the destination node as consistent. Since we want this to happen as fast as possible, we try to minimise the number of copy iterations. The procedure is explained in the video.
    System Design Playlist: • System Design for Begi...
    Become a channel member!
    / @gkcs
    References:
    Git: stackoverflow.com/questions/4...
    Consistent Hashing: • What is CONSISTENT HAS...
    NoSQL: • Introduction to NoSQL ...
    Amazon DB: www.allthingsdistributed.com/...
    Anti Entropy without Merkle Trees: haslab.uminho.pt/tome/files/d...
    You can follow me on:
    Facebook: / gkcs0
    Quora: www.quora.com/profile/Gaurav-...
    LinkedIn: / gaurav-sen-56b6a941
    Twitter: / gkcs_

КОМЕНТАРІ • 152

  • @AndyDavid1970
    @AndyDavid1970 2 роки тому +5

    Brilliant! Really enjoyed that - quick delivery, but engaging and lively. Already looking forward to more from this knowledgeable and gifted communicator.

  • @prabhuwali9
    @prabhuwali9 4 роки тому +29

    "When you change the contents of the file the hash should also change", it's called Avalanche effect. Even a minute change in the input causes a very drastic change in the output, a very important characteristic of a a good cryptographic algo.

    • @KejriwalBhakt
      @KejriwalBhakt Рік тому +1

      Something like Butterfly effect?

    • @anguswett
      @anguswett Рік тому +2

      @@KejriwalBhakt yes, good analogy

  • @rehmanalimomin6569
    @rehmanalimomin6569 4 роки тому +8

    And the best part is... "Ye to apne jaisa banda hain yaar".
    Hence we all are able to absorb more from your videos, cause we are able to get sync in your explanation, which is kinda rare.

    • @gkcs
      @gkcs  4 роки тому +2

      😁

  • @royarijit998
    @royarijit998 4 роки тому +5

    This was by far the best explanation for Merkle Trees.. Thanks for making it easy, Gaurav!!!
    It feels really awesome when you make me understand a new topic really well...

    • @gkcs
      @gkcs  4 роки тому +1

      Thank you 😁

  • @ctjanney
    @ctjanney 3 роки тому +3

    Great explanation. Thank you. I work in visual effects, and these is a great way to track the version of all assets that make the final frame going to out to client.

  • @MrDroy13
    @MrDroy13 4 роки тому +2

    The video was great as usual. Keep making such amazing content. It took me some good amount of reading and coding to understand the implementation of this.

    • @gkcs
      @gkcs  4 роки тому +1

      Thanks 😁

  • @subarukun8001
    @subarukun8001 4 роки тому

    I like how you share your passion for problem solving although several things you say I do not understand 100% makes me excited to know more

    • @gkcs
      @gkcs  4 роки тому +1

      Hahaha, thanks!

  • @Meethu69
    @Meethu69 2 роки тому

    OMG your voice is so soothing to listen to ...feels like a friend is explaining

  • @RaunaqSoni96
    @RaunaqSoni96 4 роки тому

    Looking sharp man. Great video as always.

  • @Mikesco3
    @Mikesco3 3 роки тому

    Excellent video, great job explaining the information through examples.

  • @syrym_the_professor
    @syrym_the_professor Рік тому +1

    It is nice having GOOD comments, but BAD comments are also required. My personal recomendation is slow down a little bit, explain step by step, because as not native american it is hard to understand FAST. I needed some time, but anyway thank you for video.

  • @clarangelical.618
    @clarangelical.618 11 днів тому

    thank you so much! loved the energy and delivery.

    • @gkcs
      @gkcs  10 днів тому

      Thank you!

  • @jainsvnit
    @jainsvnit 4 роки тому +2

    Last month only wrote Merkle Tree library for blockchain application. I was not aware of usage of Merkle Tree in DynamoDB. Loved the video.

  • @BillBurrComedy
    @BillBurrComedy 3 роки тому

    Very informative! Keep up the good work!

  • @roman-still
    @roman-still 6 місяців тому

    You're great at explaining things bro!

  • @siddardhakatiki
    @siddardhakatiki 4 роки тому

    You have been really amazing. Thank you for all the content. :)

    • @gkcs
      @gkcs  4 роки тому +1

      Thanks 😁

  • @AbhisarMohapatra
    @AbhisarMohapatra 4 роки тому +6

    Nice one. Merkel trees are used in Dynamo category of databases and especially in Cassandra to handle something called Anti-Entropy(Basically to repair inconsistency during replication). Actually, even those handle anti-entropy using three strategies 1. Hinted Handoff 2.Read Repair and finally in background Merkel Tree kicks in for a wholistic repair

    • @gkcs
      @gkcs  4 роки тому +1

      Yup, I read an article on this and found it brilliant. Thanks for mentioning it 😁

    • @AbhisarMohapatra
      @AbhisarMohapatra 4 роки тому

      @@gkcs Yes .Especially the part where inconsistencies are handled in different ways in a different path. Read, write and then Whole.

  • @SaurabhKumar-jk4nl
    @SaurabhKumar-jk4nl 4 роки тому

    Thank you, You make everything easy :D

  • @vijaykrishnamenon
    @vijaykrishnamenon Рік тому

    Very Nice. Well explained! Keep up the gr8 job ...

  • @shivam2kumar988
    @shivam2kumar988 4 роки тому

    Hey man, I was just thinking of such a data structure. Thanks for introducing it to me!

    • @gkcs
      @gkcs  4 роки тому

      :D

  • @murtuza.chawala
    @murtuza.chawala 3 роки тому

    Superb, keep on the great work !

  • @sicktastictv
    @sicktastictv Рік тому

    Great explanation!

  • @rahulpadalkar6237
    @rahulpadalkar6237 4 роки тому +3

    Hi gaurav, as always a great video. Just wanted to ask, what is your video making process. Like how much do you read, try out things practically before making a video on that topic?

    • @gkcs
      @gkcs  4 роки тому +2

      Hey Rahul,
      I think I'll make a video on this sometime 🙂

  • @mounikachillamcherla4328
    @mounikachillamcherla4328 4 роки тому

    Love the way you explain :)

    • @gkcs
      @gkcs  4 роки тому +1

      Thank you 😁

  • @lingeshvirinmoonsamy8692
    @lingeshvirinmoonsamy8692 4 роки тому

    Awesome explanation Gaurav. Keep up the great content. 👍

    • @gkcs
      @gkcs  4 роки тому +1

      Thank you 😁

  • @mohammednagdy6661
    @mohammednagdy6661 5 місяців тому

    Great explanation.

  • @rajmiglani4159
    @rajmiglani4159 4 роки тому

    Hi Mate,
    Had been watching your vedios and they seem great. Interestingly these have aroused in me a desire to learn software architecture. Just wanted to know any suggestions of any particular course or book to begin with?

  • @RockHudrock
    @RockHudrock 3 роки тому +2

    Time for a new dry-erase marker 😆
    Awesome vid!

  • @franciscobaralle124
    @franciscobaralle124 9 місяців тому

    amazing explanation

  • @AlgoHacks
    @AlgoHacks 4 роки тому

    One of the best videos.

  • @ishansharma2214
    @ishansharma2214 4 роки тому +3

    Keep making such videos

  • @JavaAidTutorials
    @JavaAidTutorials 4 роки тому

    Awesome content bro..!!

    • @gkcs
      @gkcs  4 роки тому

      Thank you!

  • @channel8048
    @channel8048 Рік тому

    Great video!

  • @MagicSwordYT
    @MagicSwordYT 3 роки тому

    great interview questions

  • @RAJESH2010able
    @RAJESH2010able 4 роки тому

    Thanks, Gaurav for sharing. I have a question regarding System Desing I think you might be able to help. How to go about designing a web portal which shows most reposted posts in the last 5 minutes 10 minutes and 20 minutes?

  • @singhashutoshk
    @singhashutoshk 4 роки тому

    Keep up the good work man...

    • @gkcs
      @gkcs  4 роки тому +1

      😁

    • @singhashutoshk
      @singhashutoshk 4 роки тому +1

      Really... I learn so much from you and you really encourage me

  • @jayeshborgaonkar9166
    @jayeshborgaonkar9166 4 роки тому

    great video,.. keep going

  • @shobhanksharma65
    @shobhanksharma65 3 роки тому

    From your ring diagram of dynamodb use case, it looks like the synchronization is happening for any nodes. As I understand from dynamodb paper, the use case for merkle tree is for replication synchronization, so only for replicating nodes. Is that right?

  • @joaoluismoraes7215
    @joaoluismoraes7215 Рік тому

    I have a healthcare SaaS that deals with Electronic Medical Records, patients' data etc... Would it be a good idea to use Merkle trees ensure integrity of the data (pdf files, data...)?

  • @zephyrus.9
    @zephyrus.9 4 роки тому

    What a video. Can you explain lazy indexing in Cosmos DB?

  • @ARCHITSANGHVI13
    @ARCHITSANGHVI13 4 роки тому +4

    Hey Gaurav, your videos really helped me with my interviews these days. Recently in Morgan Stanley's interview, the interviewer asked me about what consistent hashing is and why would you use it. As I also did a project on Distributed Hash Key-value store, system design interview went for almost 45 minutes. One question which bothered me in the interview was, he asked me what if one of your clients comes up with 1lakh insert queries, as you are using consistent hashing, the query will be directed to only one of the server depending on the hash value. So just to get away from that question, I said that I will use consistent hashing for read queries only, as we will get results from the cache and it would be effective, but for insert query, I will distribute it on random servers. Can you please help me with the proper answer.

    • @gkcs
      @gkcs  4 роки тому

      Have a look at how Cassandra does it.
      Hint: Quorum.

    • @rohitsharma-vf2tk
      @rohitsharma-vf2tk 3 роки тому

      Hey Archit.wont the consistent hash uniformly distribute the inserts ?

    • @abhishekkarn8918
      @abhishekkarn8918 2 роки тому

      W=n, R=1 in Quorum consensus will solve it

  • @yongyang7631
    @yongyang7631 4 роки тому

    Thanks for sharing! I have a question, how do you handle hash conflicting in Merkle tree? saying two different values are hashed to a same hash value?

    • @gkcs
      @gkcs  4 роки тому

      What's the chance of that?

  • @ronbarbieri7713
    @ronbarbieri7713 4 місяці тому

    Thank you!

    • @gkcs
      @gkcs  4 місяці тому

      You're welcome!

  • @jeevithar9648
    @jeevithar9648 2 роки тому

    How Merkle Hash Tree performs better when compared to other tree structures?

  • @arpanjain3560
    @arpanjain3560 4 роки тому

    And in case of Dynamo DB. How the data is going to be transfered. Does it going to be transfered after comparison of two nodes Merkle Trees.

  • @arpanjain3560
    @arpanjain3560 4 роки тому

    Really great. But yeah I want a video on BlockChain using mecle tree

  • @hussainshaikh5615
    @hussainshaikh5615 Рік тому

    is it possible to generate merkle tree with rootHash
    i meant to get the values of merkle tree using rootHash ?

  • @vikasamar
    @vikasamar 4 роки тому

    Great explanation! In the Nosql consistent hashing example, as the updates are continuous - wouldn't the merkle tree always be different between the two nodes? How would the system converge?

    • @gkcs
      @gkcs  4 роки тому +2

      One way would be to block updates for a short while when the difference between them is small. The sync will finish quickly and then updates can resume.

    • @johnyepthomi892
      @johnyepthomi892 Рік тому

      @@gkcs temporary blocking and sparse Merle tree for transfer. Thanks Gaurav.

  • @AbhishekSura
    @AbhishekSura 4 роки тому

    As we consider git as a tree data structure, will that use dfs or bfs?

  • @BipinOli90
    @BipinOli90 4 роки тому

    awesome

  • @bhaskarsuthar7600
    @bhaskarsuthar7600 4 роки тому +1

    Thanks Gaurav for explanation. One request, can you create separate system design video on git system design considering most powerful features of git ? This was asked to me in one of recent interview, I explained with DAG (as I was not aware about Markel Tree :P ) but it was not appropriate though.

    • @gkcs
      @gkcs  4 роки тому +2

      I have done a lot of research on the git system, so it's possible...
      I'll take a poll and see if it's what the viewers want 😁

    • @bhaskarsuthar7600
      @bhaskarsuthar7600 4 роки тому

      @@gkcs : Sure !! Thanks :)

  • @abhijeetrastogi120
    @abhijeetrastogi120 4 роки тому

    Great explaination :)

    • @gkcs
      @gkcs  4 роки тому

      Thanks 😁

    • @abhijeetrastogi120
      @abhijeetrastogi120 4 роки тому

      @Gaurav have you ever implemented a merkle tree from scratch ? Lets say even for a 1D array with 5 elements

  • @meetzaveri734
    @meetzaveri734 4 роки тому +2

    What if React's Virtual DOM would be using this data structure behind the scenes like for updating the HTML tree for small UI change? It has react conciler measuring diffs only

    • @gkcs
      @gkcs  4 роки тому +3

      Sounds interesting, and I have a feeling something like this is used in DOMs...

    • @indiansoftwareengineer4899
      @indiansoftwareengineer4899 4 роки тому

      @@gkcs If you find out then please share, I am working on react and it seems nice.
      your videos are really helping. also, System Design playlist is best on youtube.

  • @arpanjain3560
    @arpanjain3560 4 роки тому

    And how do we figure out which Parent Merkle tree has to be compare with which Parent Merkle tree of another node.

  • @ILike13acon
    @ILike13acon 4 роки тому

    If you only take 1/10 values for the Merkle tree, do you have to do this 9 more times to check the other values? And then repeat this looped process until the data has stopped updating? Great video by the way!

    • @gkcs
      @gkcs  4 роки тому

      We take all the changed values, but that doesn't stop the data at the source being updated again. Which would need another round of copying to fix. But that one might have updates too, which would need another round. And so it goes...

  • @mr.carkid5600
    @mr.carkid5600 4 роки тому +1

    Hey gaurav, make a video on merkle tree blockchain. thanks in advance , your videos are very helpful thanks in advance. great work keep going.

  • @pushkarkumar5118
    @pushkarkumar5118 4 роки тому

    Just want to say thank you.

    • @gkcs
      @gkcs  4 роки тому +1

      😁

  • @harshraj22_
    @harshraj22_ 4 роки тому

    Great video. Liked the github part.

    • @gkcs
      @gkcs  4 роки тому +1

      Thanks!

  • @br4676
    @br4676 4 роки тому

    very nice

  • @haryk5068
    @haryk5068 4 роки тому

    hi gaurav, what are benefits of joining the channel for 4.99/month? do channel members get access to more videos?

  • @mentasuavesuave01
    @mentasuavesuave01 Рік тому

    Hello from Spain, I love your lessons but as a non-native English speaker understanding different accents is really challenging.
    It will be great if you could subtitle the lessons.
    Thank you for your time and effort.

    • @gkcs
      @gkcs  Рік тому

      Thank you! I'll try and subtitle the videos 👍

  • @agentNirmites
    @agentNirmites 4 роки тому +3

    These videos you upload are awesome.
    But why don't you upload videos of programming? I mean real time programming, not just ppts...

  • @RandomGuy-pr7gt
    @RandomGuy-pr7gt 4 роки тому

    Good content Gaurav. I’m curious how git handles hash collisions, when changes of a file generate same hash. My naive idea is adding timestamp to the content and generating the hash.

    • @gkcs
      @gkcs  4 роки тому +1

      It doesn't. If they match, the file is left unmodified.
      The chances of that happening are lesser than an asteroid blowing Earth apart soon. So I feel rather safe 😛

    • @indiansoftwareengineer4899
      @indiansoftwareengineer4899 4 роки тому

      @@gkcs best answer I ever have seen with an analogy. I am gonna use it in Interview. :P

  • @kaushikdr
    @kaushikdr 4 роки тому

    So the directory above is pointing to the hash, which is a pointer? I thought the hash was just an encrypted version of the file; does it also act as a pointer?

    • @gkcs
      @gkcs  4 роки тому

      The hash tells Git the name of the file, so it is a address/pointer for this 'content addressable system'.

  • @gouravsingh9146
    @gouravsingh9146 4 роки тому +9

    Oh yeah ! Well groomed *gaurav* for first time.
    #ourtylerdurden.

    • @gkcs
      @gkcs  4 роки тому +2

      Hahaha!

  • @sandeepmerugu5472
    @sandeepmerugu5472 Рік тому

    Is it used in application like dropbox too?

  • @rudreshmehta6870
    @rudreshmehta6870 4 роки тому

    yes i want to see merkle tree in blockchain use case

  • @alterthepath
    @alterthepath 3 роки тому

    What soccer team does your shirt represent?

  • @kerron_
    @kerron_ 3 роки тому

    good video

  • @nishanksoni1604
    @nishanksoni1604 4 роки тому

    whiteboard text is not visible @Guarav in 1020p streaming also

  • @ashishkumargupta316
    @ashishkumargupta316 4 роки тому

    Gourav , actually I face difficulty in building up the recurrence relation
    For the problem. Could you suggest something to improve this or do you have any trick for this ???

    • @gkcs
      @gkcs  4 роки тому

      Which recurrence relation? Th subtrees of each node?

  • @jayshree9037
    @jayshree9037 4 роки тому

    a malicious person like me😂
    Thanks for the vid, tbh never even heard of merkle tree.

    • @gkcs
      @gkcs  4 роки тому +1

      Very malicious 😛

  • @lukerhoads
    @lukerhoads 3 роки тому

    best videos on youtube

  • @souvik1983
    @souvik1983 4 роки тому

    good job

    • @gkcs
      @gkcs  4 роки тому

      Thank you!

  • @prakyg
    @prakyg 2 роки тому

    How can a hash point to a file?

  • @jonasliao3780
    @jonasliao3780 4 роки тому

    thanks for the video! the resolution seems low for the texts on the whiteboard

    • @gkcs
      @gkcs  4 роки тому +1

      Yes, I've fixed that in the last two videos 😁

  • @shubhambansal5487
    @shubhambansal5487 4 роки тому

    Nice info, but there is one issue i.e., most of the times your camera goes out of focus. Please try to rectify it.

    • @gkcs
      @gkcs  4 роки тому

      Yeah I think I'll fall back to the phone cam because this is so persistent. Thanks for the feedback 😊

  • @itsawildworldisntit
    @itsawildworldisntit 8 місяців тому

    so it's a type of content addressable storage.

  • @wuddddd
    @wuddddd 4 роки тому +1

    Strongly request for the subtitles.

    • @gkcs
      @gkcs  4 роки тому

      Hoping for some subtitles from the community!

  • @rajatrao5632
    @rajatrao5632 4 роки тому

    where do you learn from ?...as this topic which you teach us🤗?

    • @gkcs
      @gkcs  4 роки тому +1

      There are some links in the description 😁

  • @visheshsrivastav2107
    @visheshsrivastav2107 4 роки тому

    Can merkel trees be used for effective datamasking in big data?

    • @gkcs
      @gkcs  4 роки тому

      Can you elaborate with an example?

    • @visheshsrivastav2107
      @visheshsrivastav2107 4 роки тому

      @@gkcs Like if i want my data to masked but in way that it still makes sense.I would most probably shuffle the data for example like i am trying to mask first names of customer but i want them that in original data a person having name Sam has name Tom in masked entry but the problem with this shuffle is when data is updated or changed the same name can be mapped to two entries leading to inconsistent data masking.

    • @gkcs
      @gkcs  4 роки тому +1

      Well, masking using hashing is quite common. You shouldn't have any problems if you store x as sha1(x), except if you are looking to get back the original value of x from the mapped value.

  • @anonymousshooter3636
    @anonymousshooter3636 3 роки тому

    Initial I read it has Miracle Tree. Now after seeing the Video, we can call it "Miracle Tree". Loll

  • @kamalhm-dev
    @kamalhm-dev 4 роки тому

    I can't comprehend what you were saying in the intro "Hey Everyone this is .... CS"?

    • @gkcs
      @gkcs  4 роки тому +1

      Gkcs is my pen name 🙂

    • @kamalhm-dev
      @kamalhm-dev 4 роки тому

      @@gkcs I see! thanks

  • @kaushikdas417
    @kaushikdas417 4 роки тому

    @8:37 shouldn't it be mod of 4 instead of 400?

    • @gkcs
      @gkcs  4 роки тому

      The keyspace size is 400

  • @guillermo1874
    @guillermo1874 3 роки тому

    That marker sucks but the content is Great! That marker does not do justice 😅

  • @gouthamnagraj5445
    @gouthamnagraj5445 4 роки тому

    Can't this be solved using AKKA actor-based model?

    • @gkcs
      @gkcs  4 роки тому

      How?

    • @gouthamnagraj5445
      @gouthamnagraj5445 4 роки тому

      @@gkcs I think I replied when the there was a autoplay running 😂😂

    • @gkcs
      @gkcs  4 роки тому

      @@gouthamnagraj5445 Hahaha cool 😛

  • @manieshsh
    @manieshsh 4 роки тому

    Hey Gaurav, Can you advise how can I get a job in US from India. I have 6 years of experience, in Datawarehouseing initially and now in Big Data Analytics. I love programming and I am proficient in SQL and Python/Scala. Would appreciate your opinion on it.
    PS : Love what you do brother.😘

  • @Adi-xb7fh
    @Adi-xb7fh 4 роки тому

    Hello sir i had a doubt regarding placements.
    I messaged you on LinkedIn but i think you didn't see it yet. Can you tell me the best forum to ask my doubt?

  • @prateekgurnani902
    @prateekgurnani902 4 роки тому

    Bro, I love your channel. You are doing a great job. But please don't associate yourself with that idiot AlgoExpert Guy. His company's terms and conditions are as terrible as hell. Bought his course yesterday and now I am regretting since I can't get my money back. I know that you have a partnership with him or whatever but companies like AlgoExpert who don't even refund money to their customers on the same day of buying even if customers are not satisfied with their product are a big shame. Keep on making good content !!!!

  • @bhaktirs1668
    @bhaktirs1668 3 роки тому

    It's a patch not a pack. Nice video though

  • @Jaggedleafcollectibles
    @Jaggedleafcollectibles Рік тому

    Watch this video at 0.75x speed.. Thank me later

  • @rudreshmehta6870
    @rudreshmehta6870 4 роки тому

    Too

  • @rudreshmehta6870
    @rudreshmehta6870 4 роки тому

    Congrats for job at uber

  • @SereneMornings
    @SereneMornings 4 роки тому

    Little less explauanation

  • @brianjester123
    @brianjester123 2 роки тому

    Great explanation!