Understanding HDFS using Legos

  • Published 25 Aug 2024
  • You've been hearing about Hadoop and HDFS. How does it work?
    In this video, we use an innovative method to show how HDFS works with Legos. Jesse Anderson shows how HDFS handles files and replicates the data, then covers the read and write paths for the data. Finally, he talks about how HDFS handles failure scenarios and the importance of data locality.
    ** Hadoop & Apache Spark training from NewCircle: newcircle.com/...
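
    The steps the description lists (splitting a file into blocks, replicating each block, and re-replicating after a node fails) can be sketched as a toy model. This is only an illustration under stated assumptions, not HDFS code: the 64 MB block size and replication factor of 3 are the figures commenters mention, the function names are hypothetical, and the round-robin placement is a simplification standing in for whatever policy a real name node applies.

```python
BLOCK_SIZE_MB = 64   # assumed block size (the figure mentioned in the comments)
REPLICATION = 3      # assumed replication factor


def split_into_blocks(file_name, size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return (block_id, size_mb) pairs; only the last block may be smaller."""
    blocks = []
    offset = 0
    index = 0
    while offset < size_mb:
        length = min(block_size_mb, size_mb - offset)
        blocks.append((f"{file_name}#{index}", length))
        offset += length
        index += 1
    return blocks


def place_replicas(blocks, nodes, replication=REPLICATION):
    """Toy round-robin placement: each block lands on `replication` distinct nodes."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for start, (block_id, _) in enumerate(blocks):
        placement[block_id] = {
            nodes[(start + i) % len(nodes)] for i in range(replication)
        }
    return placement


def handle_node_failure(placement, failed, live_nodes, replication=REPLICATION):
    """Forget the failed node, then copy under-replicated blocks to live nodes."""
    for holders in placement.values():
        holders.discard(failed)
        candidates = [n for n in live_nodes if n not in holders]
        while len(holders) < replication and candidates:
            holders.add(candidates.pop(0))
    return placement


# Hypothetical four-node cluster storing a 150 MB "red" file.
nodes = ["node1", "node2", "node3", "node4"]
blocks = split_into_blocks("red", 150)
placement = place_replicas(blocks, nodes)
handle_node_failure(placement, "node3", [n for n in nodes if n != "node3"])
```

    In this model the surviving replicas are still readable right after the failure; the extra copy only restores the replication count, so a second failure would not leave a block with a single copy.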

COMMENTS • 111

  • @davidvella7141
    @davidvella7141 3 years ago +18

    This video was uploaded 5 years ago and it is still one of the best explanations I've ever seen.

    • @ThorstenStaerk
      @ThorstenStaerk 2 years ago

      the first explanation at all that could tell me what Hadoop has to do with MapReduce

  • @stackinglittlesats
    @stackinglittlesats 5 years ago +332

    Nice explanation man, if I could, I would buy you an air conditioner.
    You deserve it.

    • @SIRabhinav
      @SIRabhinav 5 years ago +7

      may the force be with you

  • @shaikabdussalaam5431
    @shaikabdussalaam5431 1 month ago

    You really have a "hands-on" approach to teaching this. :)

  • @taxatlanticinc6611
    @taxatlanticinc6611 6 years ago +7

    This is the best explanation I have seen yet! It's a lot more engaging and informative than the traditional PowerPoint! Thank you!!

  • @TheMrsStinsfire
    @TheMrsStinsfire 7 years ago +35

    11:06 R.I.P. Data Node 3

    • @supermonkey965
      @supermonkey965 5 years ago +4

      He was a good node, admired by his node friends.

    • @ElCuchu
      @ElCuchu 4 years ago +1

      I'm still crying, can't get over it, such a good node dude rippp

    • @parthnagdev
      @parthnagdev 3 months ago

      He is happy in Node Heaven and is saving all the replicated data he ever wanted to save.

  • @Coffingdw
    @Coffingdw 8 years ago +25

    Nice job Jesse. Very informative and creative. Thank you.
    TeraTom

  • @wisdomandpeace4897
    @wisdomandpeace4897 9 years ago +3

    Excellent video. I actually understand Hadoop somewhat after watching this video.

  • @maryoleary8660
    @maryoleary8660 3 years ago

    I love learning with Legos. Even watching it at 1.5x, I was able to follow along easily. Well done.

  • @MuzamilKhan-rl2sh
    @MuzamilKhan-rl2sh 4 years ago +1

    Wow man, you explain it in a creative way.

  • @thandekilenzungu7240
    @thandekilenzungu7240 3 years ago

    The explanation is so clear I understood everything

  • @ravianantharamaiah7567
    @ravianantharamaiah7567 3 years ago

    Excellent teaching. Conceptually things are very clear now. Thank you.

  • @olesyagorbacheva6991
    @olesyagorbacheva6991 8 months ago

    Thank you for such a good explanation!

  • @AdrianRodriguezWebDevelopment
    @AdrianRodriguezWebDevelopment 8 years ago +1

    This video just made my day! Thank you New Circle Training! And thank you Doug Cutting for sharing this video on Twitter.

  • @amitprakashpandeysonu
    @amitprakashpandeysonu 2 years ago

    Really nice and innovative way to teach HDFS concepts. Loved it and understood it very clearly. Thank you.

  • @haydo8373
    @haydo8373 6 years ago +6

    Superb, I had it running at 1.5x and it was still easy to follow! Thanks :)
    Can you explain every CS concept with Lego? - that would be amazing

  • @AhlamLamo
    @AhlamLamo 4 years ago

    Amazing explanation!! One of the best videos I've seen about HDFS.

  • @thndesmondsaid
    @thndesmondsaid 1 year ago

    Jesse! Great explanation as always.

  • @ThomasEhardt
    @ThomasEhardt 7 years ago

    Great introduction to HDFS!

  • @prohouse6088
    @prohouse6088 8 years ago

    Very nice teaching methodology, Jesse. Thanks for sharing.

  • @gustavogbfBR
    @gustavogbfBR 9 years ago

    Nice work. Really helped me to understand how HDFS works.

  • @yicai7
    @yicai7 4 years ago

    Great explanation! Voted!

  • @1234abcd2139
    @1234abcd2139 7 years ago +3

    Nice, illustrative way of teaching HDFS. It would have been wonderful if some more information had been given about the fallback mechanism of the name node or coordinator.

  • @dingman081130
    @dingman081130 7 years ago

    gorgeous presentation, thanks

  • @klausdupont6335
    @klausdupont6335 6 years ago

    Incredible illustration! Would love to see more on this topic in this form :-)

  • @CB-fz3li
    @CB-fz3li 4 years ago

    Nice clear explanation

  • @alaayari6391
    @alaayari6391 3 years ago

    thanks for the explanation

  • @abhijeet_r
    @abhijeet_r 6 years ago

    Very innovative presentation thanks a lot!

  • @Happymoon789
    @Happymoon789 5 years ago

    Thanks for your efforts! Smart display!

  • @MSDlublin
    @MSDlublin 7 years ago +1

    Very good work for beginners - THANKS A LOT!

  • @Guitarman01
    @Guitarman01 7 years ago +2

    Good presentation; however, I do have a question. Since the file is split across other nodes, doesn't replication also take place, so that if a node does go down, you can still retrieve the data? Node 3 went down, but I would have figured I could get it from another node. Does the master save a copy of all files as well? I didn't see how that works in the video.

  • @joecordingley7071
    @joecordingley7071 8 years ago +2

    This was great, thanks.

  • @bugs181
    @bugs181 9 years ago +3

    I'm just now learning about the methods used in distributed file systems. I'm an application developer and it's a bit difficult to wrap my head around the lower level storage systems like HDFS.
    This video explained replication in an easy-to-understand way. Now if only I could have one other BIG question answered.
    What kind of file system would we use if we want an application to use a virtualized file system stored over many nodes? For example, we want each node to add additional storage capacity.
    To the application layer, this would look like a single big storage drive but to the lower level facilities this would use network coordination to serve the files to the application.

  • @underlecht
    @underlecht 4 years ago

    looks like we have a perfect explanation!

  • @CarlosMercadoINIGTDY
    @CarlosMercadoINIGTDY 6 years ago

    Excellent video, thanks!

  • @i_e_she
    @i_e_she 6 years ago

    This was great, thank you! Should have more views.

  • @markhellel3371
    @markhellel3371 6 years ago

    Great Job Jesse! Nicely done! :-)

  • @mrdhksan
    @mrdhksan 5 years ago +1

    Excellent, thank you. A serious question: What would happen if two of the four nodes crash?

  • @nirupamaj6140
    @nirupamaj6140 7 years ago +1

    very informative, thank you

  • @kausaralam2605
    @kausaralam2605 8 years ago

    Great explanation!

  • @ahmedaj2000
    @ahmedaj2000 3 years ago

    Thanks 😊

  • @abhimanyukarkara4218
    @abhimanyukarkara4218 4 months ago

    Question: when we have to read from, let's say, the red file, would all three nodes process different chunks of the data simultaneously and give us a combined output, or would a single node process all of the data alone?

  • @manojprabhakar5522
    @manojprabhakar5522 4 years ago

    Awesome, thank you for the explanation. Could you please make videos on Spark with YARN and how the communication is handled?

  • @ulrikkallblad6698
    @ulrikkallblad6698 5 years ago +1

    Very nice video! Only one question: If node 3 is down, how can the data from node 3 be moved to the other nodes?

    • @forbin80
      @forbin80 4 years ago +1

      @@brianbitchballs3902 thanks for the great explanation BrianBitchBalls

  • @taharhalloub8721
    @taharhalloub8721 3 years ago

    Thanks a lot

  • @BabtaOfEinGedi
    @BabtaOfEinGedi 6 years ago

    Perfect! Thanks so much

  • @nocontentnoname5922
    @nocontentnoname5922 4 years ago +1

    Did we find who broke node 3 yet?

  • @jesusoliveros9950
    @jesusoliveros9950 6 years ago

    Amazing !!! great Job

  • @elwyndude
    @elwyndude 1 year ago

    If a node goes down, why does it need 3 replications to pull the data, could it not just read from the existing two?

  • @CosmeJunior
    @CosmeJunior 6 years ago

    Nice Job. Brazil thanks you!

  • @Gorlung
    @Gorlung 3 years ago

    What happens when you add a new, empty node?

  • @marflem12
    @marflem12 6 years ago

    Thank You

  • @joseenrique6723
    @joseenrique6723 2 years ago

    For the red file, are EACH of the replicas still 64 MB in size?

  • @AbhinavSingh-oq7dk
    @AbhinavSingh-oq7dk 3 years ago

    If a data node malfunctions, the name node instructs the remaining data nodes to create replicas of the files that the malfunctioning data node held. Why create another replica when there are two others already? I mean, they are there for backup, right? Do correct me if I am missing something. Thanks.

  • @TzGiwrgos15
    @TzGiwrgos15 7 years ago

    Brilliant!

  • @rimchatti3807
    @rimchatti3807 5 years ago +1

    Nice job, it is helping me get familiar with HDFS. I'm new to data engineering and so on. Could you please explain to me what a cluster is?
    Thanks.

  • @arisweedler4703
    @arisweedler4703 7 years ago

    Great explanation! I assume that another benefit of the HDFS is that reading large files will be quicker, because you would be able to effectively "BitTorrent" from your cluster. Does HDFS do that?

  • @wow376
    @wow376 5 years ago +1

    feel like buying Legos already!

  • @barefeg
    @barefeg 3 years ago +1

    What if Hulk smashes the name node?

  • @JackyA123
    @JackyA123 5 years ago

    You have absolutely no need to be nervous! Doing a great job here

  • @satwindersetia4367
    @satwindersetia4367 7 years ago +1

    Very creative, indeed.

  •  2 years ago

    Interesting, it's very similar to how Elasticsearch works

  • @buzz-uk
    @buzz-uk 5 years ago

    Hi,
    While setting up a pseudo-distributed or full cluster, do we need to format the data node with the HDFS file system, or do we only have to format the namenode?
    I am asking because I have read on many blogs that HDFS stores data sequentially on the hard disk and that it is an abstraction layer which stores data in big blocks rather than in the default block size provided by the host file system.
    If we do not format the datanode, then this powerful feature of HDFS will go to waste.

  • @mahdiamrollahi8456
    @mahdiamrollahi8456 3 years ago

    Hello, nice job,
    I have a question: as a file system, how can Hadoop manage a database file (like an MSSQL or MySQL file)? How can it divide an .MDF file into separate files to store them on different machines? Such files have metadata and overhead, and they are not like a basic txt file. Does Hadoop have a special system to treat each file type differently? Regards.

  • @shyland20
    @shyland20 6 years ago

    Why is S3 streaming through an embedded link slow (it gets stuck every 2-3 seconds) when embedded on a WP site? After understanding what you are saying, how can I improve the speed? I read something about HDFS, but I don't understand how it's related to S3, if at all. Thanks in advance.

  • @amni5tianone263
    @amni5tianone263 2 years ago

    tnx

  • @stivstivsti
    @stivstivsti 6 years ago

    thanx!

  •  6 years ago

    Is the fact that HBase writes on a node, as you say, the reason why it corrupts the HDFS filesystem so easily?

  • @myeverymusic
    @myeverymusic 5 years ago

    What will happen once Data Node 3 is alive again? Will the Name Node ask other nodes to copy some data to Data Node 3?

  • @Irresponsibleful
    @Irresponsibleful 5 years ago

    Did you get an AC by now?

  • @malesamuel7736
    @malesamuel7736 5 years ago

    Cool

  • @GiacomoMilazzo
    @GiacomoMilazzo 6 years ago

    I don't understand. If each set of blocks is "one" file (red, yellow, blue), why does he say that the blocks are replicated? He should say "distributed", not replicated! Replication involves data resiliency, erasure coding and so on, doesn't it?
    Then he presents the case where one of the cluster's nodes crashes. In that case, replication comes into play. And, of course, he should not call the set of blocks "one" file; he should say there is one file composed of "n" chunks that are replicated among the nodes of the cluster.

    • @draganglumac
      @draganglumac 6 years ago

      The way I understood it, = .
      I suppose (if my understanding is correct of course) the confusion then comes from the fact that at the beginning of the video he said that a = .
      He really should have started with just one row of Lego bricks for each file, and just explained that a data node sends a copy of the block it just wrote to one of its data peers as directed by the control node.

  • @KYC_life
    @KYC_life 7 years ago

    Now I like Legos :)

  • @user-oi3ce5nj3m
    @user-oi3ce5nj3m 4 years ago

    This is exactly what the saying "a crafty rabbit has three burrows" means.

  • @danielleu.877
    @danielleu.877 4 years ago

    SUPER informative, but also i hear "Hadoop" and just think "Hadooken" just me? yeah okay hahaha

  • @samiulsaeef2076
    @samiulsaeef2076 3 years ago

    play in 1.25x

  • @user-ml2ci7wl1f
    @user-ml2ci7wl1f 5 years ago

    My English isn't very good, but I think this is great.

  • @ravatmehul3906
    @ravatmehul3906 7 years ago

    Nican

  • @marcelscherzer8385
    @marcelscherzer8385 4 years ago

    It's Lego, not Legos... but nice vid.

  • @guille.p
    @guille.p 7 years ago

    It started off pretty well but then it got very confusing. He didn't seem so sure of what he was saying. It didn't work for me. Thank you, anyway.

  • @JM-fp3gf
    @JM-fp3gf 9 years ago +16

    Why is he so sweaty?

    • @jessetanderson
      @jessetanderson 9 years ago +16

      Yeah, it was the lighting. We tried moving the lights around, but I didn't have any makeup on, which would have mitigated the lights.

    • @musasall5740
      @musasall5740 7 years ago +6

      You should not answer this moron. You're doing a good job for free.

    • @vishusingh008
      @vishusingh008 7 years ago +1

      Looks like you are a moron!

    • @vishusingh008
      @vishusingh008 7 years ago +3

      He replied so kindly and genuinely, and you are calling him a moron.

    • @stonemysterioserusss
      @stonemysterioserusss 7 years ago +6

      Pretty sure he was referring to the initial commenter, not Jesse. Rude remark nevertheless.

  • @cafecapes
    @cafecapes 9 years ago

    Why do Americans call Lego bricks Legos? Lego is a company name and small building bricks are what they make; they don't make Legos! You can't implicitly type Lego bricks as Legos, it sounds silly.

    • @lucaborzani56
      @lucaborzani56 9 years ago

      We do the same in Europe. Where are you from?

    • @cafecapes
      @cafecapes 9 years ago

      I've been thinking deeply about this and decided I'm the worst person to be dictating English.
      Briton mate.

    • @bugs181
      @bugs181 9 years ago

      cafecapes Every nation has its own way of speaking. There's a very elaborate section on the Stack Exchange website that goes into great depth on the differences in language, pronunciation, and word usage. If this is a serious inquiry, I'd suggest you go there. It's a very informative place to learn anything you want, and if the question hasn't already been asked, you can ask it yourself.
      One example is a topic on how Americans pronounce the word solder as "sodder" while other countries pronounce it as "sold-er", and where this distinction came from. You might be surprised to know that language variations have a lot to do with heritage dating way, way back. Every language and dialect, regardless of what it is, has become bastardized, and it's just a part of how languages evolve.
      For what it's worth, I used to pronounce it as "sold-er" until I got tired of being corrected, and I have no idea where I learned it. I now colloquially pronounce it as "sodder" just because of tradition and geographical linguistics. Pronunciation and accent also vary widely in the United States from coast to coast.
      Apologies for the long comment.

  • @viewerone
    @viewerone 5 years ago

    It's been quite a challenge to hear this video. Headphones are in but it doesn't seem to help.

    • @FredroStarr12
      @FredroStarr12 5 years ago

      Audio volume is fine for me; must be an issue on your machine.

    • @viewerone
      @viewerone 5 years ago

      Freddy yes, that’s what it was. Guess my Mac needed a reboot. Worked fine afterwards.