Understanding B-Trees: The Data Structure Behind Modern Databases

Поділитися
Вставка
  • Опубліковано 21 лис 2024

КОМЕНТАРІ • 450

  • @muno
    @muno 6 місяців тому +1219

    I really like how the little guys turn their heads to look at the thing you're talking about :)

    • @sid98geek
      @sid98geek 6 місяців тому +27

      They are very cute and innocent.

    • @Scrolte6174
      @Scrolte6174 6 місяців тому +3

      He does this in every video, Sherlock.

    • @Slitherman96
      @Slitherman96 6 місяців тому +25

      @@Scrolte6174not everyone watches what you watch, Sherlock

    • @thezipcreator
      @thezipcreator 6 місяців тому +21

      @@Scrolte6174 what does that have to do with anything? even if they do it in every video you can be appreciative of it?

    • @theowallace2310
      @theowallace2310 6 місяців тому +6

      Having completed data structures and algorithms at an ABET accredited institution, I nod my head knowingly at this video.

  • @mooseyard
    @mooseyard 6 місяців тому +469

    Beautiful explanation. I love the animations. As someone who's implemented a few persistent B-trees, allow me to point out a few more details:
    * This is the classic original B-tree. However, most databases use a B+tree, which is different in that the values are stored only in the leaves; keys in upper nodes just point to lower nodes. When a node splits, you don’t move the middle value up, it stays in one leaf or the other.
    * B-trees I’ve looked at, like SQLite, don’t have a fixed number of keys in a node. In real usage, keys and/or values are variable size, like strings, and the nodes are fixed-size disk pages (often 4kb.) The number of keys or values that fit in a node is highly variable. So instead you keep adding to a node until its size in *bytes* overflows a page, and then split. Some nodes might have a hundred keys, some might have only four. It doesn’t matter; the algorithms still work.

    • @yad-thaddag
      @yad-thaddag 6 місяців тому +11

      Thank you! Very interesting!

    • @tylerpetrov8094
      @tylerpetrov8094 6 місяців тому +8

      Learning something new everyday! Thanks for the info

    • @TheJamesM
      @TheJamesM 6 місяців тому +5

      Yes, I was thinking about the latter point while I watched: "How do you decide on the number of keys per node? I guess you size it to just about fit into available memory, in order to minimize the number of expensive database queries." If the keys were of consistent size, you could just divide the memory you want to allocate by that size, but in most contexts I can see how it would be more practical to divide keys by size rather than by count.
      Do you get problems if e.g. you have adjacent large keys? Would that result in "wasted" space in a node?

    • @ZenoDovahkiin
      @ZenoDovahkiin 6 місяців тому +2

      That makes sense, thanks!

    • @RomanShchekin
      @RomanShchekin 5 місяців тому +6

      Quite useful comment to be honest. Real life scenarios are super useful in pair with the academic explanation!

  • @asishreddy7729
    @asishreddy7729 6 місяців тому +110

    Such a simple but beautiful animation! So many channels do such complex animations but they do not realise simple animations can be so beautiful.

    • @davidbatista1183
      @davidbatista1183 6 місяців тому +3

      Has a Wall-E kind of feeling 😊

    • @SantiagoArizti
      @SantiagoArizti 6 місяців тому +2

      I found it confusing that the nodes left undimmed were the ones we were supposed to pay attention to. Didnt you?

    • @nikilragav
      @nikilragav 3 місяці тому

      @@SantiagoArizti no, but I found it confusing that he would undim parts of the tree as he said the sentence rather than just undimming the whole section together

  • @MaxPicAxe
    @MaxPicAxe 6 місяців тому +220

    Lol that's actually my first time hearing about a b tree, looks awesome and elegant and simple

    • @thacium
      @thacium 6 місяців тому +26

      Elegant indeed but far more complex than a binary tree. In this case you're trading simplicity for faster search time.

    • @paulstelian97
      @paulstelian97 6 місяців тому

      @@thacium Some say that a special case of b-tree is equivalent to the red-black tree, another strategy which is, however, a binary search tree as well.

    • @TheWizardGamez
      @TheWizardGamez 6 місяців тому +4

      Knock on wood before your up at 2 AM on a Red Bull and adderall fueled binge trying to figure out how to shave off 1 ms of process time.

    • @ronak212
      @ronak212 6 місяців тому +5

      Ain't nowhere near simple but magnificent for sure

  • @0tobsam0
    @0tobsam0 4 місяці тому +22

    I have an exam on algorithms and data structures in 3 days and this video manages to break down hours of lectures into 12 minutes... Incredibly helpful

  • @amongusztav655
    @amongusztav655 5 місяців тому +26

    I didn't understand B-trees from university classes but now I do. 3 days before exam. Thank you!

    • @vastabyss6496
      @vastabyss6496 5 місяців тому +2

      how did it go?

    • @amongusztav655
      @amongusztav655 5 місяців тому +1

      @@vastabyss6496 I passed it but I only got a kettes/elégséges so I'm going to attend the repair exam

  • @MichaelChin1994
    @MichaelChin1994 6 місяців тому +244

    Fed up with my living room being a mess, I decided to watch this. Oddly think it's going to help

    • @KaiHenningsen
      @KaiHenningsen 6 місяців тому +28

      Keep the mess in a b-tree?

    • @Y2Kvids
      @Y2Kvids 6 місяців тому +9

      use c trees

    • @azizayari252
      @azizayari252 6 місяців тому +7

      Well you're gonna need to sort it in an ascended or descended order first

    • @KaiHenningsen
      @KaiHenningsen 6 місяців тому +1

      @@azizayari252 What for? That happens automatically when putting it in the tree.

    • @Shake_Well_Before_Use
      @Shake_Well_Before_Use 6 місяців тому

      ​​Christmas?a​@@Y2Kvids

  • @TheSilentknight.1
    @TheSilentknight.1 6 місяців тому +78

    Just saying thank you from behalf of the community for those amazing visualization teaching vids and for the quality you put in them

  • @peterpesch
    @peterpesch 6 місяців тому +7

    Wow! Great explanation!
    Basically the same as how we learned it 45 years ago, but with these animations it takes much less time. Back than, the professor was running around with chalk to change the diagrams on the chalkboard ...

  • @Eckster
    @Eckster 6 місяців тому +13

    So good, the easiest explanation for such a common data structure I've seen. It's crazy how we teach all these other trees in computer science classes but leave out the most used one.
    I especially appreciated the performance advantages against binary trees being explained.

    • @neilcarrier1620
      @neilcarrier1620 6 місяців тому +1

      I agree; it's a good example of how different measures of efficiency lead to different solutions.

  • @vibhavkadavakollu4669
    @vibhavkadavakollu4669 12 днів тому

    You're awesome bro, I couldn't understand this concept from 1 week of lectures, but you just explained it in 10 minutes.

  • @FutureAIDev2015
    @FutureAIDev2015 6 місяців тому +18

    This channel has been a massive help for me in understanding the concepts in my algorithms class well enough to actually pass the assignments.

    • @isbestlizard
      @isbestlizard 6 місяців тому +1

      MIT opencourseware also do a great series on algorithms plus the professor is SUPER CUTE he's the dude into origami

  • @Browsinghard
    @Browsinghard 6 місяців тому +3

    Very elegant explanation and animation, you cleared up my misunderstandings from when I just read about b-trees. Your little blob dudes are great communicators!!

    • @GH-oi2jf
      @GH-oi2jf 4 місяці тому

      "B-trees" please. Capitalized.

  • @LordHonkInc
    @LordHonkInc 6 місяців тому +25

    Man, I remember having balancing B-Trees as assignments in my CS exams and failing to get the right answers. This video does a much better job of explaining it than any class I took in university in a fraction of the time we spent learning it. Thanks for finally getting me to understand👍

  • @esra_erimez
    @esra_erimez 6 місяців тому +70

    This is a great video. Normally, databases do not store data in internal nodes, only leaf nodes and function as intermediate pointers. Leaf nodes on the other hand contain the actual data entries or pointers to those data entries. Additionally, most databases have pointers to neighboring leaf nodes. By the way, these pointers to neighboring leaf nodes help lot with solving concurrent operations on the B-Tree.

    • @sohampatel5166
      @sohampatel5166 6 місяців тому +18

      yeah those are known as B+ Trees

    • @mooseyard
      @mooseyard 6 місяців тому +8

      A different approach to concurrency is copy-on-write, where instead of changing a node in place you make a copy of it with the changes. This means changes don’t affect any concurrent readers since nodes are immutable. However, any nodes that point to the modified node also have to be updated to point to the new one, which means a change spreads up to the root node.
      (This may sound wasteful, but in practice you can modify new nodes in place because no one else can see them. So a node only has to be copied once during any transaction.)
      In this type of tree you can’t have links between siblings because you’d end up having to copy all the siblings too. Instead, when you’re going down the tree you keep a breadcrumb trail, your path, and to get to a sibling you look up, move one child, and go back down.
      This type of tree is used by CouchDB and LMDB, among others.

    • @nikilragav
      @nikilragav 3 місяці тому

      when do you get concurrent operations on the same DB? How is it not queued?

    • @esra_erimez
      @esra_erimez 3 місяці тому +1

      @@nikilragav There is a paper titled "Efficient locking for concurrent operations on B-trees" by Bing Yao & Philip Lehman that was the break through

    • @nikilragav
      @nikilragav 3 місяці тому

      @@esra_erimez Thanks. Is the idea that you have multiple threads accessing the same db? No queue?

  • @mskiptr
    @mskiptr 6 місяців тому +7

    I more or less knew how B-trees work already, but this was just such a neat refresher that I now want to implement it (maybe together with a couple of formal verification proofs, to show that all these properties are always maintained)

  • @kamalzubairov2344
    @kamalzubairov2344 6 місяців тому +3

    Awesome explanation. I always had the feeling that I almost understand how B-trees work, but I wasn't quite there yet. This video showed me the things that I was missing. Thank you!

  • @offtheball87
    @offtheball87 6 місяців тому +3

    I've just started the CS50 AI course as background learning at work, and imagine my surprise to hear this voice again. I really enjoy your teaching style, and I'm so glad to have found more content from you!

  • @paulstelian97
    @paulstelian97 6 місяців тому +13

    There's an additional bonus -- nodes tend to be able to fill in some special size, like a disk block or a cache line, which allows further efficiency when doing comparisons for a search. That's why for some applications it's objectively better than even the red-black tree.

  • @The_Pariah
    @The_Pariah 4 місяці тому +1

    This is a REALLY good series and I wish Spanning Tree would post more content.
    They have a fantastic layout, information is conveyed well, and it's thorough but not so technical that people don't understand it.
    Please make more content. It's hard to find solid video series like this that explain important programming concepts.
    10/10

  • @willowtreeangel
    @willowtreeangel 3 місяці тому

    Ah, I always struggled with B-trees during my Database Design class. If I had this video, I wouldn't have struggled at all! Fantastic work, very informative.

  • @jensdit
    @jensdit 6 місяців тому +3

    Awesome video! Just one clarification: database systems typically use B+-trees rather than B-trees to allow for ISAM (range search on leave nodes).

  • @caleblandis3694
    @caleblandis3694 6 місяців тому +2

    This is literally the best data structure explanation video I’ve ever seen

  • @yoyoyo-hw2lc
    @yoyoyo-hw2lc 2 місяці тому

    Absolutely Brilliant Explanation and Animation. I had never encountered B-Trees and became interested in them because they are not in A-level Computer Science Exam Boards. This video taught me B-trees in just 12 minutes and encouraged me to build my own in C++.Thanks!

  • @spliterator1981
    @spliterator1981 6 місяців тому +1

    I'm listening to "designing data intensive applications", and wasn't quite able to visualize a b-tree through audiobook only. This really cleared everything up. Thanks!

  • @pesetskyps
    @pesetskyps 3 дні тому

    Bravo for traversing the complexity into simplicity 🎉

  • @69k_gold
    @69k_gold 6 місяців тому +5

    This is the best visual learning channel for CS on UA-cam

  • @freebirdyaoyao
    @freebirdyaoyao Місяць тому

    Fantastic illustration for one of the most complex topic in computer science education. I wish I get to learn this video when I was at university .

  • @IvanToshkov
    @IvanToshkov 6 місяців тому +8

    I love your videos. You somehow always manage to hit the right amount of details. Great job!

  • @ivanthomas8503
    @ivanthomas8503 6 місяців тому +3

    you published this at the exact right time I have a final exam today and this will be on it and I needed to study up on it.

  • @rafazieba9982
    @rafazieba9982 6 місяців тому +27

    I've been a professional developer for 17 years now. A really good one. I always knew that BTrees are used to store database indexes and that they reduce search complexity to logarithm. I knew how binary trees and AVL trees work. I always wondered how BTrees work. I was too lazy to do the reading. Now I know. Thank you.

    • @walttroianivargas5200
      @walttroianivargas5200 3 місяці тому

      Something similar happened to me, I knew the tool and used it but i didn't fully grasp it. Now that I'm making my own SQLite in Rust is really helpful!

    • @JohnSmith-op7ls
      @JohnSmith-op7ls 2 місяці тому

      Unless you’re writing a DBMS, it’s super unlikely you need to know how b-trees work

  • @HolyC-xs2zj
    @HolyC-xs2zj 6 місяців тому +3

    So well animated, so simply and informatively explained. Thank you!

  • @docjoesweeney
    @docjoesweeney 6 місяців тому +16

    Ha! Thanks for this. I remember coding b trees waaaay back in the early 80s. Gawd I'm old.

    • @crosswalker45
      @crosswalker45 6 місяців тому

      Which language do you used to code in 80s?

    • @poopytowncat
      @poopytowncat 6 місяців тому

      _"early 80s"_ OK boomer! I'm approaching my "early 80's" and I remember coding stuff waaaay waaaay back in the "early 70's" on IBM punch cards.

    • @docjoesweeney
      @docjoesweeney 6 місяців тому +3

      @@crosswalker45 vb, ada , assembler (for 6809), vulcan then dbase and clipper, kman, pascal... likely a few others i am too to recall.

    • @crosswalker45
      @crosswalker45 6 місяців тому +1

      @@docjoesweeney I always wonder how people in 80s, used to write the code. I'm assuming there won't be any proper documentation or persistent internet connection for that matter.

    • @docjoesweeney
      @docjoesweeney 6 місяців тому

      @@crosswalker45 dig up old editions of Dr Dobbs magazine. I am sure someone would have scanned them for prosperity.

  • @professorpoke
    @professorpoke 6 місяців тому +13

    Proud to say that I am following this channel since when it has less than 100K subscribers.

    • @MichaelDeHaven
      @MichaelDeHaven 6 місяців тому

      Dude, I just found it yesterday! It's a great channel and deserves the growth.

  • @leo_bonifacio
    @leo_bonifacio 6 місяців тому +1

    Thank you a lot for the beautiful explanation of this topic! I always encounter the question about indexing algorithms and used data structures in databases for a data engineer position. So this video helps a lot in understanding b-trees. Thanks again!

  • @trocchiettoski
    @trocchiettoski 6 місяців тому

    is the first time i subscribe to a channel after i saw 5 seconds and skipped a couple of minutes after other 3 seconds. All your effort needs to be encouraged

  • @srivarsha9574
    @srivarsha9574 2 дні тому

    The best video out in the internet for B Tree!! Thank you so much!!

  • @AndySterkowitz
    @AndySterkowitz 2 місяці тому

    Excellent video. Well done on articulating this in such an easy to understand way.

  • @HPerrin
    @HPerrin 6 місяців тому +3

    This is a great explainer video. Very easy to follow and I love the little robot guys.

  • @alessio1253
    @alessio1253 Місяць тому

    You may have just saved my exam. Incredible video

  • @wissiw5276
    @wissiw5276 6 місяців тому +5

    i wish i could have a teacher that utilizes simple animations like these in class, the visualisation makes it so clear

  • @Crafterchen2
    @Crafterchen2 Місяць тому

    This helped me a lot! The Animations are really well done and help explaining the thing you're talking about. This is how explanatory videos should be done!

  • @jiwan88
    @jiwan88 6 місяців тому +8

    Amazing video and explanation.

  • @nithssh
    @nithssh 6 місяців тому +2

    I was looking to implement btrees a while back and all the literature on it were conflicting and varying. I like how you handled all the variances subtely.
    This is a great video, the definitive one on btrees for sure. Cheers.

    • @GH-oi2jf
      @GH-oi2jf 4 місяці тому

      The term is "B-tree."

  • @sripriyansvbsbtskanchibhat9152
    @sripriyansvbsbtskanchibhat9152 4 місяці тому

    It's really an awesome way of explanation ruling out drawing the trees and explaining...no words to appreciate...thank you for making such wonderful videos...great work

  • @fdsfkdj
    @fdsfkdj 6 місяців тому +1

    Thank you for directly going into the subject.

  • @keobkeig
    @keobkeig Місяць тому

    Best explanation of btrees I’ve ever seen

  • @Lidonnn
    @Lidonnn 4 місяці тому

    With your explanation, it looks really simple and elegant. Thanks a lot!

  • @Drraghavsethi
    @Drraghavsethi 5 місяців тому

    every other video explaining B trees must be deleted to save space and time. This is perfect explanation.

  • @allanatal
    @allanatal 6 місяців тому +2

    That video was so well made that even I could understand. Code this algorithm is another story, though.

  • @angkhoi5740
    @angkhoi5740 6 місяців тому +2

    awesome, never understood this tree until now. Thanks a lot :)

  • @yagamilight120
    @yagamilight120 6 місяців тому +13

    You're a legend bro... Hope u live 100 more years

  • @saviofernandes5263
    @saviofernandes5263 6 місяців тому +2

    Stumbled upon this video by accident and I knew I heard that voice before 😂Nice to know that you have your own UA-cam channel now, Brian!

  • @waiphyotun7633
    @waiphyotun7633 6 місяців тому

    Why didn't I find this great channel with unique presentation earlier?

  • @Bess_Gates
    @Bess_Gates 13 днів тому

    Thank you man , it’s my first time to understand this

  • @aliveli8650
    @aliveli8650 6 місяців тому +3

    Very good explanation!! Congrats!! Keep it going!

  • @richardwilliamsmusic
    @richardwilliamsmusic 5 місяців тому

    Phenomenal teaching! please do more computer programming related concepts, love this!

  • @motonoob-i2d
    @motonoob-i2d 4 місяці тому

    Really well done. Wish I had access to content like this when I went through school

  • @bokistotel
    @bokistotel 3 дні тому

    I have no words, just amazing!

  • @arkan7rb
    @arkan7rb 2 місяці тому

    thank you sir for most wonderful explanation for b trees i have ever seen i well share ur vid to friends it deserves to be seen

  • @vt_blurry2443
    @vt_blurry2443 5 місяців тому +1

    Beautiful animations with a very clean explanation! Thank you sir!

  • @HudsonGTV
    @HudsonGTV 9 днів тому +1

    Important timestamps.
    **Insertion Scenarios:**
    5:08 - Basic Insertion
    6:20 - Another Basic Insertion with Root partially full
    6:50 - Insertion with full root
    **Deletion Scenarios:**
    7:48 - Basic Deletion
    8:15 - Deletion of Key in Node w/Min Count (take sibling key - it becomes new separator - old separator is moved to original node where deletion occurred)
    9:42 - Deletion of Key when Sibling Nodes are also at Minimum Count (merge sibling and separator together into single full node)
    10:10 - Same as above, but causing Parent node to fall below Minimum Count as well
    10:28 - Deletion of key from middle of tree (new separator needed - either take largest val in left subtree, or smallest val in right subtree - may need to take key from sibling or merge 2 nodes together if it causes child node to fall below min key count)

  • @Skeffles
    @Skeffles 6 місяців тому +2

    Brilliant video! I had no idea that it was so closely related to binary trees.

  • @sharma7889
    @sharma7889 7 днів тому

    the best channel i've ever visited...all you need is some promotion..and then lets see where you will be!!All the best!!

  • @LaysarOwO
    @LaysarOwO 6 місяців тому +2

    im not joking when i say, i was literally looking into making my own db 2 days ago, but then the optimization algorithms discouraged me because it looked so complex. however, this video was so easy to grasp that i might just end up committing to the project

    • @MK-lh3xd
      @MK-lh3xd 6 місяців тому

      Curious. What is your motivation to create your own DB? If you are creating DB for learning or teaching, then it is fine.There are already many open source databases, that are quite good. If you are creating your own DB, you will also need to consider caching, and a lot more features.

    • @LaysarOwO
      @LaysarOwO 6 місяців тому +1

      @@MK-lh3xd just to learn stuff tbh. In my serious projects i use postgres

  • @reza.kargar
    @reza.kargar Місяць тому

    Best B-Tree explanation ever!

  • @canmertinyo
    @canmertinyo 6 місяців тому +1

    the expression was really simple and understandable thank you!

  • @UndercoverDog
    @UndercoverDog 3 місяці тому +1

    2:40 Searching
    4:00 B-Tree Rules
    4:36 Insertion
    7:50 Deletion

  • @caterpieasmr2699
    @caterpieasmr2699 4 місяці тому

    Thank you so much, i have an exam that includes b-trees, but ALL the explanations are plain text with a lot of nonsensical expressions and variables. This is so simple and straightforward compared to that c:

  • @aV5d9nlUBQ9
    @aV5d9nlUBQ9 6 місяців тому +5

    Your videos are awesome. Thanks, Brian!

  • @richbronson9082
    @richbronson9082 Місяць тому

    Great explanation. I really left feeling like I understood.

  • @xTerminatorAndy
    @xTerminatorAndy 2 місяці тому

    Andy is happy. Also "spanning tree" (protocol) is my friend in my role as systems administrator

  • @jindrichsirucek
    @jindrichsirucek 6 місяців тому

    Awsome expl🙏🙏🙏
    For me personaly at first was confusing that you dim not related part, I would intuitively expect get brighter the part where change was happening instead of dimming the rest..
    Great work, thx for expaining 🙏

  • @chanm01
    @chanm01 3 місяці тому

    Dude, these animations are sick.

  • @mihiragrawal6174
    @mihiragrawal6174 Місяць тому

    Beautiful animations and explanation. Really really thanks for this

  • @mrunknown5087
    @mrunknown5087 3 місяці тому

    the way you teach is magnificent. keep it up buddy

  • @mamahuhu_one
    @mamahuhu_one 2 місяці тому

    Nice! It would be interesting to see an explanation on red-black trees

  • @Intense_Cloud
    @Intense_Cloud 6 місяців тому

    The "monitos" (cute animations) got me engaged to watching, among the interesting information presented/narrated.
    Great video!🎉😊

  • @elzabethtatcher9570
    @elzabethtatcher9570 4 місяці тому

    Great video. I will probably implement B-tree in GO, as I begin to study this language.

  • @zachlandes5718
    @zachlandes5718 5 місяців тому

    Fantastic! I’d love a video on LSM trees and SSTables next!

  • @slobberingdog72
    @slobberingdog72 5 місяців тому +1

    Nice and easy explanation. Great job !!

  • @hufuhufu
    @hufuhufu 6 місяців тому +3

    Animation is amazing!

  • @dragonlordsaviour7005
    @dragonlordsaviour7005 5 місяців тому +1

    absolutely beautiful explanation

  • @TeamDman
    @TeamDman 6 місяців тому +3

    Very great video! Thank you for sharing, I love the style

  • @VictorBorah-Invincible
    @VictorBorah-Invincible 2 місяці тому

    I bet, if this video was available while I was doing my bachelor's in CS back then in 2009, I would have got two Gold Medals instead of one. Life was so hard at that time, black & white text was so boring and everything had to be imagined, and I often fancied myself no less than Einstein himself performing some thought experiments on DS. I remember, when I took hold of basic DS, I started tutoring my friends in exchange for fast food in our favourite restaurants. Fondly enough, even my workstation PC today is named after my favourite Physicist, Einstein, just because of how I mastered DS and that paved my way to my role as a Backend Architect.

  • @imveryangryitsnotbutter
    @imveryangryitsnotbutter 6 місяців тому +15

    2:36 When designing a b-tree, how does the programmer determine what the maximum amount of keys in a node should be? For which applications is it appropriate to set the limit to 2, and which applications should have a limit of hundreds? There's a happy medium somewhere, but how is that happy medium calculated?

    • @IvanToshkov
      @IvanToshkov 6 місяців тому +13

      It depends on your application. For example for a database system I'd go with a node size that can fit in L1 cache of the processor. Or maybe use the filesystem block size? And then delete the size by the size of an element to figure out the number of elements in a block. And then test, modify and repeat.

    • @Jason9637
      @Jason9637 6 місяців тому +2

      The best way is to simply try them and see what works best.

    • @tactic00l
      @tactic00l 6 місяців тому +4

      In practice comparisons tend to be very cheap compared to main memory access, combine that with the fact that the processor loads memory in lines of cache (64 bytes) you want to pick a node size at least 64 bytes large. The sequential prefetcher makes 128 bytes a good option too

    • @capability-snob
      @capability-snob 6 місяців тому +6

      There are a variety of locality effects that can come into play. tactic00l mentioned cache line size which is a good minimum for in-memory b-trees. You might also size them to fit nicely into a block on disk, so you can make more comparisons for each disk read operation. There are subtler impacts from prefetch logic and alignment that might be worth measuring, depending on your microarchitecture or allocator respectively. some considerations might not be just about latency of accessing memory; depending on your concurrency model, likelihood of paging, data structures you are storing as values etc you might find reason to increase or decrease the number of keys stored in a node.

    • @GH-oi2jf
      @GH-oi2jf 4 місяці тому

      It's "B-tree." The point of a B-tree is to reduce accesses to secondary storage. If your B-tree resides on disk, then the node certainly should be as large as can fit in a sector, because you don't want to be reading disk space that you don't use. Whether it should be two sectors, or four, or something else, depands on the characteristics of the device and your judgment about what will work well. If your entire B-tree will reside in RAM, then you might as well use a 2-3 tree, which is the minimum B-tree.

  • @konstantinzolotarev
    @konstantinzolotarev 6 місяців тому

    Surer clear and simple explanation! Thank you very much for your time and effort!

  • @MySisterIsASlytherin
    @MySisterIsASlytherin 4 місяці тому

    This is so simple it's almost magical!

  • @isbestlizard
    @isbestlizard 6 місяців тому +59

    Your homework assignment: write a b tree in python that handles inserts and deletions and unit tests that validate it works correctly

    • @lorenzosotroppofigo1641
      @lorenzosotroppofigo1641 6 місяців тому +22

      I had to make a goddamn Red Black tree in C without any external library for a single almost worthless point in university. I didn't know a data structure could have that many bugs.
      That thing python b tree doesn't scare me.

    • @arossfelder
      @arossfelder 5 місяців тому

      ​@@lorenzosotroppofigo1641damn, these rotations are evil 😂

    • @iamtimsson
      @iamtimsson 4 місяці тому +1

      run redundant calc and compare?

  • @Supernumerary
    @Supernumerary 4 місяці тому

    Early in your development you must study machine code, assembly language, if you want to assess the actual processor cost of architecture choices, such as designing search trees. This is an excellent presentation, but in reality there are many essential layers below what is discussed here.

  • @luczuo7950
    @luczuo7950 6 місяців тому +1

    Thanks ! From Université Paris Dauphine students ! Luc, Silva, Faudel, Barbara, Diao, Adlene and Djena

  • @TheYakup85
    @TheYakup85 4 місяці тому

    That's the great explanation I have ever seen. Thank you so much!

  • @krituos
    @krituos 5 місяців тому

    Much better then any course I took on the subject 🎉

  • @joshuaworman4022
    @joshuaworman4022 5 місяців тому

    amazing video. wish we could have this for litterally everything.

  • @heinz6196
    @heinz6196 6 місяців тому

    There is also the B-Tree with prefix compression, which is particularly relevant if you need to store long strings as keys in the B-Tree. Once a leaf page is overflowing after a key insert, simply split it to create two leaf pages. Promote a copy of the first key from the right new page up one level, but only take as many characters from that key as are necessary to differentiate it from the last key in the left page. Do this only when a leaf page needs to be split. This approach will help save space in the upper pages up to the root.
    Additionally, keep in mind that if you feed a B-Tree with already sorted keys, you will end up with half-filled pages, wasting space. In this situation, the point where you split the pages will give you a choice on how to balance the B-Tree effectively.

  • @2bfrank657
    @2bfrank657 6 місяців тому +30

    So the 'B' in 'B tree' DOES NOT stand for 'Binary'. Definately a chance for confusion there!

    • @Vercingetorix061983
      @Vercingetorix061983 5 місяців тому +2

      I name it Beatrice 😂

    • @heyythere
      @heyythere 4 місяці тому +5

      It stands for balanced tree, then there's also b+ trees that are just advanced version of it

    • @GH-oi2jf
      @GH-oi2jf 4 місяці тому +3

      I consider it to stand for "Bayer," its inventor, but it could be something else. Bayer didn't say. It definitely is not "Binary," though. Euler did not call "e" Euler's number, but everyone else does.

    • @JohnSmith-op7ls
      @JohnSmith-op7ls 2 місяці тому

      @@heyythereIt doesn’t stand for balance or binary but it is a type of binary

    • @leonYoong
      @leonYoong 2 місяці тому +1

      ​@@heyytherethere's actually no clear meaning

  • @zxuiji
    @zxuiji 6 місяців тому +1

    I prefer an offset tree. Say you have a buffer or file full of nodes. Every node has a fixed offset from the head. As long as you know the head's pointer or file id you can get any node by it's offset. This causes the offset to work as a node ID which is useful for fast record access.
    Sticking to only offsets/indices can create an issue around adding/removing nodes at random places. This is fixed by building 2 pairs of "next" & "previous" members into the nodes. One pair is for tracking removed nodes, the other pair is for tracking assigned nodes. In code the node access would look something like this:
    next = head + node.assigned.nextOffset
    or
    prev = head + node.assigned.prevOffset
    You can attach the pairs directly to the node or via a separate list - in which case you use the determine the offset of the data by the offset of the node like this:
    index = (node - nodeList) / sizeof(node)
    data = dataList + (index * sizeof(data))
    An example of where you'd want to do the latta is text. You'd want to perform common string operations on the text such as finding a sequence of characters. There're ready made functions for that sort of thing but they assume continuous data, not nodes. Identifying the node from the data can also be done easily by a similar process:
    index = (data - dataList) / sizeof(data)
    node = nodeList + (index * sizeof(node))
    This gives the best of all worlds. Fast adding/removing of nodes via linked list format, fast node lookup via indices/offsets, easy to perform common operations via preexisting functions. Sorting can be fast too simply by making a list of ID & integer value pairs to be sorted by integer value. Then simply going through the actual list and 1 by 1 swapping the existing data with the data that should be there. For a file the swapping would look something like this:
    read( nodeList, offsetA, nodeA, sizeof(node) )
    read( nodeList, offsetB, nodeB, sizeof(node) )
    read( dataList, offsetC, dataA, sizeof(data) )
    read( dataList, offsetD, dataB, sizeof(data) )
    write( nodeList, offsetA, nodeB, sizeof(node) )
    write( nodeList, offsetB, nodeA, sizeof(node) )
    write( dataList, offsetC, dataB, sizeof(data) )
    write( dataList, offsetD, dataA, sizeof(data) )
    The integer values is a case by case thing that can be calculated upon creating the data so the sortable integer list can just be kept as a 3rd list along side the node list and the data list. I've never had a reason to use binary trees as of yet.

  • @shubhamraj5557
    @shubhamraj5557 6 місяців тому +2

    Great video please continue making these

  • @hypergraphic
    @hypergraphic 6 місяців тому +1

    Great video! You made it very understadable.

  • @MICHAELMURITHI-y7y
    @MICHAELMURITHI-y7y 6 місяців тому

    Please keep uploading more content like this