Bloom Filters | Algorithms You Should Know #2 | Real-world Examples

Поділитися
Вставка
  • Опубліковано 17 лис 2024
  • Subscribe to our weekly system design newsletter: bit.ly/3tfAlYD
    Checkout our bestselling System Design Interview books:
    Volume 1: amzn.to/3Ou7gkd
    Volume 2: amzn.to/3HqGozy
    Digital version of System Design Interview books: bit.ly/3mlDSk9
    The Secret Sauce Behind NoSQL: • The Secret Sauce Behin...
    Animation tools: Illustrator and After Effects
    ABOUT US:
    Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.

КОМЕНТАРІ • 119

  • @Mutual_Information
    @Mutual_Information 2 роки тому +120

    Asking/answering the right questions, information dense and technical. I love this channel.

  • @an_other_world
    @an_other_world 2 роки тому +32

    Levelling the playing field by making info like this easily accessible to anyone

  • @MrX-nc8cm
    @MrX-nc8cm 2 роки тому +45

    Yesterday I binge watched all your videos 😂 to prepare for an interview, and then had nothing left to watch!
    I’m glad that you uploaded again, you’re videos are awesome well explained, your channel will grow fast. Keep the good work! 👏👏

    • @TheAyushSomani
      @TheAyushSomani 2 роки тому +4

      How did the interview go?

    • @MrX-nc8cm
      @MrX-nc8cm Рік тому +9

      @@TheAyushSomani sorry I didn’t see your reply. I was hired on September! After 4 months of job hunting and countless applications. I did more than 100 applications 😂 and I received 2 offers. Practicing interviews and getting rejections definitely made me improve a lot. It’s been a great experience so far on my job.

    • @jathebest2835
      @jathebest2835 Рік тому

      @@MrX-nc8cm Wow! Congrats from Korea:) What country do you live in? and I guess it's pretty competitive to get into a software engineering job there.

  • @CommandantNOVA
    @CommandantNOVA 2 роки тому +68

    If you like Bloom Filters - I highly recommend reading the SuRF : Practical Range Query Filters from CMU. It's a little more complicated but shows how you can take the same concept and apply it to eliminate arbitrary size ranges.

    • @andreikhalipau2796
      @andreikhalipau2796 Рік тому

      For someone who does not like to read there is a presentation ua-cam.com/video/OD29hZww-DM/v-deo.html

  • @emreboga
    @emreboga 2 роки тому +46

    Definitely more exciting to watch than many Netflix series 👌🏼

    • @andherium
      @andherium Рік тому +9

      Comparing a DSA tutorial to a netflix series doesn't make sense.

    • @TRoss-ru6sg
      @TRoss-ru6sg Рік тому

      Stop the cap

  • @daviddickey9832
    @daviddickey9832 7 місяців тому

    This channel is an absolute gold mine of knowledge for software devs

  • @leoxiaoyanqu
    @leoxiaoyanqu 19 днів тому

    Learned this great data structure when I was at grad school, which completely blew my mind back then. I recall the number of hash functions can also be tuned to control the probability of false positive, in addition to the length of the buckets. Great video as always! Thanks!

  • @gattuvideo
    @gattuvideo Рік тому +1

    I love how you make any topic so easy to understand. I tried reading the same topic on wikipedia and got utterly confused but the way you explained it, it clicked immediately.

  • @benoitleger-derville6986
    @benoitleger-derville6986 2 роки тому +7

    Very good realization, the animations are remarkable and very useful.

  • @kapilricky
    @kapilricky Рік тому +1

    I feel smarter everyday after watching ur design videos.

  • @ciberman
    @ciberman 2 роки тому +8

    This video is awesome. I read about Bloom filters a long time ago in wikipedia but I didn't get it at that moment. This video explains it so clearly. I love it.

  • @essamgouda1609
    @essamgouda1609 2 роки тому +3

    This channel is a gem !

  • @Athmarr
    @Athmarr Рік тому +2

    I've looked up bloom filters before and knew what they were used for but this actually made them make sense to me. Thank you so much!

  • @ibidathoillah2051
    @ibidathoillah2051 2 роки тому +1

    thank you, easy to learn, this videos should get million viewers

  • @bittu007ize
    @bittu007ize 2 роки тому

    I don't know why but my every morning starts with one of these short videos

  • @vijethkashyap151
    @vijethkashyap151 23 дні тому

    This was beautifully explained!

  • @jo42715
    @jo42715 2 роки тому +1

    Short and clear ! I love it.

  • @daniel860305
    @daniel860305 Рік тому

    Thanks for your easy-to-understand explanation! Amazing to see a data structure applied to so many different uses.

  • @TesserId
    @TesserId 2 роки тому +7

    Been fascinated with this data structure for some time. Hearing the Akamai perspective is very interesting, particularly since I manage a large proxy installation that does subscribe to an advanced rating service. Imagine needing a quick, memory efficient data structure that can give you a no/maybe answer very quickly. Sure, on the 16 proxy servers we have 128GB of RAM each. But imagine the memory requirements for virtually every FQDN on the planet.

    • @yatinarora1252
      @yatinarora1252 6 місяців тому +1

      but kinda look for false positives

    • @TesserId
      @TesserId 6 місяців тому +1

      @@yatinarora1252 It's been a while since I was taught about this, but I'm pretty sure that was said to be one of the weaknesses. It essentially caches positive findings, but lookups are called for otherwise (or maybe it's the other way around). And, collisions can accumulate (or something like that), so there has to be some way of clearing things when they get stale. So, it's not that's it perfect. It's fast and compact, but you need extra algorithmic handling to manage it's weaknesses.

  • @FLAMESpl
    @FLAMESpl 2 роки тому +2

    Great stuff, never heard of bloom filter before.

  • @yp5387
    @yp5387 2 роки тому +4

    I was confused until you explained it with the food example. And, suddenly it clicked for me. It was Ahha moment for me.

    • @DemPilafian
      @DemPilafian 2 роки тому +1

      I was actually more confused... loving pork chops must always be true!

  • @truthprevails899
    @truthprevails899 10 днів тому

    03:45 Using three hash functions is the key. It would be helpful if its highlighted in the video. Thanks for explaining the concept.

  • @JayAntoney
    @JayAntoney Місяць тому

    2 years after the video release, but browers are using cascading bloom filters to do SSL revocation checks now! Supper interesting when I learnt that.
    Steve Gibson on a podcast called security now (ep989) did a great chat on it

  • @vikasbhutra9400
    @vikasbhutra9400 2 роки тому +2

    Amazing video... Very Well Explained. !!

  • @modolief
    @modolief 2 роки тому

    Today I learned that the Bloom Filter relies on using several different hash functions (in this video the example uses 3 different hash functions). This is a very creative idea for an algorithm!

    • @CommandantNOVA
      @CommandantNOVA 2 роки тому

      This is actually a simplification for this video. Hashing from scratch is expensive. In most actual bloom filters I've seen use two hashes to produce polynomial coefficients A and B. You can then produce as many "hashes" as you want by using these coefficients without running through a whole hash function.

  • @thinhle5771
    @thinhle5771 2 роки тому

    For me, I always learn new things from this channel😅

  • @thesadanand6599
    @thesadanand6599 Рік тому

    awesome, truely knowledge delivered in byte size :)

  • @venkataramaraoemmadi378
    @venkataramaraoemmadi378 2 роки тому +1

    Big Fan of your books, Alex. These Videos are very much useful. One concern I have is.. these videos are playing like 2x fast :). You can slowdown, while explaining stuff. So our mind will get time to understand what you are saying.

    • @gcash49
      @gcash49 2 роки тому

      Just to provide a differing opinion, I found the pace of the video perfect personally

  • @iamcute69
    @iamcute69 2 роки тому +6

    The animation at 4:21 is wrong. Ribeye hashes to indexes 1, 3, and 4, but indexes 0, 3, and 4 are highlighted. The explanation is still correct, though.

  • @PaladinMansouri
    @PaladinMansouri Рік тому

    I really enjoyed that, thanks a lot

  • @rampage241
    @rampage241 Рік тому

    Very useful, thank you!

  • @MaxPicAxe
    @MaxPicAxe 2 роки тому +3

    Wow that was very interesting!

  • @tomtomsiesie5436
    @tomtomsiesie5436 2 роки тому +2

    Another great video

  • @DelShott
    @DelShott 2 роки тому

    Thanks very clear explanation, but I like lemons.

  • @ChrisHalden007
    @ChrisHalden007 2 роки тому

    Great video. Thanks

  • @BharatChandak1
    @BharatChandak1 2 роки тому +3

    Great content as usual.
    Just little pun on title of video.. 😉 What else should i know before i die ?

  • @v_iancu
    @v_iancu Рік тому +1

    Techically, you could make the bloom filter store integer numbers instead of one and zero and use increment/decrement approach when adding/removing elements.

  • @ViktorWahlberg
    @ViktorWahlberg 2 роки тому

    Nice videos you're putting out. Would have been good mentioning cascading bloom filters. Maybe out of scope for video, but just as a mention for people to do further research on their own.

  • @growie000
    @growie000 4 місяці тому

    Thank you so much.

  • @shivd862
    @shivd862 7 місяців тому

    Thank you so much !

  • @javisartdesign
    @javisartdesign 2 роки тому +1

    thanks. interesting stuff

  • @miettoisdev
    @miettoisdev 7 місяців тому

    good, good stuff right here!

  • @boringmanager9559
    @boringmanager9559 2 роки тому

    I was out for 2 weeks and when I come back, bytebytego has like 10 new videos to watch 🙂

    • @ByteByteGo
      @ByteByteGo  2 роки тому

      😂 You sure you were only gone for 2 weeks?

    • @boringmanager9559
      @boringmanager9559 2 роки тому

      @@ByteByteGo to be honest, no, not so sure 😅

  • @andreffrosa
    @andreffrosa 2 роки тому +1

    The original bloom filters don't allow removals but there are a lot of proposed improvements over the original idea and one of them allows removals by using integer buckets instead of bit buckets and incrementing/decrementing the value of each bucket (trade-off of more space to allow removals)

    • @ZorakWars
      @ZorakWars 2 роки тому +3

      What if you remove a false positive? You break the assumption that the data store cannot give false negatives.

    • @andreffrosa
      @andreffrosa 2 роки тому +2

      @@ZorakWars yes, that is true. Another trade-off you have to make.

  • @vikramadityakukreja6725
    @vikramadityakukreja6725 2 роки тому

    Excellent video

  • @hadiesmaeli6124
    @hadiesmaeli6124 Рік тому

    Thank alot, very helpfull

  • @mohammadkhalafIraqi
    @mohammadkhalafIraqi 2 роки тому

    I love this channel

  • @axa993
    @axa993 2 роки тому +2

    Awesome videos

  • @jiakai7254
    @jiakai7254 Рік тому +1

    3:51 explanation starts

  • @learnersparadise7492
    @learnersparadise7492 2 роки тому +6

    How do you create such a nice animation?

    • @devenderyadav6209
      @devenderyadav6209 2 роки тому +4

      From description : Animation tools: Illustrator and After Effects

  • @StephenGillie
    @StephenGillie 2 роки тому

    This would be a good first step to slim down a huge data set, before more complex processing on the "probably Yes" subset.

  • @DrRishabhGarg
    @DrRishabhGarg 2 роки тому +2

    Great video! can I ask something how do you make this lovely animations?

    • @ByteByteGo
      @ByteByteGo  2 роки тому +1

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @tshawtshi3040
    @tshawtshi3040 Рік тому

    You can delete if you make the buckets numbers and whe adding you increment the bucket and when deleting you decrement it

  • @ricardopassos1180
    @ricardopassos1180 Рік тому

    Bloom filters are also used in the Blockchain (EVM-based) context to query for smart contract events.

  • @krishnakeshav23
    @krishnakeshav23 Рік тому

    How is this more efficient in terms of memory access? - HashTable returns 1 location and thus O(3), but BF can return n locations, thus access time is O(n). I guess it gives huge advantage in terms memory required - to store n elements, BF will only need - log(n) locations.

  • @ReflectionOcean
    @ReflectionOcean Рік тому

    Implemented with buckets and fast hash function

  • @hosseineyvazi3122
    @hosseineyvazi3122 Рік тому

    awesome !

  • @RichardCurrie
    @RichardCurrie 2 роки тому +2

    Why are multiple hash functions required? A single one would be enough no? Having 3 seems to reduce the capacity by x3?

    • @romannasuti25
      @romannasuti25 2 роки тому +1

      IIRC, this prevents patterns from producing false positives too frequently without using much more expensive hash functions. You get to use fast, naive hash functions and still get decent space savings. Plus, it’s not really 3x storage, as elements can reassign bits and often do as the bloom filter fills up.

  • @10e999
    @10e999 2 роки тому +3

    Love that series.
    I'm interested in doing video form teaching and I wanted to know : how do you make your animation ?

    • @ByteByteGo
      @ByteByteGo  2 роки тому +8

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @JiamingFan
    @JiamingFan 9 місяців тому

    In 1:17, it should be trade off between less memory and more accurate.

  • @bamboosnow2605
    @bamboosnow2605 2 роки тому

    If the entries are not booleans but integer accumulators, then the deletion of keys can go away. This would be useful for deleting old entries.

  • @mahantalebi7192
    @mahantalebi7192 2 роки тому

    Keep up the good work! your content is just amazing and one of the best that could be found on the internet. Though the narration has some room for improvement.

  • @darklajid
    @darklajid 2 роки тому

    Tiny error in the animation. The ribeye lookup looks at 0 3 and 4 (not 1 3 and 4) according to the red highlights. But nicely done!

  • @hmls3579
    @hmls3579 8 місяців тому

    1:30 that was personal

  • @EhsanIrshad
    @EhsanIrshad 2 роки тому

    Please make video on Raft amd Paxos Algorithms as well.. pleaee

    • @ByteByteGo
      @ByteByteGo  2 роки тому +1

      Ooof... Raft is doable, but Paxos...
      Can we just do raft, please? LOL

    • @EhsanIrshad
      @EhsanIrshad 2 роки тому

      @@ByteByteGo yes please you may opt with raft, i will give it my name of paxos LOL,

    • @Chessmasteroo
      @Chessmasteroo 2 роки тому

      @@ByteByteGo and Ehsan, there an excellent paper on Paxos Made Simple 01 Nov 2001 by Leslie Lamport. It's only 14 pages. Great read.

  • @yanwugu
    @yanwugu Рік тому

    If it never forgets, will bloom filter full? eventually it will return might be true to any query.

  • @swatibhartiya7457
    @swatibhartiya7457 4 місяці тому

    @byte byte go, can you do a video on token bucket algorithm ?

  • @amertia
    @amertia Рік тому

    Thanks for this info. There is also something called cuckoo filter. Can you make a video on that as well with usecases.

  • @ryan_roga
    @ryan_roga 2 роки тому +1

    Can I please know how you made this presentation? I'd very much like to do a final presentation this semester that looks this good. 🙏

    • @yzhang2008
      @yzhang2008 2 роки тому +1

      I'm interested in how to make presentations as pretty as this one too.

    • @ByteByteGo
      @ByteByteGo  2 роки тому +4

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @bytetrip114
    @bytetrip114 Рік тому

    Can I know how to make video such like this style? What tools does Alex use?

  • @alirezakhorami
    @alirezakhorami 6 місяців тому

    Amazing, AMAIZINGNGNGNGNGNGN

  • @RahulSoniiitb
    @RahulSoniiitb 4 місяці тому

    I don't understand. in the give buckets, if he would have added 5-6 more foods of his liking then entire range would have become 1.. then it doesn't matter what you search it will always be positive. What am I missing ? is the bucket size always much larger than the dataset ?

  • @adityaanand4496
    @adityaanand4496 8 днів тому

    what I am not getting is that consider the scenario when you have run 1000 such queries, there may arise a case when all the 10 bits are set to 1, so from the 1001th query every query will return a true. Am I missing something ?

  • @crikxouba
    @crikxouba Рік тому +1

    How can it be firm no and probably yes? I mean, if it returns probably yes, and the element is not there, then that no cannot be so firm.

    • @JAlbertSG
      @JAlbertSG 6 місяців тому

      It’s all in the hash function, since it can have collisions then it’s a probably yes, if no collision is found is a definitely no

  • @akhila5858
    @akhila5858 2 роки тому

    Which software is used for these presentations

    • @ByteByteGo
      @ByteByteGo  2 роки тому

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @ReflectionOcean
    @ReflectionOcean Рік тому

    No false negative but probably false positive

  • @edgarhernandezparra7021
    @edgarhernandezparra7021 Місяць тому

    One animation is wrong, when you check the indexes for "ribeye" you're highlighting in red 0, 3 and 4 but then you make the arrows point to 1, 3 and 4

  • @Software.Engineer
    @Software.Engineer 2 роки тому +1

    🤩

  • @reactdeveloper2368
    @reactdeveloper2368 Рік тому

    If anyone has worked with mongoose implemeting bloom filters for full text search it will be nice! Thank you in advance

  • @Overthought7
    @Overthought7 2 роки тому

    Sorry, what? How is a bloom filter more space efficient than a simple set when the filter has to store 3x the bits AND give extra buffer zeroes to prevent false positives?

    • @ByteByteGo
      @ByteByteGo  2 роки тому +1

      A simple set's memory requirement scales linearly with the number of elements.
      A bloom filter's space requirement is capped at the size of the bit-set for the bloom filter itself, which is usually a fraction what a simple set would need.
      Let's go through an example.
      Say there is a set with 1 million keys, with each key hashes to a 64-bit number.
      A simple set would need 8-million bytes to just hold all the keys, ignoring collision handling and other details.
      A bloom filter with about a 1 in 100 chance of collision would need a bit-set with 10 million bits and 7 hash functions. 10 million bits is 1.25-million bytes.
      This is the main purpose of a bloom filter. We trade space for sometimes slightly inaccurate answers.

    • @Overthought7
      @Overthought7 2 роки тому

      @@ByteByteGo Thank you for taking the time! I finally figured it out on Stack-O when I read a comment saying basically, "But. A. Set. Stores. The. Keys." lol. That's just NOT how I think about a set. Why do sets store the keys anyway? Is it only for dynamic sizing? (not saying that's unimportant, rather that it's not inherent to a set - you COULD have a fixed-size set that doesn't do that)

  • @chizuru1999
    @chizuru1999 2 роки тому

    I barely managed to solve box blur and now theres more... 😂💀

  • @jackjacobs2960
    @jackjacobs2960 Рік тому

    Came for my Fujifilm left with a ?coding degree

  • @SphereofTime
    @SphereofTime Рік тому

    1:45

  • @theyruinedyoutubeagain
    @theyruinedyoutubeagain 26 днів тому +1

    Shows that you never truly know someone. I used to love this guy and think he's an amazing teacher, but now that he came out as a lemon hater I can't watch his videos in good conscience anymore. /s

  • @nateshrager512
    @nateshrager512 Рік тому

    Trade off animation doesn't exactly make sense, but great video otherwise. The trade off is between memory and accuracy, not between a lack of both.

  • @theguy5423
    @theguy5423 15 днів тому

    This is exactly how you create boring videos with much efforts

  • @shreyageek
    @shreyageek 3 місяці тому

    no one explains a single question with bloom filter

  • @sleepysheep1015
    @sleepysheep1015 Рік тому

    How does a hash function return multiple values? A hash function should only return one value for a function. It doesn't make sense. What hash functions are typically used for Bloom filters.

  • @daitedve1984
    @daitedve1984 2 роки тому

    Is it so hard to put proper description what is that filter and why we need it??