Dijkstra's Hidden Prime Finding Algorithm

  • Published Feb 10, 2024
  • ⭐ Join my Patreon: / b001io
    💬 Discord: / discord
    🐦 Follow me on Twitter: / b001io
    🔗 More links: linktr.ee/b001io
    Outro background music:
    Mystic Mountain by Purrple Cat | purrplecat.com
    Music promoted by www.free-stock-music.com
    Creative Commons / Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
    creativecommons.org/licenses/...

COMMENTS • 284

  • @yassinesafraoui
    @yassinesafraoui 2 months ago +361

    Dijkstra's algorithm feels like it took the division approach and was like hold on, we can memorize intermediate results of the division to avoid doing it all over every time

    • @Vaaaaadim
      @Vaaaaadim 2 months ago +25

      To me it looks like the sieve of Eratosthenes (I'll abbreviate as SoE) approach, but instead of maintaining a boolean array, we keep track of what is the next prime multiple we would mark, and update when we need to.
      Something that makes it a little less obvious is how for SoE, the video doesn't explain that we can start at the squares of the prime. When we mark the multiples of 7 for instance, we can start marking from 49 onwards, because 14,21,28,35,42 will already be marked by multiples of 2,3,5.
      If we think of SoE as running a loop for multiples of 2,3,5,7,etc then Dijkstra's alg keeps these loops in an intermediate state and does their next iteration when necessary.

    • @slackclient
      @slackclient 2 months ago +14

      You guys both just proved that dijkstra successfully blended the two together @@Vaaaaadim

    • @Vaaaaadim
      @Vaaaaadim 2 months ago +1

      @@slackclient Would you not say that SoE already does the "memorize intermediate results of the division" on its own?

    • @slackclient
      @slackclient 2 months ago +3

      @@Vaaaaadim it's more of a tabulation approach than a memoization approach

  • @reeeeedmil
    @reeeeedmil 2 months ago +183

    i like the dijkstra's algorithm the most tbh

  • @juanmacias5922
    @juanmacias5922 2 months ago +160

    Dang, Dijkstra wouldn't take common consensus as the absolute, that's awesome, thanks for sharing, super interesting video!

  • @mschoenert
    @mschoenert 2 months ago +174

    Nice video - talking about this algorithm that is not so well known.
    Note that your implementation of Dijkstra's algorithm is different from his implementation in an important way - that has consequences for both the performance and the memory footprint.
    You put the multiples into a heap - while Dijkstra puts them into an array. So all updates take logarithmic time in your implementation, while they take constant time in Dijkstra's. It is actually a little bit more complicated, because in Dijkstra's implementation it needs to scan the array until it finds a divisor. But that doesn't cost that much time, because for most of the candidates one of the earliest primes will divide them (half of the numbers are divisible by 2, one-third are divisible by 3, etc.) So overall your implementation is slower than Dijkstra's implementation.
    And you are also putting primes and multiples into that heap that don't need to be there. If you are trying to find all the primes up to N, then you only need to put primes P with P*P <= N into that heap.

    • @prayagpatel9204
      @prayagpatel9204 2 months ago +9

      Great insights thanks

    • @Jack-oi8gn
      @Jack-oi8gn 2 months ago +2

      this was an amazing read, thank you!

    • @rchas1023
      @rchas1023 2 months ago +2

      I have an elementary proof (from Erdos) that there is a prime between N and 2N, for N >= 2, and I think that 2N <= N*N for N >= 2, so there is always a prime between N and N*N.

    • @trueriver1950
      @trueriver1950 2 months ago +5

      A trivial efficiency boost comes from ignoring all even numbers. To do this we need to do two things at startup: put 2 into the list to be output, but ignore it in all the trial divisions (or whatever method). Then start the prime search at 3, always incrementing by 2.
      This simple optimization saves testing half the numbers, and adds no complexity to the loop: all the added code is in the startup.
      The next optimisation is to step in multiples of 6: for each N that is a multiple of six we only have to test N-1 and N+1, because the other odd number in that block must be a multiple of three.
      The easiest way to start this is to dummy in 2 and 3 (neither of which is ever tested), beginning with N=6, where both N-1 and N+1 are tested but never for divisibility by 2 or 3.
      Another option would be to dummy in 2, 3, 5, 7 and start from N=12. Over a long run this saves one more iteration, which is trivial compared to saving a constant fraction. My feeling is that this is less elegant than starting from 6, and not worth the tiny saving.
      This cuts another third off the number of primality tests, at the cost of adding a little code to the loop, and of having to dummy in the primes 2 and 3, which would otherwise be skipped.
      Likewise, for each prime we know up front we can cut off a further fraction of the candidates to be tested, testing more numbers in each iteration but fewer overall. I'll leave it to you to figure out which numbers to test starting from N=30 and going up in 30s (there are more than two in each iteration).
      Stepsize 6 is easy to code and worth doing. Stepsize 30 is harder, and stepsize ?? harder still; at this point another optimisation kicks in, and that's programmer time vs computer run time.
      If you are paying the programmer and paying for computer time, it's then a cost-based decision.
      If you own the computer and are doing it for fun, then that optimisation depends on where the complexity stops being fun. When optimisation becomes tedious, let the computer run whatever stage you've got working and tested at that point...
      OR you then write a program to generate the program for the next stepsize
      What you don't do is ask ChatGPT to write the program 😉

    • @trueriver1950
      @trueriver1950 2 months ago +1

      Pedantic correction: there is a prime between every number >1 and its square 😊
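
    A rough Python sketch of the stepsize-6 wheel @trueriver1950 describes above — an illustrative sketch, not code from the video; the function and variable names here are made up:

    def primes_up_to(limit):
        """Trial division over the 6k±1 wheel: 2 and 3 are seeded and never tested."""
        if limit < 2:
            return []
        primes = [p for p in (2, 3) if p <= limit]
        for n in range(6, limit + 2, 6):          # n walks over the multiples of 6
            for candidate in (n - 1, n + 1):      # only 6k-1 and 6k+1 can be prime
                if candidate > limit:
                    break
                is_prime = True
                for p in primes[2:]:              # skip 2 and 3: the wheel already excludes their multiples
                    if p * p > candidate:
                        break
                    if candidate % p == 0:
                        is_prime = False
                        break
                if is_prime:
                    primes.append(candidate)
        return primes

    print(primes_up_to(50))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]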

  • @user-pw5do6tu7i
    @user-pw5do6tu7i 2 months ago +38

    can't wait to be the smartass who asks the interviewer to specify which dijkstras algorithm I should implement

  • @marlonmarcello
    @marlonmarcello 2 months ago +1

    So glad YouTube recommended your channel, instant sub! Great content, clear and concise explanation and lovely visuals.

  • @tkwilson71
    @tkwilson71 2 months ago +1

    This was an excellent explanation of the different algorithms. Great job, and thank you!

  • @Grumpy6
    @Grumpy6 2 months ago +2

    One of the most elegant and easy to understand videos dealing with prime numbers. Congratulations!👍

  • @indrajitpal63
    @indrajitpal63 1 month ago +1

    This is brilliant ! Thanks for sharing 😃

  • @zeus7914
    @zeus7914 2 months ago +3

    very nice. i hadnt heard of that method before. thanks for bringing it to light

  • @Slit1409
    @Slit1409 2 months ago

    Very good video, nice illustrations!

  • @raizdesamauma8607
    @raizdesamauma8607 2 months ago +11

    I didn't know about this algorithm although in college I had already researched a bit about prime algos. Thank you for the great explanation, clear breakdown of the different algorithms and the benchmarks! Great video

  • @karreevn9085
    @karreevn9085 2 months ago

    This kind of video is extremely useful
    Hope you'll have more videos about algorithms :))

  • @ader7836
    @ader7836 2 months ago

    Amazing video, great explanations!

  • @stanleydodds9
    @stanleydodds9 2 months ago +22

    There are several different ways to optimise the sieve of eratosthenes that massively reduce its space complexity, while keeping its time complexity the same. Note that the main benefit of this is actually to improve caching, and hence speed - memory in general isn't a problem, but not staying in the same place in memory for a while causes lookups to be much slower.
    The most obvious such approach is the segmented sieve. You divide your full range 1 through n into segments of size on the order of sqrt(n), and just sieve the blocks one at a time (with a small optimisation). The first segment is sieved as normal, to get an initial list of primes. Then for all subsequent segments, we already have all the primes needed to completely sieve them. So we only have to mark off composites using the primes we have already found, up to the square root of the largest value in the block. After that, we do one pass on the block, returning all the values that are prime (and adding them to the list of primes if needed).

    • @davidwillmore
      @davidwillmore 2 months ago +2

      That's what I did many years ago and it works very well. If memory does become an issue, you can actually apply this algorithm a third time, but at that point storage of the discovered primes becomes a significant task.
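
    The segmented sieve @stanleydodds9 outlines above, as a rough Python sketch (the names are illustrative and the block size is simply taken as sqrt(n)):

    from math import isqrt

    def segmented_sieve(n):
        """Sieve of Eratosthenes in blocks of ~sqrt(n); only O(sqrt(n)) flags are live at once."""
        if n < 2:
            return []
        block = max(isqrt(n), 2)
        # First block: a plain sieve up to `block` seeds the prime list.
        flags = [True] * (block + 1)
        flags[0] = flags[1] = False
        for p in range(2, isqrt(block) + 1):
            if flags[p]:
                for m in range(p * p, block + 1, p):
                    flags[m] = False
        primes = [i for i, f in enumerate(flags) if f]
        # Later blocks: cross off multiples of the already-found primes only.
        for low in range(block + 1, n + 1, block):
            high = min(low + block - 1, n)
            flags = [True] * (high - low + 1)
            for p in primes:
                if p * p > high:
                    break
                start = max(p * p, ((low + p - 1) // p) * p)   # first multiple of p in [low, high]
                for m in range(start, high + 1, p):
                    flags[m - low] = False
            primes.extend(i + low for i, f in enumerate(flags) if f)
        return primes

    print(len(segmented_sieve(1_000_000)))   # 78498 primes below one million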

  • @sushismitcher225
    @sushismitcher225 2 months ago

    This is really cool!! Thank you for your very good videos, you always make really cool videos.

  • @HarryC4615
    @HarryC4615 2 months ago +1

    Hi b001! Your videos are really well made and clear. I wondered what software/tools do you use to create your videos and the animations. Keep up the good work.

  • @user-rd1tk6gc3v
    @user-rd1tk6gc3v 2 months ago

    I didn't know this algo and it seems like a great one. thanks for the video!

  • @AshishKadam
    @AshishKadam 1 month ago

    Haven't seen such quality content in years, hats off to you sir! ❤😊

  • @nexcode_ir
    @nexcode_ir 7 days ago

    Woww, It was great. Thanks for sharing your precious knowledge

  • @victorferreira5852
    @victorferreira5852 1 month ago

    Very interesting. I myself am a quantum computing physicist currently writing a paper on a quantum algorithm that we found to handle such prime identification tasks. Even though I'm very familiar with algorithms like that, I didn't know about this one from Dijkstra, so that's a really nice video!

  • @satviksrinivas8764
    @satviksrinivas8764 2 months ago

    Great video, thank you!

  • @ezion67
    @ezion67 2 months ago +4

    Your version of the "Sieve" seems to use a whole byte to store a bool instead of just one bit. That is a first huge space saving just up for grabs.
    The second is to forget about all even numbers other than 2. This can halve the space the sieve needs, and for trial division halve the time.
    But in the long run Dijkstra still wins.

  • @lavneetjanagal
    @lavneetjanagal 2 months ago

    Awesome video. Thank you.

  • @kmjohnny
    @kmjohnny 2 months ago

    An elegant presentation of some brilliant algorithm ideas.

  • @Idan_Nesimov
    @Idan_Nesimov 2 months ago

    Amazing video, well done

  • @uselesscommon7761
    @uselesscommon7761 2 months ago

    Okay woah this is the type of content that makes me sub immediately

  • @AK-vx4dy
    @AK-vx4dy 2 months ago +5

    @14:42 Probably true, but a bold claim; on today's machines it depends. If the algorithm is cache friendly then it holds, but otherwise the "space requirement" becomes a "time requirement" because of the slow access time to main memory.

  • @brian554xx
    @brian554xx 2 months ago +9

    For all of these methods, you can advance by two (rather than one) at each step starting at 3. From 3, you check 5, 7, 9, 11, etc., skipping over 4, 6, 8, 10, etc. without even checking them. It's a trivial change to the code and eliminates half of the numbers you would check.
    Also, for trial division, you don't need to compare each prime to the square root. Just find the square root and start there, working downward in the list of primes.
    This is fun. I miss this stuff!

    • @Khwerz
      @Khwerz 2 months ago

      It's faster than the sieve too, but this depends entirely on your CPU and your language's array removal features. If it's C it's probably fast. But the only way the sieve is faster is if you have a file FILLED with bools, 0/1, so you can just seek to the position of the number and check if it's 0 or 1.

    • @chao.m
      @chao.m 2 months ago

      Actually, beyond 6, you only need to test the numbers immediately surrounding a multiple of 6, e.g. 11, 13, 17, 19, 23, 25, 29, 31… That way, you skip 2/3 of the natural numbers because you have skipped all multiples of 2 and 3. But it will be difficult to incorporate that into SoE or Dijkstra's algorithm

    • @przemekkobel4874
      @przemekkobel4874 2 months ago

      Wouldn't that break the algorithm's requirement to update the list of 'smallest multiples' on each tested number?

    • @kvetter
      @kvetter 1 month ago +1

      I implemented this feature of skipping even candidates--it was a very good optimization. On my computer it reduced the time to find all primes under a million from 8.2 seconds to 3.8 seconds. It does break the "smallest multiple" requirement, but the fix is to just update the smallest multiple entry by 2*prime instead of 1*prime.

    • @levieux1137
      @levieux1137 25 days ago

      @@przemekkobel4874 yes it does. It still works to skip even values though, because for each equal value there are exactly two updates, so instead you perform your updates by adding twice the base value. But beyond that for now I failed to make it produce all good values (some are missing or some are extra depending on my tests).

  • @dipeshsamrawat7957
    @dipeshsamrawat7957 2 months ago

    We need more videos like this.
    Thank you so much, dear ❤

  • @ItsJoeyG
    @ItsJoeyG 2 months ago

    I love watching your videos then implementing them into code (without first looking at your example of course) much more fun than leetcode imo since you do a great job at explaining each algorithm. Well earned sub. Thank you :D

  • @CheezePie
    @CheezePie 2 months ago +1

    This is so informative! I'll try to implement Dijkstra's approach in Go.. keep making such awesome content ❤

  • @holyshit922
    @holyshit922 2 months ago

    When I was learning programming at high school we used this approach.
    It is based on the fact that divisors come in pairs (for squares the two numbers are equal),
    so it is not necessary to check numbers greater than the square root of the upper bound of the interval.

  • @Bolpat
    @Bolpat 2 months ago +3

    In the sieving algorithm, when setting multiples to false, one can start at the square of the number and then move on in increments of double the number. E.g. for 5, one can skip 10, 15, and 20, and set 25, 35, and so on, to false. The first big skip works because the lower multiples are also multiples of smaller primes. The double step works because every second multiple is even and therefore not prime. Care must only be taken to special-case 2, which obviously can't use that.
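
    A small sketch of the crossing-out loop @Bolpat describes: after special-casing 2, marking starts at p*p and steps by 2*p so even multiples are never touched (illustrative names, not the video's code):

    def sieve(n):
        """Sieve of Eratosthenes, crossing out from p*p in steps of 2*p for odd p."""
        if n < 2:
            return []
        is_prime = [True] * (n + 1)
        is_prime[0] = is_prime[1] = False
        for m in range(4, n + 1, 2):                   # 2 is the special case: its multiples go in steps of 2
            is_prime[m] = False
        p = 3
        while p * p <= n:
            if is_prime[p]:
                for m in range(p * p, n + 1, 2 * p):   # start at p*p, skip the even multiples entirely
                    is_prime[m] = False
            p += 2
        return [i for i, f in enumerate(is_prime) if f]

    print(sieve(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]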

  • @cppexplorery
    @cppexplorery 1 month ago +1

    This man really does cool things 💪

  • @wooooooo_Oo
    @wooooooo_Oo 1 month ago

    Great video. Just a note on the colors chosen for the Sieve of Eratosthenes' boolean array. I'm red/green colorblind and had a hard time seeing true and false values because the false value was not distinct enough against the background. Also, the yellow for true wasn't bright enough for me to quickly identify.

  • @andersama2215
    @andersama2215 2 months ago

    The reason for using the square root is probably quicker to pick up if you consider how you're checking for primes. If you're testing for multiples (or divisibility, however you'd like to put it) you'll be looking at two numbers A and B, where A == B or not. If we start a brute force search where A < B, the first thing to notice is that if we ever reach a case where A == B, then by definition the number we're testing can't be prime*. The second thing to notice, as you pointed out, is that once the search flips over to A > B, we're just observing the commutative property of multiplication at work: every A we'd be checking would have been a B we've already seen paired. So we only have to brute force up to a number whose square does not exceed our original number.

  • @tiwarey
    @tiwarey 2 months ago

    I feel like I have found a goldmine! Loving your videos. Also what theme are you using?

  • @firstnamelastname-oy7es
    @firstnamelastname-oy7es 2 months ago +2

    Booleans usually take up a whole byte in computers due to the way memory is accessed at the byte level, even though a single bit is enough to represent true or false.
    There's no reason not to use bits instead of booleans if memory efficiency is the main concern of the algorithm. 30 bits fit into 4 bytes, so you would only need 4 booleans' worth of space instead of 30.
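
    A sketch of that bit-packing idea (one flag per bit in a bytearray), which needs roughly eight times less memory than one byte per bool; the function names are illustrative:

    def bit_sieve(n):
        """Sieve of Eratosthenes storing one flag per bit instead of one byte per bool."""
        bits = bytearray([0xFF]) * (n // 8 + 1)     # every flag starts as 1 ("maybe prime")

        def clear(i):                               # mark i as composite
            bits[i >> 3] &= ~(1 << (i & 7))

        def test(i):                                # is i still marked as prime?
            return bits[i >> 3] >> (i & 7) & 1

        clear(0)
        clear(1)
        p = 2
        while p * p <= n:
            if test(p):
                for m in range(p * p, n + 1, p):
                    clear(m)
            p += 1
        return [i for i in range(2, n + 1) if test(i)]

    print(len(bit_sieve(1_000_000)))   # 78498, using ~125 KB of flags instead of ~1 MB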

  • @1m_abir
    @1m_abir 2 months ago

    thanks for sharing this amazing content
    I like Dijkstra's algo more than Prim's

  • @user-vu8ko8oj4l
    @user-vu8ko8oj4l 2 months ago +1

    Similar to only checking odd numbers greater than 2, when updating the next multiple of an odd prime p, add 2p not p, since one can skip over the even multiples.

  • @CrapE_DM
    @CrapE_DM 2 months ago

    This is the first time I've seen anyone apply the square root stopping point to the Sieve. Good to see it

  • @mandrak87
    @mandrak87 2 months ago +6

    Why not always skip even numbers larger than 2? No need to check them as they can never be prime (divisible by 2). Would save half the checks.

    • @durgaprasad814
      @durgaprasad814 2 months ago

      A computer finds out whether a number is even by taking it modulo 2.

    • @mandrak87
      @mandrak87 2 months ago +2

      @@durgaprasad814 True but you could use a boolean flag that flips back and forth to skip every other number.

    • @matthewp1988
      @matthewp1988 2 months ago

      You still have to check the other multiples to increment them. For example 12 at 12:26

    • @durgaprasad814
      @durgaprasad814 2 months ago +1

      @@mandrak87 yes, that operation will be faster. However, can we parallelise the division approach?

  • @wing-it-right
    @wing-it-right 2 months ago

    truly a good video

  • @raxirex6443
    @raxirex6443 2 months ago

    good informative video 👍

  • @vietnguyen-ki3fe
    @vietnguyen-ki3fe 2 months ago +1

    First time heard about this algorithm.Thank you for your great video ♥

  • @Melds
    @Melds 2 months ago +2

    Interesting topic! Hopefully some feedback is ok. Your video was good, but I think it was a bit wordy in parts. For example, almost six minutes were spent on trial division but there were a lot of repeated parts that could have been simplified and condensed. The square root part also seemed intuitive already from factor tables, so I thought it didn't need much time.

  • @ingiford175
    @ingiford175 2 months ago +6

    Since square roots are hard to find, when I did this I kept track of the squares of the various numbers: 2 (4), 3 (9), and so on, and only started using each prime once the candidate reached the number in parentheses.

    • @keypey8256
      @keypey8256 2 months ago +2

      Square roots are as fast as division on modern CPUs

    • @ingiford175
      @ingiford175 2 months ago

      @@keypey8256 When I did this, some computers still had 8 inch floppy disks.

    • @Pluki1
      @Pluki1 2 months ago

      @@keypey8256 how do computers calculate square roots and division?

    • @sayedaulia
      @sayedaulia 2 months ago

      ingford175's Algorithm

  • @iofish__
    @iofish__ 2 months ago +15

    Really interesting video. I didn't know about Dijkstra's method.
    As an aside trial division can be made faster by only looking at numbers which are a multiple of 6 plus or minus 1 and then you can skip checking whether the numbers are even or divisible by 3. You also don't need to do real square root calculations, just integer square roots which are a lot cheaper.
    With the sieve method you could probably save space by using int128s and flipping the bits accordingly rather than storing every true or false as an 8 bit bool.
    But I like Dijkstra's method the most

    • @luigidabro
      @luigidabro 2 months ago +5

      128-bit processors don't exist yet, and when 64-bit processors deal with numbers larger than their word size, they do some very expensive conversions. So a usize (32-bit on 32-bit targets, 64-bit on 64-bit targets) would probably be faster.

    • @iofish__
      @iofish__ 2 months ago +1

      ​@@luigidabro Ah okay I didn't know that. So an int64 would be fastest on a modern computer then

    • @luigidabro
      @luigidabro 2 months ago

      @iofish__ better use an unsigned value, as lshift and rshift on signed values is language dependent.

    • @jalsol
      @jalsol 2 months ago

      using bitset may be fast, but when combined with segmented sieve (it's the same sieve of Eratosthenes but you sieve one small segment at a time instead of the whole range at once), it turns out worse than just the segmented sieve
      CPU caching is also very important as well

  • @SamDsk
    @SamDsk 2 months ago

    When it comes to large prime numbers we use a probabilistic approach to determine whether the given number is a prime. These are called probabilistic primality tests, like the Miller-Rabin algorithm.

  • @69k_gold
    @69k_gold 2 months ago

    Hey, amazing video as always. Just wanted to know, which software/website do you use for these animations? Is there a simple and fast software to animate if we only require text and shape animations?

  • @cthutu
    @cthutu 2 months ago +11

    Great video and content. But may I leave some feedback about one of your visualisations? When you were showing the boolean array for the sieve method, you used green for true and red for false. As a colour blind person, I struggled to see which boxes were green and which were red. I would have suggested using green for true, but the original purple colour for false (i.e. not "lit").

  • @mschoenert
    @mschoenert 2 months ago +9

    I believe that the sieve of Eratosthenes with a sliding window is a better middle ground between the straightforward sieve and trial division.
    The sieve with a sliding window has a memory footprint that is comparable to Dijkstra's algorithm and a performance that is comparable to the straightforward sieve. To be more concrete - Dijkstra's algorithm requires an array for the primes and an array of multiples. The sieve with a sliding window requires those and additionally the sliding window. But already with a sliding window of a few kBytes it achieves a performance that is about as fast as the straightforward sieve (if not faster because it is more cache friendly).
    Don't get me wrong. I am a huge fan of Dijkstra. I believe that he is one of the most brilliant minds to ever work in computer science. And his "Notes on Structured Programming (EWD 249)" are a seminal paper that brought us structured programming. And this prime generating algorithm is *just* an example that he uses to show how to derive an algorithm in this structured way. He is so brilliant that even his examples are amazingly clever.
    But from an engineering perspective, I claim that Dijkstra's algorithm is never optimal. If you only need a few hundred primes and runtime is not so important, then trial division is super simple, has a minimal memory footprint, and runs fast enough. But if you need more primes and runtime is important, then the sieve with a sliding window is better than Dijkstra's: it has basically the same memory footprint and runs much faster. How much faster really depends on the number of primes that you want to generate, but for large enough N it can be more than an order of magnitude.

  • @dominiquecolin4716
    @dominiquecolin4716 2 months ago

    It would be interesting to find a 'green' measure that takes into account the CO2 cost of the data center plus the time of calculation for each approach.
    I think we should go for such stuff, because time always seems the most important, since data storage is no longer perceived as an issue.
    It was different from 1970 till 2000.

  • @mohamadrezamohamadi8393
    @mohamadrezamohamadi8393 2 months ago

    Wow, I really like these types of videos. It would be so good if you added timing to the videos.

  • @bennettzug
    @bennettzug 2 months ago

    14:20 I was curious and the time complexity of the sieve of eratosthenes is O(n log log n), which i think is the first algorithm i've seen like that lol

  • @PatrickPoet
    @PatrickPoet 2 months ago

    fun video, thanks, I love stuff like this. your pronunciation of sieve isn't the British or American pronunciation, where is it from?

  • @abdulhamedeid935
    @abdulhamedeid935 2 months ago +1

    you are fuken cool man
    how are you like that ❤
    the license of your work
    the animations and visuals
    the story telling
    you are amazing i love you.
    can you also write in description the tools you use?

  • @gr.4380
    @gr.4380 2 months ago

    Here's the thing about space efficiency: even though most consumer apps can use more memory for the sake of performance, training AI is very space intensive, so the higher time requirement might be a better tradeoff than more memory used.

  • @emreakdeniz7524
    @emreakdeniz7524 2 months ago

    a brilliant video and a brilliant idea...

  • @OneTrueBadShoe
    @OneTrueBadShoe 2 months ago

    Dijkstra has been my favorite mathematician since 1994. Just brilliant.

  • @CortVermin
    @CortVermin 2 months ago

    that algorithm is plain brilliant.

  • @MI08SK
    @MI08SK 1 month ago

    It is possible to optimise the sieve of Eratosthenes implementation by storing booleans in one bit each,
    reducing memory by 8x

  • @thevalarauka101
    @thevalarauka101 2 months ago +1

    "so instead we need to increment our smallest multiple" and suddenly my brain just goes like, oh yeah, that makes sense, that's why this algorithm is better

  • @simplefahrenheit4318
    @simplefahrenheit4318 2 months ago +1

    Does the sieve of Eratosthenes (in modern implementations) use the units digit being 1, 3, 7 or 9 to create the initial boolean array? Above 5 those are the only possible last digits of prime numbers

  • @PlasmaUI
    @PlasmaUI 2 months ago +22

    your videos are extremely well made, you should be getting at least 200k views on these videos
    i also prefer the sieve of eratosthenes, really fast and who cares abt space tbf

    • @manifestasisanubari
      @manifestasisanubari 2 months ago

      I agree that his videos are very well made and deserve more views. Just commenting this to help his video ❤

    • @mattsadventureswithart5764
      @mattsadventureswithart5764 2 months ago

      I don't think anyone cares about space any more, but in the days of most people having access to nothing better than 8bit micros, it was v important.

    • @vectorlua8081
      @vectorlua8081 2 months ago +2

      When you're checking all primes up to 4 billion it matters, a lot.

    • @mschoenert
      @mschoenert 2 months ago +1

      @@vectorlua8081 - quite so. As others have remarked, this can be solved by using a segmented sieve (aka sieve with sliding window). My implementation computes the ~200'000'000 primes up to 4 billion in roughly eight seconds on my (not very fast) desktop.

    • @vectorlua8081
      @vectorlua8081 2 months ago

      @@mschoenert Nice, what language are you using? The reason I remarked on the size requirement is that not everyone has 32 GB or even 24 GB of RAM, eventually the computer would start using the swap memory and then things start getting quite slow.

  • @sergejnekrasov7688
    @sergejnekrasov7688 2 months ago +1

    For any Human Resource Machine friends: Dijkstra's method becomes slower if your ALU has no multiplication hardware :(

  • @vedicdutta2856
    @vedicdutta2856 2 months ago

    What would be the time complexity for this wonderful approach in Big-O notation?

  • @MusicEngineeer
    @MusicEngineeer 2 months ago +3

    Very interesting. I didn't know this algorithm. What is the actual space- and time-complexity of the 3 algorithms in terms of Big-O?

    • @minamagdy4126
      @minamagdy4126 2 months ago

      For an upper limit of n and number of primes p, the space is O(p) over the length of the output (itself O(p)), while I believe the time complexity to be O(n log(p)) with a modified min-heap (the modification is to allow getting all nodes of a value simultaneously or sequentially, as all nodes of a value are to be raised upon coming to said value. The value will be the multiple of the prime in the tuple).

  • @MrJasbur1
    @MrJasbur1 2 months ago

    Also, for the Sieve of Eratosthenes, there are tricks that use the fact that the perfect squares are just sums of consecutive odd numbers starting with 1. So, like 1=1^2, 1+3=4=2^2, 1+3+5=9=3^2… and the pattern has been proven to continue forever.

    • @MrJasbur1
      @MrJasbur1 2 months ago

      Sorry, it’s that fact, plus the fact that you can start crossing out multiples starting with the square of the newly identified prime number in the sieve.
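
    For what it's worth, the identity 1 + 3 + 5 + ... + (2k-1) = k^2 means a sieve can track each square incrementally instead of multiplying; a tiny illustrative check:

    # Each square differs from the previous one by the next odd number: k*k = (k-1)*(k-1) + (2*k - 1).
    square, odd = 1, 1
    for k in range(2, 8):
        odd += 2
        square += odd
        print(k, square, k * k)   # the last two columns always agree: 2 4 4, 3 9 9, 4 16 16, ...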

  • @MrJasbur1
    @MrJasbur1 2 months ago

    Just a minor mention on trial division. We don’t have to explicitly calculate a square root. Instead, if we only go up to the largest prime number that can be squared to get a value still less than or equal to the value being trial divided, we know that’s the last prime number that has to be checked.
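
    In code, that bound is just a p*p <= n comparison (math.isqrt would also give an exact integer square root); a minimal illustrative check, assuming the list of known primes covers everything up to sqrt(n):

    def is_prime(n, primes):
        """Trial-divide n by known primes, stopping once p*p exceeds n (no square root needed)."""
        for p in primes:
            if p * p > n:        # every prime up to sqrt(n) has been checked: n is prime
                return True
            if n % p == 0:
                return False
        return True

    print(is_prime(97, [2, 3, 5, 7]))   # True: no prime up to sqrt(97) divides 97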

  • @anon_y_mousse
    @anon_y_mousse 2 months ago

    I wonder if we could speed up Dijkstra's algorithm by using a multiway hash table where the primes and the multiples are both used as keys that link to each other. Could index the multiple table by a combination of the prime and its multiple to avoid conflicts where two primes have the same multiple. My usual method of using a table size that's a power of two would fit neatly here because we want to avoid division.

  • @andypyne
    @andypyne 2 months ago

    I like the Miller-Rabin probabilistic primality test. That's super quick, with the caveat that it's not calculating primes, but it is very accurate at finding candidates - for any desktop method of finding primes, Miller-Rabin is faster and as accurate. Miller-Rabin breaks down at numbers higher than a desktop computer can easily calculate anyway

    • @b001
      @b001  2 months ago +1

      Interesting, I’ve never heard of it. I’ll look into it. Thanks for sharing!

  • @BramCohen
    @BramCohen 2 months ago

    In practice the memory requirements tend to be either/or. Either the requirements are below the available memory in which case who cares, or they're above it and your algorithm is swapping to disk and if its access patterns are bad enough it will finish long after you're dead.

  • @greensalad_1205
    @greensalad_1205 11 days ago +1

    0:29 Trial of Eratosthenes

  • @_fudgepop01
    @_fudgepop01 2 months ago

    I think when it comes to embedded systems, Dijkstra's prime algorithm is the best one
    When space AND time are both important, with the relative limitations possible within embedded hardware, it would be the best with both of its trade offs
    …that said, at that point you might as well just calculate the list of primes externally or with intermediate steps depending on how far you need to go and just encode that as a lookup table lol

  • @EternalLoveAnkh
    @EternalLoveAnkh 2 months ago

    Do you have a link to his paper on this?
    RJ

  • @rutwikroutu
    @rutwikroutu 2 months ago +1

    Which microphone do you use ?

  • @sentinelaenow4576
    @sentinelaenow4576 2 months ago

    I've found Miller-Rabin primality test to be extremely efficient, even for very large numbers, like 20 digit numbers.
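
    For reference, a compact probabilistic Miller-Rabin sketch (random bases, so a tiny error probability remains; fixed deterministic base sets also exist for 64-bit inputs):

    import random

    def miller_rabin(n, rounds=20):
        """Probabilistic primality test: composites always return False, primes return True with high probability."""
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        d, r = n - 1, 0
        while d % 2 == 0:                 # write n - 1 as d * 2^r with d odd
            d //= 2
            r += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False              # a witnesses that n is definitely composite
        return True

    print(miller_rabin(2**61 - 1))   # True: a Mersenne prime far beyond what a sieve array could hold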

  • @TerjeMathisen
    @TerjeMathisen 2 months ago

    If you are comparing algorithmic efficiency: space vs time, then I really think you should extend your sieve with a few very natural extensions:
    a) Making the array a bit array reduces the memory needed from 8 Mbit to 1 Mbit or 125 KB.
    b) Just skipping all the even numbers (just remember that 2 was in fact prime) reduces it by another 50% to ~62.5 KB, at which point it fits easily in the CPU L2 cache.
    c) If we also skip all multiples of 3, then there are only 2 candidate numbers to check in each group of 6, i.e. at this point we only need ~41 KB.
    d) One final step, removing all multiples of 5, means that we have exactly 8 remaining candidates in each group of 30 numbers, so memory usage is down to N*8/30 bits, which for N=1e6 means just 33 KB.
    Finally, when crossing out multiples of the current prime P, we only need to start at P*P, and then the following odd multiples, until we get to the (precalculated) sqrt(N) limit.
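
    The eight candidates per block of 30 that step (d) above relies on are just the residues coprime to 30; a quick check:

    from math import gcd

    wheel = [r for r in range(30) if gcd(r, 30) == 1]
    print(wheel, len(wheel))   # [1, 7, 11, 13, 17, 19, 23, 29] 8  -> hence N * 8/30 bits of flags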

  • @shanehebert396
    @shanehebert396 2 months ago

    Question: your Python code for Dijkstra's is slightly different from the description in the video, I think. In particular, in the video, when the current number being considered is 12, you describe looping over all the pool entries whose multiple is equal to the currently considered value and updating those pool entries. In the code, that's only done once (in an 'if' statement, not a 'while'). When I change the 'if' to a 'while', it gets the wrong answer (I just saved the return values from each function and compared the length of Dijkstra's return with the Sieve's return and they are different; didn't check any deeper). I must be missing something.
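
    I can't speak for the video's exact code, but for reference here is a minimal heap-based sketch (all names here are made up) in which every pool entry equal to the current candidate is advanced in a loop; in this formulation that loop is exactly what keeps the result correct:

    import heapq

    def dijkstra_primes(limit):
        """Dijkstra's prime idea with a heap of (next multiple, prime) pairs; assumes limit >= 2."""
        primes = [2]
        pool = [(4, 2)]                     # smallest pending multiple of each tracked prime
        for n in range(3, limit + 1):
            if n < pool[0][0]:
                primes.append(n)            # n is below every pending multiple, so it is prime
                if n * n <= limit:          # only primes up to sqrt(limit) ever need to mark anything
                    heapq.heappush(pool, (n * n, n))
            else:
                while pool[0][0] == n:      # several primes can share this multiple (e.g. 12 = 2*6 = 3*4)
                    multiple, p = heapq.heappop(pool)
                    heapq.heappush(pool, (multiple + p, p))
        return primes

    print(dijkstra_primes(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]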

  • @MichaelDoornbos
    @MichaelDoornbos 2 months ago

    Easily the best explanation of these concepts I've ever seen. Bravo.

  • @programmertheory
    @programmertheory 2 months ago

    I wonder. Were these algorithms optimized in any way? For example, the trial division method can be optimized for numbers greater than 3 by utilizing the fact all primes > 3 take the form of 6k +-1. Meaning assuming n > 3, we can put 2 and 3 in the primes list and then iterate over 5,7,11,13...n and not add the numbers in the list that are not prime, like 25, which is 6(4) + 1. This does improve the trial method by quite a bit.

  • @milanaanitha5506
    @milanaanitha5506 2 months ago

    Please make a video on finding the max contiguous subarray sum

  • @nk4g26
    @nk4g26 7 days ago +1

    this algorithm is 1624x faster than Dijkstra's algorithm
    from numpy import arange, ndarray
    # i use arange because it faster than range and i want to return array
    def is_prime(start: int, stop: int) -> ndarray:
        if not start & 1 == 1:  # In the binary system, every prime number ends with a 1, so I use a binary comparison operation (( & )) to check if the last digit is not 1.
            return arange(start+1, stop, 2)  # The condition is true if the starting number is even; therefore, I add 1 to make it prime and then step by 2 to yield a prime number each time.
        return arange(start, stop, 2)
    for Example:
    is_prime(2, 10)
    [3 5 7 9]
    and this for even
    def is_even(start: int, stop: int) -> ndarray:
        if not start & 1 == 0:
            return arange(start+1, stop, 2)
        return arange(start, stop, 2)

  • @CallOfCutie69
    @CallOfCutie69 1 month ago

    This is the moment Walt became Heisenberg Prime.

  • @YouHaventSeenMeRight
    @YouHaventSeenMeRight 2 months ago +1

    Since the pool also maintains a list of found primes (together with the multiples), why do we need an additional primes list?

  • @GabrielMancini1
    @GabrielMancini1 2 months ago

    Wow great video

  • @zyklos229
    @zyklos229 2 months ago

    That's the sieve, but with minimum look-ahead, which saves space.

  • @snoopy1alpha
    @snoopy1alpha 2 months ago

    Thanks for the video. That algorithm is great. In my opinion it is superior to both more "classic" solutions because it allows for bigger integer representations in other languages than Python (I don't know much about Python, but I would guess that it uses something like BigInteger for numbers exceeding "long"). In a language like Java or Kotlin your array size is limited to (2^31) - 1 (int). This is a very big number (a bit more than 2.1 billion), but you cannot go beyond that limit with the sieve approach (or you will consume a lot more memory for that). Arrays with "long" indexes are not supported (up to (2^63) - 1).
    Dijkstra's approach does not use arrays (or at least I would not implement it with an array). But even if you would use an array it would require less entries and you could explore the long range.
    I will try to code this in Kotlin without looking at your Python code (as an exercise). I'm not sure how far I can explore with it, but it will be an interesting challenge.
    When using the sieve with Java or Kotlin, you could use a BitSet instead of a boolean array. You are still limited to int indexes, but the space requirement is far less, because it internally maps the bits to single bits inside of long numbers inside of an array-like structure. If you explore the whole int range, it would require around 260MB instead of the > 8G you used for 1 billion in your Python program. Maybe Python has a similar data structure? It would be worth checking.

    • @snoopy1alpha
      @snoopy1alpha 2 months ago

      Small update from my side (I didn't even get to the Dijkstra prime variant yet):
      I first implemented the classical two algorithms in Kotlin. For some reason your trial division algorithm in Python is way more efficient than mine in Kotlin (space-wise and especially time-wise). In fact, I actually had more space problems with my trial division algorithm than I had with my sieve algorithm.
      The trial division algorithm requires you to store your result in a data structure that makes them easily accessible. I first tried a list but that was a really bad decision. In the JVM (Kotlin runs on it), lists of primitive types require you to box them as objects which heavily increases the memory consumption of every single number. So I decided to use an array of primitive integers. Arrays in the JVM require to have a fixed length. But I cannot know the exact length of the array. I could calculate an approximation using the prime number theorem. However ... the formula x/ln(x) gives you a lower bound, but I require an upper bound. The "other formula" Li(x) is beyond my ability to code in Kotlin (or any other language ... actually I cannot calculate it at all because I lack the math knowledge required to do so). So I decided to use an array of length "limit" (which is the upper bound of the range you check for primes). In doing that, I waste a lot of memory.
      For some reason my trial division variant is way slower than yours. Your example (limit 1M) took 3 seconds, mine took almost 9. At the limit of 5M yours took 25 seconds, mine took almost 3 minutes. I haven't checked your Python code yet, but I certainly will do so when I'm done with the third algorithm. Most likely I'm doing something wrong.
      Now to the sieve algorithm. I tried my BitSet-approach and it was way more successful than I initially thought. For Int.MAX_VALUE as limit (2^31 -1 ... or about 2.1 billion) my algorithm took 18.2 seconds. I cannot check the size of my data structure as easily in the JVM than you can in Python. However, I limited the maximum JVM size to 260 MB and it worked. If I go down to 256, an exception is thrown when initializing the BitSet. When trying to run the trial division algorithm with that max heap size, it runs into the exception immediately.
      With some shenanigans, I managed to use the BitSet approach with the trial division algorithm. However, this algorithm requires to iterate over the primes found so far. The BitSet is slower for this purpose (compared to iterating over an array of primes found so far), resulting in an even slower runtime for trial division.
      I am still planning on implementing the Dijkstra variant, but I don't think I can outperform the sieve algorithm with BitSet as data structure.
      I have to thank you again for this challenge. You really got me hooked :D

    • @snoopy1alpha
      @snoopy1alpha 2 months ago

      Second update from my side:
      I figured out my performance issue with the trial division algorithm. I messed up the square root condition for terminating the intermediate steps. I always tested all primes found so far instead of stopping at the square root of the current number. After fixing that my algorithm outperforms yours time-wise (I might have a better CPU and/or Python might be slower in general than Kotlin@JVM). I also optimized the trial division by checking only the odd numbers and leaving out the even ones entirely. In doing so, the trial division variant does pretty well (e.g., limit of 10M in 13 seconds).
      I also solved the space problem I had by actually figuring out how to approximate the pi-Function (exact number of primes up to a limit) with the logarithmic integral function (Li(x)). This worked out exceptionally well. Now I'm using up to around 460 MB for a limit of 2.1 billion (Int.MAX_VALUE or 2^31-1) instead of 8 GB.
      My BitSet variant of the sieve approach outperforms the trial division both in time and in space complexity. As mentioned before, it uses about 260 MB for its data structure.
      Now the elephant in the room: my implementation of the Dijkstra prime algorithm. My first try was very underwhelming. It was way slower than my highly optimized trial division, while using up 3 times the space. I used two arrays, one for the primes found and another one for the multiples. The second array had to be long numbers instead of int, doubling the size of the array. I certainly did something wrong. I analyzed what I did and figured out that retrieving all (prime, current multiple) pairs at the minimal multiple is the expensive operation. My approach was linear in complexity, resulting in this fiasco. I tried to optimize the linear search (finding a clever termination condition), but failed.
      Then I decided to try a better data structure for the "pool". I did two things. First: I don't need all primes in the pool, only those smaller than the sqrt of the limit. This significantly reduces the size of the pool data structure, independently of which one I choose. Even the multiples didn't have to be in the long range anymore. For tracking the primes, I am now using the same data structure as for the trial division approach (with the same way of predetermining the array size). For the pool I created a class containing a prime and its current multiple. Instead of an array, I put the objects of this class into a priority queue (internally it is a heap) that is automatically sorted by the value of multiples. When adding a new prime I put it into the pool if the prime is smaller than the square root of the limit. When updating a minimal multiple, I remove all minimals from the priority queue, increase their multiple value and put them back into the priority queue.
      When implementing it like that, I still was slightly disappointed at first. It was still outperformed by the trial division for fairly huge limit values. When I tried one billion as a limit the Dijkstra variant was finally significantly faster than the trial division. Either my trial division optimizations have been "too good" or the priority queue approach still has potential for improving.
      As conclusion for my findings I want to summarize my three implementations by showing you how they perform when trying to solve the limit of 2.1 billion.
      - Trial division requires 460 MB of space and around 21 minutes of runtime.
      - Sieve of Eratosthenes requires 260 MB of space and around 18 seconds of runtime.
      - Dijkstra prime requires 460 MB (primes) + 115 KB (priority queue) of space and around 8 minutes of runtime. It only uses slightly more memory than trial division but outperforms it for bigger limits by a lot.
      All three have one "expensive" operation or "flaw" that should be avoided.
      - Trial division suffers from an almost quadratic complexity and is still pretty slow, even when optimized
      - The BitSet structure, I used for sieve, has a speed disadvantage when retrieving all primes. You do not iterate over the primes but over all numbers and evaluate the associated bits for deciding which numbers are prime. That was the reason why it is inferior for being used in the trial division algorithm (see first update comment).
      - The Dijkstra approach can only be as good as the strategy used for maintaining the minimal multiple values.
      In a nutshell: the space-optimized variant of the sieve of Eratosthenes outperforms both the trial division approach and the Dijkstra approach in both space and time complexity, and in my opinion may truthfully be called superior to both... even if this counters the core message of this wonderful video. The Dijkstra approach is interesting but sadly of limited use.
      I guess now I will have a look at your Python code 😀
      Addition: I just looked at your Python code. You also used a heap/priority queue for the pool. You could optimize the heap size by stopping adding primes after the primes exceed the sqrt of your limit. Did you try higher limits? I wonder if your Dijkstra implementation is still close to sieve if you exceed 1 billion as limit (Spoiler: It isn't. For 1 billion, sieve took around 80 seconds and Dijkstra exceeded my patience 😀)

  • @hallamw
    @hallamw 2 months ago

    Thanks!

    • @b001
      @b001  2 months ago

      Thank you so much!!

  • @brandon-pacheco
    @brandon-pacheco 2 months ago

    Is there an option to move him aside forcefully or challenge him to Gwent?

  • @LynxaaOG
    @LynxaaOG 2 months ago +3

    In theory you could reduce the space complexity for the Sieve of Eratosthenes approach by using a bitset instead of booleans.

    • @Vaaaaadim
      @Vaaaaadim 2 months ago

      In the course I first learned about the Sieve of Eratosthenes, we were told from the get-go to implement it using a bitvector.
      Sidenote, technically, the space complexity would be exactly the same. Using a bit instead of a boolean would certainly use 8x less space or more depending on the programming language, but Big O complexity doesn't care about constant factors.
      I wonder if using a compression algorithm would improve the sieve's space efficiency, I would guess yes.

    • @LynxaaOG
      @LynxaaOG 2 months ago

      @@Vaaaaadim Good catch, the space complexity would be the same regardless, but the memory usage would be less with a bit vector. I mixed these two terms up.
      You could pack bits that are 1 into a structure, but at that point, that's more for storage concerns, not runtime memory performance.

    • @Vaaaaadim
      @Vaaaaadim 2 months ago

      @@LynxaaOG What is the difference between storage concerns and runtime memory performance?

    • @LynxaaOG
      @LynxaaOG 2 months ago +1

      @@Vaaaaadim I just commented that in response to you mentioning the possibility of using compression to improve the efficiency, which is typically only applicable if you were to store data on disk, while you could in theory compress the memory during runtime, you'd be adding additional performance costs to achieve that, which would hurt the algorithms performance.

  • @beepbop6697
    @beepbop6697 2 months ago

    All of these algorithms can be improved by just incrementing by two to skip all the even numbers. 2 being the only special case prime number that is even (all other primes are always odd).

  • @chockablock34839
    @chockablock34839 2 months ago +2

    I am a programmer and qualified in computer science. But I want to take a more pragmatic tack, a modern spin on this let's say. There is no need to keep calculating this beyond the limits of what our highest known prime currently is; it is purely an exercise but of no use. If you need primes, use a lookup table. Nobody needs to go beyond numbers probably in the millions. We should understand these algorithms, but not spend too much time on them. Excellent video in my opinion though. It kind of reminds me of how the greatest data compression is a hyperlink. However we still need to understand data compression.

  • @DeathSugar
    @DeathSugar 2 months ago

    There's a more space-efficient way to find primes, similar to Dijkstra's algorithm, called the Pritchard sieve. It should use less memory than Dijkstra's and has performance close to the sieve of Eratosthenes.

  • @alexbf4724
    @alexbf4724 2 months ago

    Dijkstra's algorithm is mindblowing

  • @deatho0ne587
    @deatho0ne587 2 months ago

    Sqrt is normally slower than division too, which was part of Dijkstra's version as well.

  • @Jetpans
    @Jetpans 2 months ago +2

    Would be cool to see trial division with the optimization that all primes except 2 and 3 are of the form 6n-1 or 6n+1. Because that seems to me like the only algorithm that could methodically calculate "the largest prime number".

    • @solqr6521
      @solqr6521 2 months ago +2

      How exactly are you planning to use that as an optimization? The first steps of the trial division algo test modulo 2 and 3. If the number passed that point, then it must be of the form 6n-1 or 6n+1.

    • @Jetpans
      @Jetpans 2 months ago +1

      @@solqr6521 you can be checking ONLY the numbers of form 6n+-1. And skip all that are not this form.
      It is only a linear optimization but I think it is pretty big. For example, instead of checking ALL the numbers 40-60, you only need to check 41 43 47 49 53 55 and 59. That comes out to a third of total checks.

    • @aboodahmad8236
      @aboodahmad8236 2 months ago

      You know it's a simple number theory problem?
      P ≡ k (mod 6)
      As P is prime, k can only be one of (1,2,3,4,5,6).
      It can't be 2, 4 or 6, because then P would be divisible by 2 (P is not even), and in the k=3 case P would be divisible by 3 (P is not divisible by 3), so the only cases are (1,5), which is equivalent to (1,-1).
      This formula does not have a big impact on mathematics.

  • @stephenshipley1066
    @stephenshipley1066 2 months ago +1

    Very interesting, but why does the sieve take up so much space if it's a boolean array? If this were composed of bits it would surely be smaller for the 1-30 example. I suspect that this is calculated using an integer array so most of the bits are wasted. Yes, bitwise operations would take longer, but the Sieve of Eratosthenes has time to spare.