Memory, Cache Locality, and why Arrays are Fast (Data Structures and Optimization)

COMMENTS • 389

  • @simondev758
    @simondev758  3 years ago +85

    If you enjoyed this, help support me: www.patreon.com/simondevyt
    As for the question at the beginning, as a first hint, look at how the indices for the arrays are generated in both loops. Write out a few, and think about the pattern in memory. Actual answer below.
    Both loops touch all memory exactly once.
    First loop generates indices sequentially (ie. 0, 1, 2, 3, 4, ...), meaning memory accesses will also be sequential.
    Second loop uses a large stride (0, 4000, 8000, 12000, ...), so cache misses will be common.
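Unrolling the first few inner-loop iterations, as the hint above suggests, makes the pattern easy to see. A minimal sketch (assuming the video's `x * 4000 + y` vs `x + 4000 * y` indexing):

```javascript
// Unroll the first few iterations of each inner loop (N = 4000, as in
// the video) to see which element of the 1D array each one touches.
const N = 4000;

const firstLoop = [];   // data[x * N + y] with x = 0
const secondLoop = [];  // data[x + N * y] with x = 0
for (let y = 0; y < 4; y++) {
  firstLoop.push(0 * N + y);
  secondLoop.push(0 + N * y);
}

console.log(firstLoop);  // [ 0, 1, 2, 3 ] — adjacent elements
console.log(secondLoop); // [ 0, 4000, 8000, 12000 ] — ~16 KB apart for 4-byte floats
```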

    • @simondev758
      @simondev758  3 years ago +1

      @@heinzerbrew It's the indexing, just unroll the loop a few times to confirm. Sequential, predictable memory access vs jumping around, main topic of the video. PHP? Don't know much about it, other than it's an interpreted language. Guessing the overhead of that drowns out other things.

    • @simondev758
      @simondev758  3 years ago +1

      @@heinzerbrew Again, you can't use PHP to test this. And this has nothing to do with memory allocations. Additionally, please don't use the pinned comment for this.

    • @hww3136
      @hww3136 3 years ago +1

      would a modern c/c++ compiler be smart enough to detect and optimize it?

    • @simondev758
      @simondev758  3 years ago +11

      This is a pretty contrived example. I tried to think of the absolute most basic thing that would illustrate the point without needing to build a lot of extra boilerplate code around it. In this example, "gcc -O3" doesn't optimize it, but "gcc -Ofast" does, because -Ofast disables strict standards compliance. So the compiler does sorta catch it, but is held back because you have to explicitly tell it that you don't care much about the order. Remember that floating point operations are NOT associative, so the compiler reordering them without explicit permission from the programmer is a no-no. V8 doesn't seem to catch it, but I can't say for sure why.
      So the difference in timings between the loops should still largely come down to the difference in memory access patterns.
      We could make the setup a little more complex with some structs with useless info to pad them out, or an array vs linked list, or something to that effect. Maybe it would have been a better choice? Would be more legit, but harder to parse for beginners. I dunno, I'm learning how to present this as I go heh.

    • @Victor_Marius
      @Victor_Marius 2 years ago

      @@hww3136 what if you intend to iterate that way? You'll get a bug

  • @User-ty2ml
    @User-ty2ml 3 years ago +25

    Best Director Award goes to Simon, I have never seen such a beautiful explanation without answering the question, amazing!!!!

  • @MrLazini
    @MrLazini 1 year ago +13

    This channel is pure gold, man. Although I already know most of these concepts, you explain them pretty clearly and in a very understandable way! Well done friend, subbed!

  • @Chadderbox
    @Chadderbox 4 years ago +198

    I feel like this is going to be an awesome series, especially for people like me who learnt programming by themselves and never really had a formal education teaching them about what their code does under the hood.

    • @simondev758
      @simondev758  4 years ago +94

      I did study computer science, but none of this was covered. This is mostly from years of experience learning at companies and applying these optimizations.

    • @carterearles9528
      @carterearles9528 3 years ago +1

      Me too!

    • @Ainigma
      @Ainigma 3 years ago +8

      @@simondev758 I share this experience. In computer science, we also never covered that. Only when I had to do projects and face performance optimization problems was I motivated to tackle the fundamentals of data structures and how to squeeze more and more efficiency out of my code.

    • @barddz4646
      @barddz4646 3 years ago +6

      @@Ainigma wait really? I'm in the first semester of computer engineering and I've already learned about this. Idk if the way they teach is different here in Brazil or not

    • @simondev758
      @simondev758  3 years ago +19

      Different universities teach different curricula. I went to a university that's considered pretty good (ie. often included in top lists), and the focus was much more on theory and mathematics. I mentored a few Stanford grads while I was at Google, and came away with the same impression: super strong bases in theory/math.
      Also, computer engineering != computer science. I can totally believe that an engineering-oriented curriculum would cover way more about the underlying hardware.

  • @sukkrad7797
    @sukkrad7797 3 years ago +18

    This cleared up so many things that my professor failed to teach me (or even to understand himself)
    We truly need more videos explaining how the cpu works when we code
    Thank you so much!

    • @simondev758
      @simondev758  3 years ago +2

      Nice, lemme know if there's other subjects I should cover too.

  • @cuteswan
    @cuteswan 1 year ago +3

    I dutifully paused the video at the beginning and took _way_ too long to spot the differences between the code samples. (I've never been much of a programmer.) After getting it I still learned a lot more from your discussion on the caches and all that. Thanks.

  • @Rssks
    @Rssks 4 years ago +119

    Wow this video is almost 10 minutes but it felt like 1 minute! I really enjoyed it (Y) will definitely rewatch this later today!

    • @simondev758
      @simondev758  4 years ago +5

      Awesome!

    • @Provitia
      @Provitia 3 years ago +1

      The video became x10 faster for you

    • @psychopompous489
      @psychopompous489 1 year ago

      I was jumping around randomly and it felt like 10 minutes

  • @xeridea
    @xeridea 2 years ago +66

    You didn't explain why the original code posted was 10x slower. The reason being that the method in which the array was effectively traversed was not sequential or easily predictable, causing a lot of cache misses.

  • @eddiemuller3157
    @eddiemuller3157 4 years ago +6

    Good Stuff, my man! The explanation is very clear and the dry humor is appreciated. I went into this video not knowing what a contiguous array was and now I have an idea of what it is.

    • @eddiemuller3157
      @eddiemuller3157 4 years ago

      Question: I understand that the L1, L2, and L3 caches reside on the CPU itself. Is there typically a controller on the CPU, like a northbridge? Or is it the CPU itself that accesses the caches, hence the low latency; no middle man.

    • @simondev758
      @simondev758  4 years ago +1

      That's awesome! That's right, caches typically reside on the CPU. They're also built with a different kind of memory (SRAM), faster. Northbridge is the interface between CPU & main memory and resides on motherboard. That's my understanding anyway.

  • @AllAboutCode
    @AllAboutCode 4 years ago +37

    4:20 you ain't cheap bruh, your content is way better than expensive internet courses.

    • @simondev758
      @simondev758  4 years ago +4

      ty! If you have suggestions for future vids, let me know.

    • @AllAboutCode
      @AllAboutCode 4 years ago +3

      @@simondev758 I'd love this series, just keep doing it

    • @valencefootball9740
      @valencefootball9740 4 years ago +1

      hahaha weed number

    • @masterofdizzzaster
      @masterofdizzzaster 3 years ago +2

      @@simondev758 I hope you will grow so much that you will be able to afford one of those Macs without feeling the price. I've watched your videos for quite some time now and you deserve it.

  • @perelmanych
    @perelmanych 3 years ago +3

    For all those of you wondering why he accessed a 1D array as if it were 2D, here is a real-world example. Assume that in C++ you write a class that represents a matrix. You can go with a 1D array, or with something like vector< vector< int > >, which you can think of as a 2D array. So why then use a 1D array? There are many pros to this approach. One of them is a much faster constructor and destructor. Assume that you need to allocate memory for an NxM matrix m. With a 1D array it is one call to malloc (which asks the system to allocate a chunk of memory for you). With a 2D array you have to call malloc N times, once for each row. The same logic applies when you want to free the memory. Now, once we have chosen to represent our matrix as a 1D array, assume that the user of our class wants to sum all of its elements. The user has no idea how we chose to represent the matrix under the hood; he only knows that he can access an element of the matrix via operator() as m(x, y). Your operator() in turn has to calculate the position of an element by evaluating y * M + x, and in essence this will be exactly the same code as he showed in the video.
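The comment above describes C++, but the same idea can be sketched in JavaScript with a TypedArray-backed matrix class (the class and method names here are made up for illustration):

```javascript
// A matrix stored as one contiguous 1D buffer; get(x, y) maps the 2D
// coordinate to a 1D index with y * cols + x, exactly as described above.
class Matrix {
  constructor(rows, cols) {
    this.rows = rows;
    this.cols = cols;
    this.data = new Float32Array(rows * cols); // one allocation, not one per row
  }
  get(x, y) { return this.data[y * this.cols + x]; }
  set(x, y, v) { this.data[y * this.cols + x] = v; }
}

const m = new Matrix(3, 4);
m.set(2, 1, 42);
console.log(m.get(2, 1)); // 42 — the caller never sees the 1D layout
```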

    • @simondev758
      @simondev758  3 years ago +1

      Awesome explanation, exactly what I had in mind when making the example.

  • @osambrojevanisam3575
    @osambrojevanisam3575 4 years ago +18

    This has got to be the best educational video on youtube, fucking amazing

    • @simondev758
      @simondev758  4 years ago +4

      Awesome, glad you enjoyed it!

  • @thegibusguy4969
    @thegibusguy4969 1 year ago +2

    Hey, thanks for giving me a good explanation on what cache hits/misses were. I've seen it being tossed around but never knew what it meant until now.

  • @WojtekKozlowski1234
    @WojtekKozlowski1234 4 years ago +512

    “why the code on the top runs faster? watch the video and find out” ...and no mention of the code again. Otherwise nice video.

    • @simondev758
      @simondev758  4 years ago +232

      Hah yeah, I feel kinda stupid for not looping around, live and learn. I'll continue the series and make sure I actually answer it next time!

    • @metagen77
      @metagen77 4 years ago +26

      @@simondev758 Nah you managed to explain without relying on code. If anything cut the intro.

    • @bonehelm
      @bonehelm 3 years ago +92

      One is accessing the array in row-major order, and the other is accessing it in column-major order. The top one is faster because the memory is being accessed in a contiguous way, which means you'll get a lot of cache hits (very fast, going to main memory much less). The bottom one is NOT accessing memory in a contiguous way, meaning every access is a cache MISS. en.wikipedia.org/wiki/Row-_and_column-major_order

    • @tagg1080
      @tagg1080 3 years ago +10

      @@bonehelm maybe I am blind but the only thing he switched was the + and *

    • @bonehelm
      @bonehelm 3 years ago +31

      @@tagg1080 That's exactly the only change. He's using a formula to treat a 1D array as if it were a 2D array. But the order of + and * determines whether it's accessed row by row, or column by column.
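Spelling both loops out makes the swap visible. A sketch under the assumption that the video's snippets sum a 1D array indexed as a 2D grid (N shrunk from the video's 4000 so it runs quickly):

```javascript
// The only difference between the two snippets is `x * N + y` vs
// `x + N * y` — i.e. row-major vs column-major traversal of the same buffer.
const N = 512;
const data = new Float32Array(N * N).fill(1);

let rowMajorSum = 0;
for (let x = 0; x < N; x++) {
  for (let y = 0; y < N; y++) {
    rowMajorSum += data[x * N + y]; // sequential: 0, 1, 2, ...
  }
}

let colMajorSum = 0;
for (let x = 0; x < N; x++) {
  for (let y = 0; y < N; y++) {
    colMajorSum += data[x + N * y]; // strided: 0, N, 2N, ...
  }
}

console.log(rowMajorSum === colMajorSum); // true — same work, different memory order
```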

  • @tonnose
    @tonnose 3 years ago +1

    I didn't remember that I'd watched this video before until I saw the L1/2/3 cache example. The drawings make the video so interesting. Now I've come back for the new videos after finishing my algorithm final exam!

  • @TheBeastMaster-i2z
    @TheBeastMaster-i2z 2 months ago

    Amazing video, Simon. I never thought about why arrays are fast, in spite of using them more than a hundred times. Now, since I am working on my own project, I feel the need for them.

  • @vitalino1981
    @vitalino1981 3 years ago +77

    "let's head to apple store. I don't actually own one, because they are super expensive and I'm super cheap, but I like to look at them" 😂🤣😆
    Man this was soooo relatable on so many levels 😂

  • @brucea9871
    @brucea9871 5 months ago +1

    I'm glad I'm not the only one who noticed he didn't answer the original question. I already knew most of what he discussed (how memory and the caches work) but he didn't explain how it applied to the question he posed. The reasonable answer (as I suspected before I watched the video) is that one piece of code accesses the data by rows and the other by columns. The cache loads the data by rows, which makes the row-wise code faster, since the column-wise code incurs a lot of cache misses.

    • @lepidoptera9337
      @lepidoptera9337 5 months ago

      Yep, that is exactly what is going on.

    • @wulf3n773
      @wulf3n773 3 months ago

      Yes, it's because in the first one, for the whole time the inner loop executes, the starting address remains the same (x*4000), then we just move contiguously with the +y, which equates to +1.
      In the second for-loop the inner loop's starting address is also the same, but the accesses are not contiguous: the index jumps by 4000 (the 4000*y term) every single iteration, which is a huge step to get to the next element.

  • @kosnk
    @kosnk 4 years ago +1

    An amazing series start, Simon! Thank you for this dive into memory allocation/access details, very helpful! Surely looking forward to a (contiguous?) continuation and your other quick projects & cool tips! Very kind of you to put in so much work and share it. Again, thanks!

  • @mahmoudhammmad8089
    @mahmoudhammmad8089 4 years ago +6

    AMAZING!!!
    Waiting to watch the rest of this series

  • @nathanp6928
    @nathanp6928 3 years ago +3

    I have a final tomorrow over this stuff and this video just happened to pop into my recommended. Probably cause I was googling virtual memory, caching, etc. Super informational video, and I was able to study and be entertained at the same time. I’m a fan 👍🏼

  • @danishfayaz8077
    @danishfayaz8077 3 years ago +1

    Absolutely loved it. Just the right way to explain a concept top to bottom. Keep it up

  • @aufkeinsten7883
    @aufkeinsten7883 3 years ago +2

    you are made for creating videos like these, I could watch this for hours, thanks so much for the content

  • @sva3338
    @sva3338 3 years ago +3

    Holy f. This is so good, YouTube finally recommended me something worth my time. Please release more episodes!!!

    • @simondev758
      @simondev758  3 years ago

      Second video in the series is the linked list one.

  • @katarinazzz
    @katarinazzz 3 years ago +4

    This is explained so clearly, I'm now going to binge watch your videos 😃 Thanks to YT for recommending me your channel. Subscribed! 🌟

  • @pist5343
    @pist5343 4 years ago +15

    Awesome! Looking forward to this series :)
    It's really not easy teaching about memory and data structures without the audience falling asleep, but you've done a great job!
    Unfortunately JS arrays are messy and not exactly contiguous... :(

    • @simondev758
      @simondev758  4 years ago +7

      Hah yeah, I actually had a small section on JS arrays in there just to clarify about them, but cut it due to time (and laziness).
      You're totally right, they're non-contiguous "array-like" objects. The TypedArray is closer to what you'd get in C++ or something.

  • @mahima7638
    @mahima7638 3 years ago +1

    Loved this video! Will enjoy watching the whole series definitely.

  • @boxicool
    @boxicool 3 years ago +2

    Man, your channel (even though I don't use JS so much) is freaking awesome!! I would buy your course or book without hesitation! Thank you!

    • @simondev758
      @simondev758  3 years ago

      No courses at the moment, I just put it all online for free :)

  • @miteshkumar5557
    @miteshkumar5557 3 years ago +24

    Arrays are also slow when you have multiple threads, each bound to its own CPU, trying to access a global array. Suppose you have an array of 4 int32_t values and 4 threads labeled 0 to 3. The i-th thread accesses the i-th index in the array and modifies it repeatedly. This causes major slowdowns, as each modification invalidates the cache line in the other 3 threads/CPUs, and each thread has to do extra work to update its cache, even though the value that thread cares about isn't being modified ("false sharing"). This essentially makes the parallel code sequential. A linked list would be much better here since the memory locations will not be adjacent to each other, or each thread can modify a local variable and only write to the array at the end.

    • @xeridea
      @xeridea 2 years ago +6

      I have code that uses multiple threads to generate large arrays using Perlin noise, and I get near-perfect scaling with threads. If you were doing something where multiple threads made tons of modifications, the L3 cache would still be valid, as that is shared between cores, so it wouldn't be that big of an issue. The bigger issue would be multiple threads constantly updating the same data, causing race conditions or bizarre behavior due to how memory reads and writes work.

  • @kik8bg
    @kik8bg 3 years ago

    A simple and interesting explanation of arrays. I came here to watch this video just to make sure I knew exactly how everything works. Having hardware classes along with programming courses really does help out a lot in understanding computers, both on the hardware and software level.

  • @EpsilonKnight7
    @EpsilonKnight7 1 year ago +1

    Posting this for anyone else out there like me.
    As someone who knows JUST enough code to be dangerous, I feel dense for not getting it sooner.
    My brain absolutely looked at the two for loops, saw them moving through the same numbers, and even though I recognized there was a difference in the last line, my thought process collapsed to "alright so you move through all the same array positions eventually," and I spent ten minutes trying to figure out, on my own, why changing the order of operations would make it faster while ignoring, y'know, the actual result of iteratively running the loops.
    The first comment I saw mentioned row vs column major ordering, and even then, I thought "that's silly, it's a 1D array!" and failed to make the connection.
    Took me seeing someone list the resulting indices in order for it to finally click. Despite so much of the video focusing on contiguousness.
    So for anyone else that did the same, we're a special kind of stupid, but at least not a unique kind.

    • @simondev758
      @simondev758  1 year ago

      It's hard, which is why optimization is often a specialty.

  • @RobertAlexanderParraCiro
    @RobertAlexanderParraCiro 3 years ago +1

    Probably the best video I have ever watched about how arrays work

  • @als5321
    @als5321 3 years ago +1

    This was amazing. THANK YOU. More like this for Javascript. Keep it coming and don't stop!

  • @pinekel8987
    @pinekel8987 2 years ago +1

    Thank you so much for creating this series, this is what ( even now ) I think more programmers ( and sometimes myself ) should pay attention to.

  • @honey661
    @honey661 3 years ago +2

    Even though I know this topic, your video was so interesting that the time passed unnoticed and I watched the whole video. Wow

  • @pranjalagnihotri6072
    @pranjalagnihotri6072 3 years ago +1

    I love how he goes into depth of things, please continue your great efforts sir 🔥

  • @Greatbubba747
    @Greatbubba747 3 years ago

    This is a very great high-level view of arrays and CPU caching! I wish I had this video as an introduction when I was taking my Computer Systems class in college!

  • @BipinOli90
    @BipinOli90 4 years ago +2

    Loving this series man. Please keep going. Would love to see more real-world optimization like you did previously. But yeah, fundamentals need to be cleared up first.
    Btw I remember there was one CPU-level hack which exploited this eager loading behavior. A private part of memory was located by measuring access speed: if an access was faster, that meant the memory location just after the current one was already cached by the CPU, even though actually using it was restricted by privilege constraints.

    • @simondev758
      @simondev758  4 years ago +1

      More real world optimization? Sure, I can probably swing that.
      Sounds a lot like Meltdown: en.wikipedia.org/wiki/Meltdown_(security_vulnerability)

    • @BipinOli90
      @BipinOli90 4 years ago

      @@simondev758 ah yeah, its Meltdown.
      Thanks man. Love your videos.

  • @mehmedkumali1135
    @mehmedkumali1135 3 years ago +1

    Very helpful video, you explained everything clearly.. Reminded me of my university teacher back in the day. Thank you for the upload!

    • @simondev758
      @simondev758  3 years ago

      Awesome, if you have suggestions on topics to cover too, let me know

  • @bakerfx4968
    @bakerfx4968 3 years ago +2

    I wasn’t ready for it to be over!

  • @Dionny
    @Dionny 4 years ago +50

    But why is the code example from the very start of the video faster than the other example? I feel like that wasn't covered or I missed the point.

    • @simondev758
      @simondev758  4 years ago +22

      Youness covered it well, but I left it open as sort of an "exercise for the viewer". Kinda wanted the video to be in depth enough that you could answer it yourself.

    • @mike.emery.
      @mike.emery. 4 years ago +43

      @@simondev758 Leaving as an exercise to the viewer is fine, but as a suggestion for the format it would be good to circle back to the question at the end of the video and call it out. The way it's structured now it seems like the question is just forgotten about.

    • @simondev758
      @simondev758  4 years ago +34

      Great feedback Mike, 100% agree! Tried it out but next time I'll make sure to close the circle.

    • @bonehelm
      @bonehelm 3 years ago +2

      One is accessing the array in row-major order, and the other is accessing it in column-major order. The top one is faster because the memory is being accessed in a contiguous way, which means you'll get a lot of cache hits (very fast, going to main memory much less). The bottom one is NOT accessing memory in a contiguous way, meaning every access is a cache MISS. en.wikipedia.org/wiki/Row-_and_column-major_order

    • @TiDuZzO
      @TiDuZzO 3 years ago

      @@bonehelm I still don't get it, both FORs start looping on x and then y, so both should have the same speed. The only difference here is that in the first one you do the multiplication before the addition, and in the second one the other way around... And it doesn't seem to be a contiguity problem here....

  • @pauljordan3064
    @pauljordan3064 3 years ago +1

    Great explanation. Clear and concise. Thanks!

  • @shaurya9055
    @shaurya9055 3 years ago +3

    If you had been my tuition teacher I'd have been a topper in my class
    Your way of explaining is so cool man👌
    I'm inspired

  • @rewrittenbytes1616
    @rewrittenbytes1616 3 years ago +6

    Arrays are fast, except when they aren’t ;)
    Great video!

  • @tomi.mocnik
    @tomi.mocnik 3 years ago +2

    Got it as a recommendation, instantly subscribed.

  • @JNunoCoast
    @JNunoCoast 3 years ago +1

    Already knew these concepts but still an amazing video, definitely helpful to cs students!

  • @painomnom
    @painomnom 3 years ago +1

    your videos deserve a way bigger audience

  • @nissieln
    @nissieln 3 years ago +1

    Superb series! thank you for the content!

  • @closery
    @closery 3 years ago +1

    Sir, your channel is a gem! thx for these videos

  • @binaryburnout3d
    @binaryburnout3d 2 years ago +1

    Here king, you dropped this. . .

  • @prodsefv
    @prodsefv 2 years ago +1

    thanks simon

  • @obsoletepowercorrupts
    @obsoletepowercorrupts 3 years ago

    There are pros and cons to this worth noting. It can be a valid form of redundancy to have an array as a 'look-up' to speed up data coming from a query in a relational database (rather than hitting the query over and over). That helps performance sometimes. However, the cache vulnerabilities can be exploited. An array does not have the same benefits as a linked list in terms of being able to specify pointers for pass by address versus pass by reference. Also, in some languages a linked list can be a problem because it needs objects, and that means garbage collection (Java) can happen at any time, making performance hard to predict and sometimes rather stuttering.

  • @amimulahsan1289
    @amimulahsan1289 3 years ago +5

    Wow, man. I mean, you literally just made it simple; not like other people who go through so much technical stuff and use buzzwords.

    • @simondev758
      @simondev758  3 years ago +3

      I mentored a junior guy recently who did that, spoke entirely in jargon and buzzwords. I always wondered if he specifically practiced doing that.

  • @inconito7281
    @inconito7281 3 years ago +1

    This video was amazing, you have my subscription

  • @yac7571
    @yac7571 2 years ago +1

    Great job, I like your style. Easy like and sub 💯

  • @djflugel79
    @djflugel79 3 years ago +3

    8:24 the animation made me smile

  • @raimanish4843
    @raimanish4843 4 years ago +2

    this was very helpful. thank you

  • @gligoradrian784
    @gligoradrian784 3 years ago +1

    Took me a good minute to find the difference between the two code snippets.

  • @shivankchopra8552
    @shivankchopra8552 3 years ago

    Thanks for sharing this amazing tutorial!

  • @agsystems8220
    @agsystems8220 1 year ago +1

    Great video. You may have forgotten to revisit and explain in detail the code at the start though...

    • @simondev758
      @simondev758  1 year ago

      Yes... yes I did heh. Answer is pinned.

  • @ahn_buguei
    @ahn_buguei 4 years ago +1

    Thanks! I couldn't understand cache misses or contiguous arrays before your video. It felt like the 3Blue1Brown of computer science

    • @ahn_buguei
      @ahn_buguei 4 years ago +2

      I just applied this on my internship and I solved a serious bottleneck!!! Thanks Simon

    • @simondev758
      @simondev758  4 years ago

      That's really cool! What kind of problem were you solving?

    • @ahn_buguei
      @ahn_buguei 4 years ago +1

      I optimized some existent numerical C++ code (for signal processing) from 420ms per iteration to 45ms, but the goal was 30ms per iteration. There is a giant nested for loop that was taking 44ms to run. So I applied the change specified in the video to reduce cache misses. Now it's taking 22ms. Maybe I could further optimize it by writing a special data structure (probably just a linked list), but that's good enough for the manager by now and he has other priorities for me.

  • @Chapali9a
    @Chapali9a 3 years ago +1

    Learned a lot. Thank you

  • @mfam98
    @mfam98 3 years ago +1

    Had a course on advanced computer architecture; this covered several weeks of it in a 10 min video. Wish I had this kind of resource back then. Also had to make a simulator with several caching sizes and strategies, good times 😂

    • @simondev758
      @simondev758  3 years ago +1

      Ooh that simulator sounds kinda fun to play with.

  • @hzrnbugsie
    @hzrnbugsie 3 years ago +5

    My only formal training in programming is an elective course in C in college. I've loved it so much I end up writing device drivers for baremetal code running on top of ARM Cortex M processors as a hobby. There is no substitute for learning the fundamentals. It pains me to see some universities starting Computer Science students with a course in Java...

    • @simondev758
      @simondev758  3 years ago +1

      My first exposure to programming was Java in my intro to computer science class heh :P

    • @hzrnbugsie
      @hzrnbugsie 3 years ago +1

      Some people are just naturally good at logic and programming. We can count ourselves in that group

  • @ezequielgarrido51
    @ezequielgarrido51 3 years ago +1

    Great series!

  • @giorgibatsiashvili4270
    @giorgibatsiashvili4270 3 years ago +1

    I found this channel and it's brilliant

  • @trieutruong2410
    @trieutruong2410 3 years ago +1

    Great video. I’m your fan for now

  • @yourfutureself4327
    @yourfutureself4327 1 year ago +1

    6:02 snipin's a good job mate

  • @NAm-Stars
    @NAm-Stars 4 years ago +1

    I love this. Thank you for sharing.

  • @stLmpp
    @stLmpp 3 years ago

    Your videos are amazing! Thank you for this content

  • @IgorSadovskii
    @IgorSadovskii 3 years ago +1

    Omg, this is very useful, thanks a lot!

  • @xicheung7867
    @xicheung7867 3 years ago

    Your content PLUS the video production = 🙌 SUBSCRIBED

  • @almazmusic
    @almazmusic 3 years ago +1

    Okay, you got me :) awesome videos!

  • @jonasls
    @jonasls 3 years ago +1

    I want more!! Can't wait

  • @AviPars
    @AviPars 2 years ago +1

    Nice video ! Thanks

  • @Otakutaru
    @Otakutaru 3 years ago +3

    Why does the code on the top run 10 times faster? I thought you were going to answer it in this video. By looking at it I can deduce that it has to do with the nested loop. The first code goes through 4000 contiguous data points, then moves onto another batch of 4000, which should be already cached. The second skips between batches, grabbing the first element from all of them, then the second, etc. I don't see why it wouldn't be prefetched if it's repeatable.

    • @clearsky4049
      @clearsky4049 1 year ago

      Yup, the bottom code block is leapfrogging by 4000 positions in every iteration of the nested loop, but the top code block accesses the array sequentially. It's basically the difference between grabbing one sandwich from the plate and eating it completely before moving onto the next VS taking a bite from a different sandwich every time until your plate is clean.

  • @FullThrottleTaco
    @FullThrottleTaco 2 years ago +1

    This video kicks ass.

  • @hamsam789
    @hamsam789 3 years ago +1

    This was really clear and easy to understand, thanks

  • @aj35lightning
    @aj35lightning 4 years ago +1

    Are we working our way towards ECS? ...I'm getting those vibes haha.
    This also reminded me of a Channel 9 talk I recently saw: "Native Code Performance and Memory: The Elephant in the CPU".
    Blew my mind, the importance of cache misses and structuring your data in contiguous chunks.

    • @simondev758
      @simondev758  4 years ago +1

      Oh yeah definitely want to cover ecs at some point, why major companies are going that route, etc. Lot of ground to cover between beginner and that level though heh.

    • @abdelrahman5094
      @abdelrahman5094 4 years ago +1

      ECS stands for what?

    • @simondev758
      @simondev758  4 years ago +2

      @@abdelrahman5094 Entity-Component-System

  • @afsarzan
    @afsarzan 3 years ago

    love your explanation and examples..

  • @_DD_15
    @_DD_15 3 years ago

    Nice video, thanks for that!

  • @IrwinRodriguez
    @IrwinRodriguez 3 years ago +1

    excellent explanation! new sub!

  • @hardwired66
    @hardwired66 3 years ago +2

    more js content sir love it❤️ thank you

  • @bwanamaina
    @bwanamaina 3 years ago +1

    Well explained 👏👌👍

  • @DerrickJolicoeur
    @DerrickJolicoeur 3 years ago

    Okay, a personal challenge: I want to try and answer the question posed at the start before I watch further...
    Essentially, the computer compiles the code such that it saves the result from (x * 4000) until x changes (only updating it 4000 times), whereas (4000 * y) must be (more-or-less) recomputed every time (leading to 4000*4000 updates).

    • @simondev758
      @simondev758  3 years ago

      I'll put a hint, but I'll put it much further down and you can expand if you want.
      The only thing that changes is the index, so the amount of work each loop does is identical. Look at the access patterns of memory instead.

  • @matthewbriggs9414
    @matthewbriggs9414 3 years ago +1

    Takeaway: memory caching magic allows faster access of indexes that are close. What constitutes 'close' depends on the hardware (cache line and page sizes); the optimization aspect is accessing indexes in order of proximity.
    Also curious as to whether 'x * 4000' is cached behind the scenes. Otherwise, pulling it into an outer loop variable should be better.

    • @arden6725
      @arden6725 1 year ago

      basically every optimizer will pull constants out of loops
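For illustration, the hoisting the reply describes looks like this when done by hand (names and sizes assumed; optimizing JITs usually perform this loop-invariant code motion automatically):

```javascript
// Manual loop-invariant hoisting: compute x * DIM once per outer
// iteration instead of recomputing it on every inner iteration.
const DIM = 100;
const arr = new Array(DIM * DIM).fill(2);

function sumHoisted() {
  let total = 0;
  for (let x = 0; x < DIM; x++) {
    const rowBase = x * DIM; // invariant for the whole inner loop
    for (let y = 0; y < DIM; y++) {
      total += arr[rowBase + y];
    }
  }
  return total;
}
```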

  • @abdelrahman5094
    @abdelrahman5094 4 years ago +1

    Great explanation, thanks for your effort.
    I hope to see future videos on how JavaScript garbage collection works behind the scenes.

    • @simondev758
      @simondev758  4 years ago +1

      Ooh that'd be an interesting subject.

    • @BipinOli90
      @BipinOli90 4 years ago +1

      I think it uses the reachability idea: if there is no way a part of memory can be reached, it will be garbage collected.
      They didn't use the reference counting approach because it doesn't handle two pieces that reference each other but are completely isolated and not reachable.
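The reachability-versus-reference-counting point above can be sketched like this (names hypothetical):

```javascript
// Two objects referencing each other form a cycle. Naive reference
// counting would never free them (each count stays >= 1), but a
// reachability-based collector frees both once no GC root (here,
// the local variable 'head') can reach them.
function makeCycle() {
  const a = {};
  const b = {};
  a.other = b; // a -> b
  b.other = a; // b -> a
  return a;
}

let head = makeCycle();
// While 'head' exists, both objects are reachable from a root.
const reachable = head.other.other === head;

head = null;
// Now neither object is reachable from any root; a tracing GC may
// reclaim both despite their mutual references.
```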

  • @geekazodium
    @geekazodium 1 year ago +1

    If you aren't constrained by the requirement that the array must keep its ordering, you could also just take the value and swap in the last item of the array when deleting, right?
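Right: this trick is commonly called swap-remove. A minimal sketch (function name hypothetical):

```javascript
// O(1) unordered removal: overwrite the removed slot with the last
// element, then shrink the array by one. Element order is not preserved,
// but no shifting of the remaining elements is needed.
function swapRemove(arr, index) {
  arr[index] = arr[arr.length - 1];
  arr.pop();
  return arr;
}

const items = [10, 20, 30, 40];
swapRemove(items, 1); // removes 20 by moving 40 into its slot
```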

  • @piyushjaware
    @piyushjaware 2 years ago

    Hi Man. Very nice explanation. I was curious how you made the animations in your video.

  • @staplepin8099
    @staplepin8099 4 years ago +2

    Amazing channel

  • @dibaterman
    @dibaterman 2 years ago +1

    Hur hur hur, I actually understand this stuff now. =P.
    I've never used a hash before, though. In my use case (targeting enemies near the player), I followed Unity's design for this; they have a new array each frame that grabs all the colliders on a specific layer around the player.
    What I did was change that to a list and cached the list, also giving it a size of 50 since I didn't see the game having more than 20 enemies in the scene at once within detectable range.
    Lists are slower than arrays, about 10x to 100x slower depending on what I'm doing (or so I read). I was hoping that initializing the list in memory with 50 slots would make it faster... and it did: changing the enemies and the player over to this system gave a marked increase in performance. =D Happy ending.
    Why use List over the built-in Unity array? Lists are easier to manipulate, and later on I want to order things by the magnitude of their vector3 offset from the player. Plucking things from one list and adding them to another allows for a bit more checking that things are alright as I'm developing.
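A JavaScript sketch of the preallocation pattern described above (the original is C#/Unity; the capacity and names here are assumed):

```javascript
// Reusable fixed-capacity buffer: allocate once, then refill it each
// frame by tracking a count, instead of creating a new array per frame.
const CAPACITY = 50; // assumed maximum enemies in detectable range
const nearby = new Array(CAPACITY).fill(null);
let nearbyCount = 0;

function collect(candidates, isInRange) {
  nearbyCount = 0; // "clear" the buffer without allocating
  for (const c of candidates) {
    if (nearbyCount >= CAPACITY) break;
    if (isInRange(c)) {
      nearby[nearbyCount++] = c;
    }
  }
}

collect([1, 5, 12, 7], (v) => v < 10);
// nearby[0..nearbyCount-1] now holds the in-range values 1, 5, 7
```

Only the first `nearbyCount` slots are meaningful after each call, which is the price of avoiding per-frame allocation and garbage-collection pressure.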

  • @oskardeeream1846
    @oskardeeream1846 1 year ago

    You only have to shift everything over in the array on an erase if you expect it all to stay in the same order, though. In a lot of situations you don't care, and you can just fill in the hole with the last element and then decrease the size by 1 :D

  • @Nostrum84
    @Nostrum84 1 year ago

    Never even saw the difference in those highlighted for loops in the beginning, until I saw the answer :(

  • @ryanpark3523
    @ryanpark3523 3 years ago

    Simon is Prometheus. Like the ancient Greek Titan who brought fire to mankind, he delivered amazing knowledge to us. Thanks again, Simon, for this wonderful video/lecture.

  • @ThankYouESM
    @ThankYouESM 3 years ago

    Considering MicroSD cards can now store 2TB... I very much suspect there will soon be an ultra-array in which the cache has almost every popular combination of 255 bytes each intermixed of read-only-memory in crystal and through lasers, whereas the memory pointers will have its own buffer... sort of like how we download a UA-cam app for our smart devices to load a whole lot faster than without in requiring significantly less bandwidth plus the reloading.

  • @elgs1980
    @elgs1980 3 years ago +1

    But did you forget to answer the question at the beginning of this video? The real question is whether a JavaScript array remembers its access pointer: does it behave like C, where an array is basically a pointer?
    I copied your code and tested it; the first loop is indeed significantly faster, 4x faster on my machine. The only difference is that in the first loop the array is accessed sequentially, while in the second loop the access is kind of jumpy. So may I conclude that the JavaScript array implementation stores a pointer position, so that moving the pointer to a closer position/index is faster than moving it to a more distant one?

    • @simondev758
      @simondev758  3 years ago

      Yeah, I didn't answer it explicitly; I figured the video did. Next time I'll include an explicit answer.

  • @AndrewJeffersons
    @AndrewJeffersons 3 years ago +1

    Love the videos, but watching through my speakers makes my table tremble. I'd love it if you would run the sound through Audacity or some other sound-editing software to reduce the lower frequencies, because my subwoofer drowns out your voice. It would make the sound a lot crisper, too!

    • @simondev758
      @simondev758  3 years ago

      Interesting, I can try it out. I already process it through Audacity to clean it up/edit.

  • @vishwaratna
    @vishwaratna 3 years ago

    I've liked this channel so far. Please keep making videos! 🤩

  • @apoorvaranjan9921
    @apoorvaranjan9921 3 years ago +1

    This blew my mind

  • @volkerbohm403
    @volkerbohm403 3 years ago +1

    The video is good and correct, but I expected to hear something about arrays in JavaScript, which are indeed objects. I was looking for details on how they behave regarding speed. I also missed information about how the built-in bounds checks in many languages affect array access speed.

    • @simondev758
      @simondev758  3 years ago

      I actually had a section about JS arrays in the original script, recorded it and everything, but decided that this series wasn't JS-specific and cut it.
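On the bounds-check question raised above, a quick sketch of how JavaScript handles out-of-bounds array access (this is standard language behavior; how much the implied bounds checks cost depends on the engine):

```javascript
// JavaScript never throws on an out-of-bounds array read: both plain
// arrays and typed arrays return undefined. Writes differ: a plain
// array grows to include the new index, while a typed array silently
// ignores the write. Every access therefore implies a bounds check.
const plain = [1, 2, 3];
const typed = new Int32Array([1, 2, 3]);

const plainRead = plain[10]; // undefined
const typedRead = typed[10]; // undefined

plain[5] = 9; // grows the array: length becomes 6 (with holes at 3 and 4)
typed[5] = 9; // silently ignored: length stays 3
```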