20:34 Fuck I need that card in my wallet
Me too, I hope he sells it as merch one day
JS is the new PHP
🤣🤣🤣🤣🤣 that card was great lol
This is the single best video on the topic ever! When I was studying CS, our prof didn't even try to explain how data is stored; he just moved on to using pointers. I had no previous experience with them and was like wtf are pointers. You put it all flawlessly into words AND animations, and a picture is worth a thousand words. Great video that brings so much clarity, every CS undergrad needs to see this. Thanks a lot!
There's something about real-life coaching that doesn't come anywhere near what well-organized/animated videos can do. All students should know that videos are 100x better at converting knowledge into intuition, and they should treat in-class lectures/tutorials as supplementary material for their learning.
@@mosantw2014 absolutely, well said
@@mosantw2014 I don’t entirely disagree with you, but let me offer a counterpoint: A good instructor will have a read of the room, and will know when to move faster, or slow down and dig in to a concept.
I’m trying to get back into programming after several years, the hardest part (for me) with videos is staying focused on the stuff that I learned in the 80s/90s and still remember, while waiting to get into newer stuff.
Polymorphism is completely new to me. Pointers and structures like linked lists and binary trees and such are not. It’s a weird place to be.
I remember really struggling with these sorts of topics when I was at university. These are some of the best explanations for OS/low-level programming concepts I've ever come across!
best animation quality yet, the pointer hell is somehow very understandable
pointers are easy
Javascript bashing ✅
Engaging and interesting systems programming content ✅
Funny retorts for armchair programmers ✅
Im so glad i found this channel early and subbed
absolutely one of the best channels out there right now. u go even more indepth than some of my college classes and make it seem easy. big ups bro
What a spectacular video, I'm just creating my own programming language and this fits me like a glove.
I already know pretty much everything this channel has to offer, but it's truly insane that such well-made videos on these topics exist. Wish I had these back when I was learning.
Absolute banger of a video. Rewatched like 20 times while watching, and I liked and favorited for more rewatching later.
There's so much quality info here and every pause is worth it. Great stuff.
I've been working with Java for almost 20 years, and I don't think I've ever thought about what happens when you remove an element from an ArrayList.
Thanks for the eye opener.
Me too, but with Go. Now I understand the motivation for slices vs arrays
Incredible work with these videos so far. Hitting all the key points at just the right level of detail. The animation work is just... * chef's kiss * Keep it up 🙌
Thanks!
Yes, I'd watch a livestream of yours solving CodeCrafters challenges.
Jon did the same a week ago with Git, and I watched through the entire thing. That was really interesting, and I'd like to solve these myself too 😊
Thanks
Just found your channel! Really happy to see you just uploaded. I love your intuitive visuals to explain all sorts of mechanics
I get why you have so many subscribers, and yet I think you deserve way more! Explained VERY WELL so everyone can understand it!
It should also be said that if you create an "array" in JavaScript (type [], not {}) and insert one element at, let's say, index 1000000, any forEach call on it (since [] and {} have different methods to call) traverses the whole number of elements it's supposed to have. I had a hard time finding one such misdeclaration of mine not long ago.
so in my case:
let testArr = [];
testArr[1000000] = 'hello world';
Object.values(testArr).forEach(value => console.log(value));
was handled different than
let testObj = {};
testObj[1000000] = 'hello world';
Object.values(testObj).forEach(value => console.log(value));
I like the way you ease people into Rust syntax by showing it in simple examples or with translations into other common languages.
baby wake up core dumped just uploaded
🤣lmao seriously tho
I prefer 'baby wake up core dumped'
And get the kids
Let's get to the core of it you're dumped 😂😂😂😊
Waited for this video after the previous teaser. Ur videos are the most accurate on the subject there are
i love that little departure to interpreted language land
“This explains why we use zero instead of one for the first element”
What a hero 🙌. Finally a non-stupid “programmers just count from zero” explanation
Love the quality of the videos. I will recommend them to other people in my class because they're concise and easy to understand. Keep it up!
I found this person and his channel to have the best explanations of complex concepts. Good job man!
Your channel is gold pls never stop making content
I have yet to see the combination of a linked list and array list in the wild that I was taught in my AlgoDat course and never again afterwards. It stored the data in a big array that can be relocated to grow, but also a separate mapping from indexes to array offsets. That sounds like a linked list (just with array indexes instead of full pointers) that enforces some form of memory coherence for both list nodes and data. As far as I know, you can refine this concept to a linked list of array slices, which is how text editors support efficient cutting and pasting of text.
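For anyone curious, here's a rough C sketch of the "big data array plus a separate index-to-offset mapping" idea described above. It's only a guess at the general shape (the IndexedList name and the fixed sizes are made up), not the exact structure from that course:

#include <stdio.h>
#include <stddef.h>

/* Elements live wherever they were appended in `data`; `order` maps the
   logical index (0, 1, 2, ...) to an offset into `data`. Reordering or
   inserting only shuffles the small `order` array, not the data itself. */
typedef struct {
    int    data[16];
    size_t order[16];
    size_t len;
} IndexedList;

static void push(IndexedList *l, int value) {
    l->data[l->len] = value;        /* data is always appended at the end */
    l->order[l->len] = l->len;
    l->len++;
}

static int get(const IndexedList *l, size_t logical_index) {
    return l->data[l->order[logical_index]];
}

int main(void) {
    IndexedList l = { .len = 0 };
    push(&l, 10); push(&l, 20); push(&l, 30);

    /* "Move" 20 before 10 logically by swapping offsets, without moving data. */
    size_t tmp = l.order[0];
    l.order[0] = l.order[1];
    l.order[1] = tmp;

    for (size_t i = 0; i < l.len; i++) printf("%d ", get(&l, i));  /* 20 10 30 */
    printf("\n");
    return 0;
}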
Before this video I genuinely thought ArrayLists were LinkedLists that stored an array of pointer offsets, which you suggested, thereby eliminating the access-time issues at the cost of memory. It was even going to be my go-to implementation after learning of the costly operations that a simple std::vector::push_back can have, all just for constant-time access. But as he pointed out in this video, and as I have seen in so many others, modern CPUs were built to favour arrays. I would still choose a proper implementation of a linked list if I knew beforehand that arbitrary insertion and deletion are going to be common.
I recommend everyone starting to learn data structures subscribe to this channel and save this video. Well done, very nicely demonstrated!
I did try to use the void* pointer once! It was hilarious when you mentioned it
George, your videos are really awesome! I already knew all these concepts but I have never seen them better explained. Anyway, I love C and Assembler because they teach how computers work...😊
Thank you so much for this video, excellent explanation! I have a question, though: as you showed, in languages like Rust, besides specifying the array's size, it's also necessary to specify the data type (integer, float, etc.), and from what I understood, that's because this way the compiler already knows how many bytes to read for each element. However, at 19:45, in the case of Python, how does the interpreter know whether, once a pointer is dereferenced, the retrieved object is an integer, a string, or another element of indefinite length? Because according to your (beautiful) animation it seems like every object has its own specific size.
Interpreters attach 'tags' to values in memory, so when the value is needed, it first reads the tag to identify the type of the value and know how many bytes to read.
The answer is explained in my video: The size of your variables matters.
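For anyone wondering what such a tag can look like, here's a minimal C sketch of the idea. The Tag/Value names are made up for illustration; real interpreters like CPython use their own layouts:

#include <stdio.h>

/* Hypothetical tagged value, just to illustrate the idea above. */
typedef enum { TAG_INT, TAG_FLOAT, TAG_STRING } Tag;

typedef struct {
    Tag tag;              /* read first: tells us how to interpret the payload */
    union {
        long   as_int;
        double as_float;
        const char *as_string;
    } payload;
} Value;

static void print_value(Value v) {
    switch (v.tag) {      /* dispatch on the tag, then read the right kind of data */
        case TAG_INT:    printf("int: %ld\n", v.payload.as_int);      break;
        case TAG_FLOAT:  printf("float: %f\n", v.payload.as_float);   break;
        case TAG_STRING: printf("string: %s\n", v.payload.as_string); break;
    }
}

int main(void) {
    Value a = { TAG_INT,    { .as_int = 65 } };
    Value b = { TAG_STRING, { .as_string = "A" } };
    print_value(a);
    print_value(b);
    return 0;
}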
Your videos are amazing! I am watching all of them!
Hi, the video has been pretty interesting so far. Just a suggestion: please put the links to the previous videos you recommended. Otherwise, in a year or so, it will be much harder to find them. Unfortunately, YouTube only showed where the current video sits in the channel's timeline.
I absolutely adore JavaScript, but concurrently adore these videos. The quality is capital. I aspire to produce quality material like this.
12:03 did your cousin also write a getter for "self.lenght" (of self.items[self.lenght]) to return the same value as "self.length"?
HaHa vEriFuny
this content is pure gold!
I wasn't able to leave a comment on your post from yesterday but I guessed arrays and I was right! I love these deep dives
10:36
Not sure if anybody else has commented on this, but I think the illustration is off. You have the starting address of the array as 3141, but the illustration shows 3146. It's consistent afterward with what I'd expect. I'm not an expert, and mostly making sure I am following the example well. I'm actually learning a lot from this video, and appreciate all of the content.
You are back🎉
Yes , really good
Heeeesss baackkk
The quality of this channel is amazing, I wish you all the success and I'm excited to see many more interesting and educative videos like yours, you have a good way of getting your point across... I'm a
One of the best videos I ever watched in my life
I love you Core Dumped, cool name, the meat of the problem from right away. Keep up the good work.
12:15 Not always; it can end up on the stack too.
Golang stores slices both on the heap and on the stack (depending on the size); if the size is smaller than 64 KB, it stays on the stack.
Hello! I need the information you put out so much! The presentation, knowledge and everything is already a 10/10, so chill! But now let me educate you! Don't give a #### about these bad opinions; people can and will hate everything, and as you get better it won't get better. What you do is so good, so please continue doing it!❤
The behaviour of JavaScript is only theoretical, to match the specification; in practice most uses of arrays are much more optimized. Each array has a type, generated code is optimized for it, and it only falls back to the generic type if you assign an out-of-bounds index (and all code optimized for that type is then thrown away). Even in that unoptimized state, it makes heavy use of tagged pointers, where special values mean they're storing a small int instead of a pointer; in some cases floating-point numbers can be stored, and they make use of NaN (which can take many values) to store info like the value not actually being a number and holding a pointer instead, etc. For these reasons and many more, JS is way faster than Python.
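To make the tagged-pointer trick above concrete, here's a rough C sketch in the spirit of small-integer tagging (like V8's Smis). The names and constants are illustrative only, not the engine's actual code:

#include <stdint.h>
#include <stdio.h>

/* Illustrative pointer tagging: heap pointers are at least 2-byte aligned,
   so the lowest bit is free. Small ints are stored shifted left with bit 0 = 1,
   real pointers keep bit 0 = 0. */
typedef uintptr_t JSValue;

static JSValue make_small_int(intptr_t n) { return ((uintptr_t)n << 1) | 1; }
static JSValue make_pointer(void *p)      { return (uintptr_t)p; }

static int is_small_int(JSValue v)        { return (v & 1) != 0; }
static intptr_t as_small_int(JSValue v)   { return (intptr_t)v >> 1; }
static void *as_pointer(JSValue v)        { return (void *)v; }

int main(void) {
    double boxed = 3.14;                 /* pretend this lives on the heap */
    JSValue a = make_small_int(42);
    JSValue b = make_pointer(&boxed);

    if (is_small_int(a))  printf("small int: %ld\n", (long)as_small_int(a));
    if (!is_small_int(b)) printf("heap value: %f\n", *(double *)as_pointer(b));
    return 0;
}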
This inspires me to go code some stuff in Rust, great animations, great explanations 👍🏻
About 17:45: I'm no great expert on systems programming, but the data-locality penalty is unlikely to be severe. The cost of a pointer-based array instead of a template array lies in the unpredictable position of object allocation, which confuses the CPU cache prefetcher. In reality, most workloads allocate objects (such as the objects in the containing array) close together or in a predictable fashion, so prefetching works adequately well. And of course, the pointers themselves are still grouped together as always.
For example, if we add items to a list in a loop, it is trivial for the CPU prefetcher to guess the next appropriate location. In HotSpot specifically, each thread has its own thread-local heap area, so as long as the array/list is not shared across threads (which is unlikely), the pattern will be maintained. Moreover, with the nature of GC, the compacting phase will very likely move objects spread all over the heap to a single location, both avoiding fragmentation and maintaining the fetch pattern.
There are exceptions: if a BaseType array can contain both DerivativeType1 and DerivativeType2 with completely different object layouts (only possible with a reference-based array), then it's difficult for the CPU to make good sense of the fetch pattern, which will likely suffer from poor data locality. But a template array would also suffer from this, so it's rather an unfortunate universal technical difficulty.
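To picture the locality trade-off being discussed, here's a small C sketch contrasting a contiguous array of values with an array of pointers to separately allocated objects. It only illustrates the two access patterns; it's not a claim about what HotSpot actually does:

#include <stdlib.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    /* Contiguous array of values: one linear sweep through memory. */
    long *values = malloc(N * sizeof *values);

    /* Array of pointers: each element lives wherever the allocator put it,
       so every access may jump to a different part of the heap. */
    long **boxed = malloc(N * sizeof *boxed);

    for (long i = 0; i < N; i++) {
        values[i] = i;
        boxed[i] = malloc(sizeof **boxed);   /* separate allocation per element */
        *boxed[i] = i;
    }

    long sum1 = 0, sum2 = 0;
    for (long i = 0; i < N; i++) sum1 += values[i];    /* cache/prefetch friendly */
    for (long i = 0; i < N; i++) sum2 += *boxed[i];    /* extra dereference per element */

    printf("%ld %ld\n", sum1, sum2);

    for (long i = 0; i < N; i++) free(boxed[i]);
    free(boxed);
    free(values);
    return 0;
}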
This channel is about to blow up🎉
The content is great.
Would be interesting to see your overview of how Rust's compiler works and of compiler theory in general, as well as interpreters.
I learn so much from your videos!! Thanks a lot !!! I'm waiting for the next one!
I recommended the first 3 videos in this series to some computer science students I was tutoring because I felt like they went in depth into these concepts while using terms and concepts that beginner programmers are familiar with. I felt like this video used a lot more terms and concepts which might be difficult for beginner programmers to understand compared to the last three. I think this series would be better for introductory students if the smaller concepts mentioned in this video, like data structures, time complexity, etc., had their own videos before having a video about dynamically sized collections.
In other words, I felt like the pacing in this series took a sharp turn that might be too overwhelming for me to be able to recommend it to other computer science students. Judging by the pacing of the first three videos in this series, it seemed like these videos were catering toward beginner-to-intermediate programmers with around a year of experience, but this video didn't come across that way, although I may be wrong in my assumption about the target audience of these videos.
@@sa-hq8jk I think there is enough context to understand what a datatype is without giving the textbook definition of one (which I doubt would be helpful to anyone anyway). A definition of time complexity would probably have been nice; it is easy to understand and apply in these cases and can also easily be googled if needed.
@@someonespotatohmm9513 I didn't mean what exactly a data type is, but more how a struct is a type which combines other types, and how they are grouped together in memory and interpreted by the compiler.
Very good video, this is the kind of teaching that works for me so thank you
yes. it would be awesome to see streams with codecrafters problem solution
Nitpick: JavaScript engines typically do implement arrays as contiguous blocks of data, and generally setting just one item at index 10k will then allocate up to that number (or more). They just have to pessimise the array for the holes in it.
I remember writing a filter and it was returning null items, you have to be very careful with JS
They use C++ struct arrays, not normal arrays, class arrays or vectors.
I enjoy watching all your videos!
14:41 That is assuming we need to preserve the order of the elements, which is not always the case. If order is not important to us, we can fill the gap with the last element of the array, making it an O(1) operation.
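A tiny C sketch of that "fill the gap with the last element" trick (often called swap-remove); the swap_remove name and raw array here are just for illustration:

#include <stdio.h>
#include <stddef.h>

/* Remove the element at `index` without preserving order: overwrite it with
   the last element and shrink the length. O(1) instead of shifting everything. */
static void swap_remove(int *items, size_t *len, size_t index) {
    items[index] = items[*len - 1];
    (*len)--;
}

int main(void) {
    int items[] = { 10, 20, 30, 40, 50 };
    size_t len = 5;

    swap_remove(items, &len, 1);          /* removes 20 */

    for (size_t i = 0; i < len; i++) printf("%d ", items[i]);   /* 10 50 30 40 */
    printf("\n");
    return 0;
}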
Amazing video. And thank you for not peddling Surfshark or some unrelated crap. Video bookmarks would be welcome!
At 19:50, where is the type of the object stored? Doesn’t python need to know that in order to know how to read the data?
you've mentioned about thinking to solve codecrafters challenges on stream.
Yes please!
The Lua Table has entered the arena.
It would be interesting to see what Lua’s cache hit & miss rate is compared with other languages…
Omg I loved this video. Super cool to know how python’s list works under the hood. Can’t wait for what you’ve got next!
Great work! Thanks for the video!
Wow, so informative, thanks so much. I’d watch a live coding session.
What a fantastic video! Now all I want is to program in Assembly to learn how a computer really works, and to optimize away all those inefficiencies these languages introduce!
Great presentation 👌
Im in love with your channel 🙏🙏
That's why I propose all scripting languages should be pseudo-compiled: the bytecodes are as specific as assembly instructions (not quite as much, but you get it), and the generic stuff actually happens at "compile" time. Every scripting language should do that, even at the cost of longer "compile" times. I want to make one, but I struggle every time with writing the parser, so you will probably never see that.
Also, in Java, if it's not a primitive, it's an object; every array of non-primitives in Java is an array of objects, and you can verify it with the JNI.
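A toy C sketch of the type-specific bytecode idea from the comment above: opcodes that already know they operate on ints, so the dispatch loop does no type checks at run time. Entirely made up, not modeled on any real scripting language:

#include <stdio.h>
#include <stddef.h>

typedef enum { OP_PUSH_INT, OP_ADD_INT, OP_PRINT_INT, OP_HALT } OpCode;

typedef struct { OpCode op; int arg; } Instr;

static void run(const Instr *code) {
    int stack[64];
    int sp = 0;
    for (size_t pc = 0; ; pc++) {
        switch (code[pc].op) {
            case OP_PUSH_INT:  stack[sp++] = code[pc].arg;        break;
            case OP_ADD_INT:   sp--; stack[sp - 1] += stack[sp];  break;  /* no tag check: we already know it's an int */
            case OP_PRINT_INT: printf("%d\n", stack[sp - 1]);     break;
            case OP_HALT:      return;
        }
    }
}

int main(void) {
    const Instr program[] = {
        { OP_PUSH_INT, 2 },
        { OP_PUSH_INT, 3 },
        { OP_ADD_INT, 0 },
        { OP_PRINT_INT, 0 },
        { OP_HALT, 0 },
    };
    run(program);   /* prints 5 */
    return 0;
}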
Amazing as always
Would love to watch those streams
Please do Hashmaps next and how are its elements linked and how does it look like in memory
Linked lists are for tape storage. Similar structures are used for block or heap storage.
Great videos, thank you for your efforts!
thank you so much man, it gave me a lot of clarity
Glad it helped!
Found great channel. Keep going!
What a gem of a channel. Keep it up!
your content is 👑. my kids will study from this channel one day 🥹 and their kids 😇 and their kids kids for generations learning low level concepts and rust. 🥂
@14:59, is it not possible in this case to move the first element to the right and then update the base memory address to its moved location?
it's possible, but then the pointer is not pointing to the base of the allocated memory, and may not free all of it (if it frees anything at all)
@@michalnemecek3575 yeah but when you move the first element to the right, couldn’t you update the pointer to the base memory address of the array to now point to this newly moved location instead of the original one?
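A small C sketch of the point made in this thread: the pointer originally returned by malloc is what must eventually be passed to free, but you can keep a separate head offset that you bump on a front removal, which is roughly what deque-style structures do. The names here are illustrative:

#include <stdlib.h>
#include <stdio.h>

typedef struct {
    int   *base;   /* what malloc returned; must be kept for free() */
    size_t head;   /* index of the current first element */
    size_t len;    /* number of live elements */
} IntDeque;

static int pop_front(IntDeque *d) {
    int value = d->base[d->head];
    d->head++;                 /* "move" the start without touching base */
    d->len--;
    return value;
}

int main(void) {
    IntDeque d;
    d.base = malloc(4 * sizeof *d.base);
    d.base[0] = 1; d.base[1] = 2; d.base[2] = 3; d.base[3] = 4;
    d.head = 0;
    d.len = 4;

    printf("%d\n", pop_front(&d));   /* 1 */
    printf("%d\n", pop_front(&d));   /* 2 */

    free(d.base);   /* NOT base + head; free needs the original pointer */
    return 0;
}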
no need to be so self-conscious at the end there. this channel is great
I think when he says ‘and so Forth’ he’s actually telling us what programming language to use.
😂
We are 2 orange S
You are very good please continue like that and I will be happy if you touch on the assembly perspective of the things too 😄
More reasons to hate JS :D
(And yes to the streams)
Also if you intend to expand your community on other platforms a discord server might be a good idea too.
Excellent videos. Love your channel!
Thanks. I had always assumed ArrayList was just some sort of alias for a Deque, but now I know it's just a dynamic array type. Java is one of those languages I've avoided fully learning, along with any language that reuses that name for a container type. As it is now, I probably have far too much knowledge of Java.
One thing that is super cool to me is that even mobile phones are powerful enough to track bullets fired from a gun. Take PUBG Mobile for example: every single bullet is its own object. Not only that, but at every frame (or server tick, depending on the implementation) it has to check whether the bullet collided with anything. Just imagine the computational cost of tracking every bullet in a fight between multiple squads, where every guy is spraying bullets. That's hundreds of bullets in a second or two. Every bullet needs to have its location refreshed at every refresh cycle, and so does every player's location. Then the collision logic needs to be run, and what's more, PUBG Mobile has support for bullets going through players too. How awesome is that?
That shows you how necessary game engines are. Implementing all this from scratch would be a software architect's nightmare. With UE5 things are getting even more interesting with destructible maps. Furthermore, some games are moving towards server-side computation as the gold standard so as to make it harder to cheat. In such scenarios, deciding which data structure(s) to choose to represent the objects would be very important. To further emphasize the effect of cache on performance, it is good to know that some modern server CPUs have gigabytes of cache.
Could someone kindly explain to me, in depth and at a low level, 3:20? If ultimately all objects stored in memory (strings, ints, etc.) are just a sequence of bits in the end, how does the CPU differentiate (interpret) the binary sequence for the integer 65 and the binary sequence for the character "A"? Is there some "tag" associated with every variable that routes it to the correct processing unit within the CPU?
inherently, all data that we use is indeed just a sequence of bits and bytes. the reason we have types in compiled systems languages is so that the compiler can use them to determine type information about something: the compiler can deduce the stack size, utilize packing for structs, ownership, etc., and reason about your code/instructions, from both the compiler's and the programmer's perspective. you can think of types as a way to express something about the value associated with a name/symbol, e.g. a "john" variable can contain a "Person" struct. you can also think of types as a property arising from restrictions/expectations on a blob of data, e.g. think about a 7-bit character type that's actually allocated in a single byte.
ultimately, there's nothing inherently low-level preventing us from eliminating all types and treating everything as a generic sequence of bytes. but that would be counterintuitive for the compiler and the programmer.
edit: i implied this, but to clarify, the processor doesn't know the type of a piece of data (well, not exactly, but this is a good approximation for programmers). even instructions and pointers are data from the perspective of the processor.
also, in interpreted languages, yes, there are indeed tags attached to objects to keep track of the data they hold. i'm not aware of any interpreted language that doesn't use tags
Good point. There is no difference at all between 65 and 'A' in memory. But there is a certain difference in the CPU instruction set that allows this. Precisely, the CPU has instructions add, sub (for subtraction), mul (multiplication), div (division), but also some other special instructions called in and out. Let us say you have the value 65 at memory address 10. If the CPU encounters the line 'add 10 7' while executing, it will take the content of address 10 and that of address 7 and simply add them. Hence, they are treated as numbers. But if the CPU instead encounters the line 'out 10 89', it will send the content of address 10 to address 89. Let's say for the purpose of this example that 89 is the address of the graphics card. The graphics card is already hard-wired to interpret every number as a character. So, when receiving that 65, it will directly display the pixels forming the 'A' on the screen. Hope this simplification gives you a rough picture and addresses the exact point of your question.
@@boubakersoltani4566 This was actually such an amazing answer that I wished it was even longer! It helped clarify things a lot and made interpreting binary data in memory addresses much more tangible. I thank you a lot!
@@Darkev77 Wish you all the success bro. Glad it helped :)
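Riffing on the answers above, here's a minimal C example showing the same byte pattern printed as a number and as a character; which one you "see" depends only on how the instruction/format interprets it:

#include <stdio.h>

int main(void) {
    unsigned char byte = 65;   /* the same bit pattern: 01000001 */

    printf("as a number:    %d\n", byte);   /* 65  (arithmetic interpretation) */
    printf("as a character: %c\n", byte);   /* 'A' (text interpretation) */
    return 0;
}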
Yet another banger from project CD!
God please never stop making vids my guy AGHHHHHHHHHHH
Really great video, although I would have liked it if you talked about bounds checking in a normal array when you were talking about indexing out of bounds
Very well explained, these kinds of animations are extremely useful.
Your cousin may know more than me, but he still misspelled "length" in that code :P 11:50
I fucking love these videos. No long intro, no Indian accent, not slow and no bullshit. Please keep making this type of content
Really Amazing
Thank You so much...
What do you use to make the Animations/transitions? Amazing production and content
I created a linked list in C with two levels of indirection, with varying orders of magnitude up to a billion elements. However, I never got valgrind to report cache misses above 0.7% when pushing all, then accessing all, then popping all. I understand that valgrind reports a simulation of the cache rather than the actual cache, but it was the best I could do to measure, because my kernel does not have perf.
It should be pointed out that the cache behavior of linked lists is NOT inherent to the linked-list structure, but rather to the allocator used to allocate the nodes. If we have an allocator that allocates linearly, the nodes will be laid out in memory in exactly the same way as with the array. An alternative approach is to store enough elements in each node so that a full cache line is always used. Removal and addition in the middle of a node can be solved with splitting and merging.
Also, I am certain that pretty much all JavaScript interpreters really do use arrays whenever possible and only resort to a hash map as a fallback when the wasted size is too much or the keys are some type other than numbers. This is not too difficult to implement internally, and the performance boost is significant.
This is very important to note. I also think I've read that V8 uses property access for very small arrays that are unlikely to be modified. That way it can do direct property access without a hashmap lookup or array indexing.
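A minimal C sketch of the allocator point above: if list nodes come out of one contiguous arena instead of individual malloc calls, neighbouring nodes end up next to each other in memory. The arena here is deliberately simplistic (fixed capacity, no per-node freeing):

#include <stdio.h>
#include <stddef.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* A simple bump/arena allocator: nodes are handed out back-to-back from one
   contiguous block, so walking the list touches memory almost as linearly
   as walking an array would. */
#define CAPACITY 1024
static Node arena[CAPACITY];
static size_t used = 0;

static Node *arena_alloc(void) {
    return (used < CAPACITY) ? &arena[used++] : NULL;
}

int main(void) {
    Node *head = NULL, *tail = NULL;

    for (int i = 0; i < 8; i++) {
        Node *n = arena_alloc();
        n->value = i;
        n->next = NULL;
        if (tail) tail->next = n; else head = n;
        tail = n;
    }

    for (Node *n = head; n; n = n->next)
        printf("value %d at %p\n", n->value, (void *)n);  /* addresses increase by sizeof(Node) */
    return 0;
}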
You are doing revolutionary work bro
Keep going ,keep posting more often
May I give two thumbs up ?
Thankyou so much for these videos plz keep making them they are so good
Man this animations
Where were they for all these years?
I'm really interested in re-implementing those tools, please create that content
Thanks for the knowledge!
In Lua arrays are done the same way as in JS: they are in fact maps with values being indexed by numeric indices
17:40 how accurate is this part? I would assume the designers of Java/JVM would have definitely looked for some optimization here, seeing how often primitive (boxed) ArrayLists are used in practice. Is there really no caching of the actual values behind the references, happening behind the scenes?
Thanks...u explained well